Similar literature
 20 similar articles found (search time: 15 ms)
1.
A novel tool for computer-aided design of single-site mutations in proteins and peptides is presented. It proceeds by performing in silico all possible point mutations in a given protein or protein region and estimating the stability changes with linear combinations of database-derived potentials, whose coefficients depend on the solvent accessibility of the mutated residues. Upon completion, it yields a list of the most stabilizing, destabilizing or neutral mutations. This tool is applied to mouse, hamster and human prion proteins to identify the point mutations that are the most likely to stabilize their cellular form. The selected mutations are essentially located in the second helix, which presents an intrinsic preference to form beta-structures, with the best mutations being T183-->F, T192-->A and Q186-->A. The T183 mutation is predicted to be by far the most stabilizing one, but should be considered with care as it blocks the glycosylation of N181 and this blockade is known to favor the cellular to scrapie conversion. Furthermore, following the hypothesis that the first helix might induce the formation of hydrophilic beta-aggregates, several mutations that are neutral with respect to the structure's stability but improve the helix hydrophobicity are selected, among which is E146-->L. These mutations are intended as good candidates to undergo experimental tests.

2.
We have used random oligonucleotide mutagenesis (or saturation mutagenesis) to create a library of point mutations in the alpha 1 protein domain of a Major Histocompatibility Complex (MHC) molecule. This protein domain is critical for T cell and B cell recognition. We altered the MHC class I H-2DP gene sequence such that synthetic mutant alpha 1 exons (270 bp of coding sequence), which contain mutations identified by sequence analysis, can replace the wild type alpha 1 exon. The synthetic exons were constructed from twelve overlapping oligonucleotides which contained an average of 1.3 random point mutations per intact exon. DNA sequence analysis of mutant alpha 1 exons has shown a point mutant distribution that fits a Poisson distribution, and thus emphasizes the utility of this mutagenesis technique to "scan" a large protein sequence for important mutations. We report our use of saturation mutagenesis to scan an entire exon of the H-2DP gene, a cassette strategy to replace the wild type alpha 1 exon with individual mutant alpha 1 exons, and analysis of mutant molecules expressed on the surface of transfected mouse L cells.
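
To make the Poisson statement above concrete, the following minimal sketch computes the expected fraction of exons carrying k point mutations when the mean is 1.3 per exon, as reported in the abstract. The function name `poisson_pmf` is an illustrative choice, and the sketch assumes mutations arise independently; it is not the authors' analysis.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of observing exactly k events under a Poisson law."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


# Mean number of random point mutations per intact exon (from the abstract).
MEAN_MUTATIONS_PER_EXON = 1.3

if __name__ == "__main__":
    for k in range(6):
        share = poisson_pmf(k, MEAN_MUTATIONS_PER_EXON)
        print(f"{k} mutation(s): {share:.3f} of exons")
```

Under this assumption, roughly a quarter of the synthetic exons would carry no mutation at all and about a third would be single mutants, which is one way to judge how many clones must be sequenced to cover the exon.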

3.
MOTIVATION: Human single nucleotide polymorphisms (SNPs) are the most frequent type of genetic variation in the human population. One of the most important goals of SNP projects is to understand which human genotype variations are related to Mendelian and complex diseases. Great interest is focused on non-synonymous coding SNPs (nsSNPs), which are responsible for single point mutations in proteins. nsSNPs can be neutral or disease associated. It is known that the mutation of only one residue in a protein sequence can be related to a number of pathological conditions of dramatic social impact such as Alzheimer's, Parkinson's and Creutzfeldt-Jakob's diseases. The quality and completeness of presently available SNP databases allow the application of machine learning techniques to predict the onset of human diseases due to single point protein mutations, starting from the protein sequence. RESULTS: In this paper, we develop a method based on support vector machines (SVMs) that, starting from the protein sequence information, can predict whether a new phenotype derived from an nsSNP can be related to a genetic disease in humans. Using a dataset of 21 185 single point mutations, 61% of which are disease-related, out of 3587 proteins, we show that our predictor can reach more than 74% accuracy in the specific task of predicting whether a single point mutation can be disease related or not. Our method, although based on less information, outperforms other web-available predictors implementing different approaches. AVAILABILITY: A beta version of the web tool is available at http://gpcr.biocomp.unibo.it/cgi/predictors/PhD-SNP/PhD-SNP.cgi

4.
Sequence alignment by cross-correlation.   Cited by: 1 (self-citations: 0, citations by others: 1)
Many recent advances in biology and medicine have resulted from DNA sequence alignment algorithms and technology. Traditional approaches for the matching of DNA sequences are based either on global alignment schemes or heuristic schemes that seek to approximate global alignment algorithms while providing higher computational efficiency. This report describes an approach using the mathematical operation of cross-correlation to compare sequences. It can be implemented using the fast Fourier transform for computational efficiency. The algorithm is summarized and sample applications are given. These include gene sequence alignment in long stretches of genomic DNA, finding sequence similarity in distantly related organisms, demonstrating sequence similarity in the presence of massive (approximately 90%) random point mutations, comparing sequences related by internal rearrangements (tandem repeats) within a gene, and investigating fusion proteins. Application to RNA and protein sequence alignment is also discussed. The method is efficient, sensitive, and robust, being able to find sequence similarities where other alignment algorithms may perform poorly.
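
The cross-correlation idea can be sketched as follows. This is an illustrative reconstruction, not the published algorithm: it assumes a one-hot indicator encoding per base (an assumption; the abstract does not state the encoding), correlates each indicator track with a small pure-Python radix-2 FFT, and sums the four tracks so that each shift receives the number of matching positions. The names `_fft` and `match_counts` are hypothetical.

```python
import cmath

BASES = "ACGT"


def _fft(values, sign):
    """Recursive radix-2 transform; sign=-1 is forward, +1 is the unscaled inverse."""
    n = len(values)
    if n == 1:
        return list(values)
    even = _fft(values[0::2], sign)
    odd = _fft(values[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def match_counts(a, b):
    """Return {shift: number of positions i with a[i + shift] == b[i]}."""
    size = 1
    while size < len(a) + len(b):  # zero-pad so circular wrap-around is harmless
        size *= 2
    totals = [0.0] * size
    for base in BASES:
        track_a = [1.0 if c == base else 0.0 for c in a] + [0.0] * (size - len(a))
        track_b = [1.0 if c == base else 0.0 for c in b] + [0.0] * (size - len(b))
        spec_a = _fft(track_a, -1)
        spec_b = _fft(track_b, -1)
        # Correlation theorem: corr = IFFT(FFT(a) * conj(FFT(b))).
        corr = _fft([x * y.conjugate() for x, y in zip(spec_a, spec_b)], +1)
        for s in range(size):
            totals[s] += corr[s].real / size
    # Negative shifts are stored at the end of the circular result.
    return {s: round(totals[s % size]) for s in range(-(len(b) - 1), len(a))}


if __name__ == "__main__":
    counts = match_counts("GGGGACGTTGCAGGGG", "ACGTTGCA")
    best = max(counts, key=counts.get)
    print("best shift:", best, "matches:", counts[best])
```

The peak of the summed correlation marks the offset at which the two sequences agree most often; because only match counts are used, the peak can remain detectable even when many positions have mutated.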

5.
A L Lu  I C Hsu 《Genomics》1992,14(2):249-255
A novel method for identifying DNA point mutations has been developed by using mismatch repair enzymes. The high specificity of the Escherichia coli MutY protein has permitted the development of a reliable and sensitive method for the detection and characterization of point mutations in the human genome. The MutY protein is involved in a repair pathway that can convert A/G or A/C mismatches to C/G or G/C basepairs, respectively. A/G or A/C mismatches formed by hybridization between two amplified genomic DNA samples or between specific DNA probes and target DNA are nicked at the mispaired adenine strand by MutY protein. As little as 1% of the mutant sequence can be detected by the mismatch repair enzyme cleavage (MREC) method in a mixture of normal and mutated DNAs (e.g., mutant cells are only present in 1% of the normal cell background). By using different probes, the assay also can determine the nucleotide sequence of the mutation. We have applied this method to detect single-base substitutions in human oncogenes.

6.
How are model protein structures distributed in sequence space?   Cited by: 6 (self-citations: 0, citations by others: 6)
The figure-to-structure maps for all uniquely folding sequences of short hydrophobic polar (HP) model proteins on a square lattice are analyzed to investigate aspects considered relevant to evolution. By ranking structures by their frequencies, few very frequent and many rare structures are found. The distribution can be empirically described by a generalized Zipf's law. All structures are relatively compact, yet the most compact ones are rare. Most sequences falling to the same structure belong to "neutral nets." These graphs in sequence space are connected by point mutations and centered around prototype sequences, which tolerate the largest number (up to 55%) of neutral mutations. Profiles have been derived from these homologous sequences. Frequent structures conserve hydrophobic cores only while rare ones are sensitive to surface mutations as well. Shape space covering, i.e., the ability to transform any structure into most others with few point mutations, is very unlikely. It is concluded that many characteristic features of the sequence-to-structure map of real proteins, such as the dominance of few folds, can be explained by the simple HP model. In analogy to protein families, nets are dense and well separated in sequence space. Potential implications in better understanding the evolution of proteins and applications to improving database searches are discussed.

7.
In RNA fitness landscapes with interconnected networks of neutral mutations, neutral precursor mutations can play an important role in facilitating the accessibility of epistatic adaptive mutant combinations. I use an exhaustively surveyed fitness landscape model based on short sequence RNA genotypes (and their secondary structure phenotypes) to calculate the minimum rate at which mutants initially appearing as neutral are incorporated into an adaptive evolutionary walk. I show first, that incorporating neutral mutations significantly increases the number of point mutations in a given evolutionary walk when compared to estimates from previous adaptive walk models. Second, that incorporating neutral mutants into such a walk significantly increases the final fitness encountered on that walk - indeed evolutionary walks including neutral steps often reach the global optimum in this model. Third, and perhaps most importantly, evolutionary paths of this kind are often extremely winding in their nature and have the potential to undergo multiple mutations at a given sequence position within a single walk; the potential of these winding paths to mislead phylogenetic reconstruction is briefly considered.

8.
The distribution of fitness effects (DFEs) of new mutations across different environments quantifies the potential for adaptation in a given environment and its cost in others. So far, results regarding the cost of adaptation across environments have been mixed, and most studies have sampled random mutations across different genes. Here, we quantify systematically how costs of adaptation vary along a large stretch of protein sequence by studying the distribution of fitness effects of the same ≈2,300 amino-acid changing mutations obtained from deep mutational scanning of 119 amino acids in the middle domain of the heat shock protein Hsp90 in five environments. This region is known to be important for client binding, stabilization of the Hsp90 dimer, stabilization of the N-terminal-Middle and Middle-C-terminal interdomains, and regulation of ATPase–chaperone activity. Interestingly, we find that fitness correlates well across diverse stressful environments, with the exception of one environment, diamide. Consistent with this result, we find little cost of adaptation; on average only one in seven beneficial mutations is deleterious in another environment. We identify a hotspot of beneficial mutations in a region of the protein that is located within an allosteric center. The identified protein regions that are enriched in beneficial, deleterious, and costly mutations coincide with residues that are involved in the stabilization of Hsp90 interdomains and stabilization of client-binding interfaces, or residues that are involved in ATPase–chaperone activity of Hsp90. Thus, our study yields information regarding the role and adaptive potential of a protein sequence that complements and extends known structural information.

9.
Ribose-binding protein (RBP) is exported to the periplasm of Escherichia coli via the general export pathway. An rbsB-lacZ gene fusion was constructed and used to select mutants defective in RBP export. The spontaneous Lac+ mutants isolated in this selection contained either single-amino-acid substitutions or a deletion of the RBP signal sequence. Intact rbsB genes containing eight different point mutations in the signal sequence were reconstructed, and the effects of the mutations on RBP export were examined. Most of the mutations caused severe defects in RBP export. In addition, different suppressor mutations in SecY/PrlA protein were analyzed for their effects on the export of RBP signal sequence mutants in the presence or absence of SecB. Several RBP signal sequence mutants were efficiently suppressed, but others were not suppressed. Export of an RBP signal sequence mutant in prlA mutant strains was partially dependent on SecB, which is in contrast to the SecB independence of wild-type RBP export. However, the kinetics of export of an RBP signal sequence mutant point to a rapid loss of pre-RBP export competence, which occurs in strains containing or lacking SecB. These results suggest that SecB does not stabilize the export-competent conformation of RBP and may affect translocation by stabilizing the binding of pre-RBP at the translocation site.

10.
M. S. Ciampi  J. R. Roth 《Genetics》1988,118(2):193-202
A single site in the middle of the coding sequence of the hisG gene of Salmonella is required for most of the polar effect of mutations in this gene. Nonsense and insertion mutations mapping upstream of this point in the hisG gene all have strong polar effects on expression of downstream genes in the operon; mutations mapping promoter-distal to this site have little or no polar effect. Two previously known hisG mutations, mapping in the region of the polarity site, abolish the polarity effect of insertion mutations mapping upstream of this region. New polarity site mutations have been selected which have lost the polar effect of upstream nonsense mutations. All mutations abolishing the function of the site are small deletions; three are identical, 28-bp deletions which have arisen independently. A fourth mutation is a deletion of 16 base pairs internal to the larger deletion. Several point mutations within this 16-bp region have no effect on the function of the polarity site. We believe that a small number of polarity sites of this type are responsible for polarity in all genes. The site in the hisG gene is more easily detected than most because it appears to be the only such site in the hisG gene and because it maps in the center of the coding sequence.

11.
A computer program for the generation and analysis of in silico random point mutagenesis libraries is described. The program operates by mutagenizing an input nucleic acid sequence according to mutation parameters specified by the user for each sequence position and type of point mutation. The program can mimic almost any type of random mutagenesis library, including those produced via error-prone PCR (ep-PCR), mutator Escherichia coli strains, chemical mutagenesis, and doped or random oligonucleotide synthesis. The program analyzes the generated nucleic acid sequences and/or the associated protein library to produce several estimates of library diversity (number of unique sequences, point mutations, and single point mutants) and the rate of saturation of these diversities during experimental screening or selection of clones. This information allows one to select the optimal screen size for a given mutagenesis library, necessary to efficiently obtain a certain coverage of the sequence-space. The program also reports the abundance of each specific protein mutation at each sequence position, which is useful as a measure of the level and type of mutation bias in the library. Alternatively, one can use the program to evaluate the relative merits of preexisting libraries, or to examine various hypothetical mutation schemes to determine the optimal method for creating a library that serves the screen/selection of interest. Simulated libraries of at least 10^9 sequences are accessible by the numerical algorithm with currently available personal computers; an analytical algorithm is also available which can rapidly calculate a subset of the numerical statistics in libraries of arbitrarily large size. A multi-type double-strand stochastic model of ep-PCR is developed in an appendix to demonstrate the applicability of the algorithm to amplifying mutagenesis procedures. Estimators of DNA polymerase mutation-type-specific error rates are derived using the model. Analyses of an alpha-synuclein ep-PCR library and NNS synthetic oligonucleotide libraries are given as examples.
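
A toy version of such a library simulation is sketched below. It is not the program described in the abstract: it assumes a single uniform per-position mutation rate and equal probability for the three alternative bases, whereas the described program accepts position- and type-specific parameters. The function names and the parent sequence are hypothetical.

```python
import random

BASES = "ACGT"


def mutagenize(parent, rate, rng):
    """Return one clone: each position mutates independently with probability `rate`."""
    clone = []
    for base in parent:
        if rng.random() < rate:
            clone.append(rng.choice([b for b in BASES if b != base]))
        else:
            clone.append(base)
    return "".join(clone)


def library_stats(parent, rate, size, seed=0):
    """Simulate `size` clones and summarise the diversity of the library."""
    rng = random.Random(seed)  # seeded so that a run is reproducible
    library = [mutagenize(parent, rate, rng) for _ in range(size)]
    distances = [sum(x != y for x, y in zip(parent, clone)) for clone in library]
    single_mutants = {clone for clone, d in zip(library, distances) if d == 1}
    return {
        "clones": size,
        "unique_sequences": len(set(library)),
        "unique_single_mutants": len(single_mutants),
        "mean_mutations": sum(distances) / size,
    }


if __name__ == "__main__":
    # Hypothetical 30-nt parent; a sequence of length L has 3 * L possible single mutants.
    parent = "ATGGATGTATTCATGAAAGGACTTTCAAAG"
    for screen_size in (100, 1000, 10000):
        print(screen_size, library_stats(parent, rate=0.03, size=screen_size))
```

Running the sketch at increasing screen sizes shows the saturation behaviour the abstract refers to: the count of unique single mutants approaches its ceiling while the number of clones screened keeps growing.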

12.
Procedures to introduce point mutations, restriction sites and insert or delete DNA fragments are very important tools to study protein function. We describe here a two-step PCR-based method for generating single or multiple mutations, insertions and deletions in a small region of the sequence. In the first step, a unique restriction site is introduced near the part of DNA sequence to be changed, without changing the amino acid sequence. For this step, one of the methods already described can be used. In the second step, mutations are introduced using mutagenic primers containing the unique restriction site from the first step at the 5′ end, paired with a universal primer crossing another unique restriction site present originally in the sequence. The method is very simple, economic and rapid. In comparison with the traditional in vitro mutagenesis methods, one can generate large numbers of mutated plasmids in hours.

13.
Congenital adrenal hyperplasia (CAH) due to 21-hydroxylase deficiency is the most frequent inborn error of metabolism, and accounts for 90-95% of CAH cases. The affected enzyme, P450C21, is encoded by the CYP21A2 gene, located on chromosome 6p21.3 together with the CYP21A1P pseudogene, with which it shares 98% nucleotide sequence identity. Even though most patients carry CYP21A1P-derived mutations, an increasing number of novel and rare mutations in disease-causing alleles have been found in recent years. In the present work, we describe five CYP21A2 novel mutations, p.R132C, p.149C, p.M283V, p.E431K and a frameshift g.2511_2512delGG, in four non-classical patients and one salt-wasting patient from Argentina. All novel point mutations are located in CYP21 protein residues that are conserved throughout mammalian species, and none of them were found in control individuals. The putative pathogenic mechanisms of the novel variants were analyzed in silico. A three-dimensional CYP21 structure was generated by homology modeling and the protein design algorithm FoldX was used to calculate changes in stability of CYP21A2 protein. Our analysis revealed changes in protein stability or in the surface charge of the mutant enzymes, which could be related to the clinical manifestation found in patients.

14.
A Monte Carlo simulation based sequence design method is proposed to investigate the role of site-directed point mutations in protein misfolding. Site-directed point mutations are incorporated in the designed sequences of selected proteins. While most mutated sequences correctly fold to their native conformation, some of them stabilize in other nonnative conformations and thus misfold/unfold. The results suggest that a critical number of hydrophobic amino acid residues must be present in the core of the correctly folded proteins, whereas proteins misfold/unfold if this number of hydrophobic residues falls below the critical limit. A protein can accommodate only a particular number of hydrophobic residues at the surface, provided a large number of hydrophilic residues are present at the surface and critical hydrophobicity of the core is preserved. Some surface sites are observed to be equally sensitive toward site-directed point mutations as the core sites. Point mutations with highly polar and charged amino acids increase the misfold/unfold propensity of proteins. Substitution of natural amino acids at sites with different numbers of nonbonded contacts suggests that both amino acid identity and its respective site-specificity determine the stability of a protein. A clash-match method is developed to calculate the number of matching and clashing interactions in the mutated protein sequences. While misfolded/unfolded sequences have a higher number of clashing and a lower number of matching interactions, the correctly folded sequences have a lower number of clashing and a higher number of matching interactions. These results are valid for different SCOP classes of proteins.

15.
We have analyzed what phylogenetic signal can be derived by small subunit rRNA comparison for bacteria of different but closely related genera (enterobacteria) and for different species or strains within a single genus (Escherichia or Salmonella), and finally how similar are the ribosomal operons within a single organism (Escherichia coli). These sequences have been analyzed by neighbor-joining, maximum likelihood, and parsimony. The robustness of each topology was assessed by bootstrap. Sequences were obtained for the seven rrn operons of E. coli strain PK3. These data demonstrated differences located in three highly variable domains. Their nature and localization suggest that since the divergence of E. coli and Salmonella typhimurium, most point mutations that occurred within each gene have been propagated among the gene family by conversions involving short domains, and that homogenization by conversions may not have affected the entire sequence of each gene. We show that the differences that exist between the different operons are ignored when sequences are obtained either after cloning of a single operon or directly from polymerase chain reaction (PCR) products. Direct sequencing of PCR products produces a mean sequence in which mutations present in the most variable domains become hidden. Cloning a single operon results in a sequence that differs from that of the other operons and of the mean sequence by several point mutations. For identification of unknown bacteria at the species level or below, a mean sequence or the sequence of a single nonidentified operon should therefore be avoided. Taking into account the seven operons and therefore mutations that accumulate in the most variable domains would perhaps increase tree resolution. However, if gene conversions that homogenize the rRNA multigene family are rare events, some nodes in phylogenetic trees will reflect these recombination events and these trees may therefore be gene trees rather than organismal trees.

16.
Background: RNAmute is an interactive Java application which, given an RNA sequence, calculates the secondary structure of all single point mutations and organizes them into categories according to their similarity to the predicted structure of the wild type. The secondary structure predictions are performed using the Vienna RNA package. A more efficient implementation of RNAmute is needed, however, to extend from the case of single point mutations to the general case of multiple point mutations, which may often be desired for computational predictions alongside mutagenesis experiments. But analyzing multiple point mutations, a process that requires traversing all possible mutations, becomes highly expensive since the running time is O(n^m) for a sequence of length n with m-point mutations. Using Vienna's RNAsubopt, we present a method that selects only those mutations, based on stability considerations, which are likely to be conformational rearranging. The approach is best examined using the dot plot representation for RNA secondary structure.
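
The cost of exhaustive traversal can be illustrated with a short enumeration. The sketch below is not RNAmute code; it only lists every m-point mutant of a sequence, and the names `m_point_mutants` and `mutant_count` are hypothetical. For an alphabet of four bases the number of mutants is C(n, m) * 3^m, which grows on the order of n^m for fixed m, and each mutant would additionally need a structure prediction.

```python
from itertools import combinations, product
from math import comb


def m_point_mutants(seq, m, alphabet="ACGU"):
    """Yield every sequence that differs from `seq` at exactly m positions."""
    for positions in combinations(range(len(seq)), m):
        # At each chosen position, any base other than the original is allowed.
        choices = [[b for b in alphabet if b != seq[p]] for p in positions]
        for replacement in product(*choices):
            mutant = list(seq)
            for p, b in zip(positions, replacement):
                mutant[p] = b
            yield "".join(mutant)


def mutant_count(n, m, alphabet_size=4):
    """Closed-form number of m-point mutants of a length-n sequence."""
    return comb(n, m) * (alphabet_size - 1) ** m


if __name__ == "__main__":
    for m in (1, 2, 3):
        print(f"n=70, m={m}: {mutant_count(70, m)} mutants to fold")
```

The closed-form count makes it easy to see why a filter that discards most mutants before folding is attractive once m exceeds two or three.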

17.
Capriotti E  Compiani M 《Proteins》2006,64(1):198-209
In this article we use mutation studies as a benchmark for a minimal model of the folding process of helical proteins. The model ascribes a pivotal role to the collisional dynamics of a few crucial residues (foldons) and predicts the folding rates by exploiting information drawn from the protein sequence. We show that our model rationalizes the effects of point mutations on the kinetics of folding. The folding times of two proteins and their mutants are predicted. Stability and location of foldons have a critical role as the determinants of protein folding. This allows us to elucidate two main mechanisms for the kinetic effects of mutations. First, it turns out that the mutations eliciting the most notable effects alter protein stability through stabilization or destabilization of the foldons. Secondly, the folding rate is affected via a modification of the foldon topology by those mutations that lead to the birth or death of foldons. The few mispredicted folding rates of some mutants hint at the limits of the current version of the folding model proposed in the present article. The performance of our folding model declines in case the mutated residues are subject to strong long-range forces. That foldons are the critical targets of mutation studies has notable implications for design strategies and is of particular interest to address the issue of the kinetic regulation of single proteins in the general context of the overall dynamics of the interactome.

18.
The karyophilic protein N1 (590 amino acids) is an abundant soluble protein of the nuclei of Xenopus laevis oocytes where it forms defined complexes with histones H3 and H4. The amino acid sequence of this protein, as deduced from the cDNA, reveals a putative nuclear targeting signal as well as two acidic domains which are candidates for the interaction with histones. Using two different histone binding assays in vitro we have found that the deletion of the larger acidic domain reduces histone binding drastically to a residual value of approximately 15% of the complete molecule, whereas removal of the smaller acidic domain only slightly reduces histone complex formation in solution, but interferes more effectively with binding to immobilized histones. In the primary structure of the protein both histone-binding domains are distant from the conspicuous nuclear accumulation signal sequence (residues 531-537) close to the carboxy terminus which is very similar to the SV40 large T-antigen nuclear targeting sequence. Using a series of N1 mutants altered by deletions or point mutations we show that this signal is required but not sufficient for nuclear accumulation of protein N1. The presence of an additional, more distantly related signal sequence in position 544-554 is also needed to achieve a level of nuclear uptake equivalent to that of the wild-type protein. Results obtained with point mutations support the concept of two nuclear targeting sequences and emphasize the importance of specific lysine and arginine residues in these signal sequences.

19.

One fundamental problem of protein biochemistry is to predict protein structure from amino acid sequence. The inverse problem, predicting either entire sequences or individual mutations that are consistent with a given protein structure, has received much less attention even though it has important applications in both protein engineering and evolutionary biology. Here, we ask whether 3D convolutional neural networks (3D CNNs) can learn the local fitness landscape of protein structure to reliably predict either the wild-type amino acid or the consensus in a multiple sequence alignment from the local structural context surrounding a site of interest. We find that the network can predict wild type with good accuracy, and that network confidence is a reliable measure of whether a given prediction is likely to be correct or not. Predictions of consensus are less accurate and are primarily driven by whether or not the consensus matches the wild type. Our work suggests that high-confidence mis-predictions of the wild type may identify sites that are primed for mutation and likely targets for protein engineering.

20.
The Saccharomyces cerevisiae F1-ATPase beta subunit precursor contains redundant mitochondrial protein import information at its NH2 terminus (D. M. Bedwell, D. J. Klionsky, and S. D. Emr, Mol. Cell. Biol. 7:4038-4047, 1987). To define the critical sequence and structural features contained within this topogenic signal, one of the redundant regions (representing a minimal targeting sequence) was subjected to saturation cassette mutagenesis. Each of 97 different mutant oligonucleotide isolates containing single (32 isolates), double (45 isolates), or triple (20 isolates) point mutations was inserted in front of a beta-subunit gene lacking the coding sequence for its normal import signal (codons 1 through 34 were deleted). The phenotypic and biochemical consequences of these mutations were then evaluated in a yeast strain deleted for its normal beta-subunit gene (delta atp2). Consistent with the lack of an obvious consensus sequence for mitochondrial protein import signals, many mutations occurring throughout the minimal targeting sequence did not significantly affect its import competence. However, some mutations did result in severe import defects. In these mutants, beta-subunit precursor accumulated in the cytoplasm, and the yeast cells exhibited a respiration-defective phenotype. Although point mutations have previously been identified that block mitochondrial protein import in vitro, a subset of the mutations reported here represents the first single missense mutations that have been demonstrated to significantly block mitochondrial protein import in vivo. The previous lack of such mutations in the beta-subunit precursor apparently relates to the presence of redundant import information in this import signal. Together, our mutants define a set of constraints that appear to be critical for normal activity of this (and possibly other) import signals. These include the following: (i) mutant signals that exhibit a hydrophobic moment greater than 5.5 for the predicted amphiphilic alpha-helical conformation of this sequence direct near-normal levels of beta-subunit import, (ii) at least two basic residues are necessary for efficient signal function, (iii) acidic amino acids actively interfere with import competence, and (iv) helix-destabilizing residues also interfere with signal function. These experimental observations provide support for mitochondrial protein import models in which both the structure and charge of the import signal play a critical role in directing mitochondrial protein targeting and import.
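
The hydrophobic moment mentioned in constraint (i) can be sketched with the standard vector-sum definition for a helix, in which successive residues are rotated by about 100 degrees. The hydrophobicity scale below is a hypothetical three-level toy scale chosen only for illustration, and the example peptide is invented; the abstract does not state which scale underlies the threshold of 5.5, so values from this sketch are not comparable with it.

```python
import math


def hydrophobic_moment(sequence, scale, angle_deg=100.0):
    """Unnormalised hydrophobic moment of a sequence modelled as an ideal helix.

    Each residue contributes a vector of length scale[residue] pointing at
    angle i * angle_deg around the helix axis; the moment is the length of
    the vector sum.
    """
    delta = math.radians(angle_deg)
    sin_sum = sum(scale[res] * math.sin(i * delta) for i, res in enumerate(sequence))
    cos_sum = sum(scale[res] * math.cos(i * delta) for i, res in enumerate(sequence))
    return math.hypot(sin_sum, cos_sum)


# Hypothetical toy scale (assumption): hydrophobic = +1, basic = -1, other = 0.
TOY_SCALE = {"L": 1.0, "A": 1.0, "K": -1.0, "R": -1.0, "S": 0.0}

if __name__ == "__main__":
    peptide = "LKSLARLLKSLARL"  # invented example, not the ATP2 signal
    print(f"moment of {peptide}: {hydrophobic_moment(peptide, TOY_SCALE):.2f}")
```

A large moment indicates that hydrophobic and charged residues segregate onto opposite faces of the helix, which is the amphiphilicity the authors relate to import competence.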


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  ICP license no. 京ICP备09084417号