Similar articles
 20 similar articles found (search time: 15 ms)
1.
Wu G  Yan S 《Protein engineering》2003,16(3):195-199
In this data-based theoretical analysis, we use the random approach to analyse the amino acid pairs in human beta-glucocerebrosidase in order to determine which amino acid pairs are more sensitive to 109 variants from missense mutant human glucocerebrosidase. The rationale of this study is based on our hypothesis and findings that harmful variants are more likely to occur at randomly unpredictable amino acid pairs and non-harmful variants are more likely to occur at randomly predictable amino acid pairs. This is because we argue that the randomly predictable amino acid pairs should not be deliberately evolved, whereas the randomly unpredictable amino acid pairs should be deliberately evolved in connection with protein function. The results show, for example, that 93.58% of 109 variants occur at randomly unpredictable amino acid pairs, which account for 71.40% of amino acid pairs in glucocerebrosidase, and the chance of a variant occurring is about 4.4 times higher in randomly unpredictable amino acid pairs than in predictable pairs. Hence the randomly unpredictable amino acid pairs are more sensitive to variants in human glucocerebrosidase. The results also suggest that human glucocerebrosidase has a natural tendency towards variants.

2.
Wu G  Yan S 《Peptides》2003,24(3):347-352
In this data-based theoretical analysis, we use the random approach to analyze the amino acid pairs in collagen α5(IV) chain precursor (CA54) in order to determine which amino acid pairs are more sensitive to 151 variants from missense mutant human CA54 protein. The rationale of this study is based on our hypothesis and previous findings that harmful variance is more likely to occur at randomly unpredictable amino acid pair positions than at randomly predictable positions. It is reasonable to argue this, as randomly predictable amino acid pairs are less likely to be deliberately evolved, whereas randomly unpredictable amino acid pairs are probably deliberately evolved in connection with protein function. The results show that all 151 variants occurred at randomly unpredictable amino acid pairs and the chance of a variant occurring is markedly higher in randomly unpredictable amino acid pairs than in predictable pairs. Thus, randomly unpredictable amino acid pairs are more sensitive to variance in human CA54. The results also suggest that the human CA54 protein has a natural tendency towards variants.

3.
This is the continuation of our studies using random approaches to analyse the p53 protein family. In this data-based theoretical analysis, we use the random approach to analyse the amino acid pairs in human p53 protein in order to determine which amino acid pairs are more sensitive to 190 human p53 mutations/variants. The rationale of this study is based on our hypothesis and findings that a harmful mutation is more likely to occur at randomly unpredictable amino acid pairs, and a harmless mutation is more likely to occur at randomly predictable amino acid pairs. This is because we argue that the randomly predictable amino acid pairs should not be deliberately evolved, whereas the randomly unpredictable amino acid pairs should be deliberately evolved with a connection to protein function. The results show, for example, that 93.16% of 190 mutations/variants occur at randomly unpredictable amino acid pairs. Thus, the randomly unpredictable amino acid pairs are more sensitive to mutations/variants in human p53 protein. The results also suggest that the human p53 protein has a tendency for the occurrence of mutation/variants.

4.
In this study, we use our probabilistic approach to analyze the amino acid pairs in human copper-transporting ATPase 2 (ATP7B) in order to determine which amino acid pairs are more sensitive to 125 variants from missense mutant human ATP7B. The results show that 97.6% of 125 variants occur at randomly unpredictable amino acid pairs, which account for 80.9% of amino acid pairs in ATP7B, and the chance of a variant occurring is about 9 times higher in randomly unpredictable amino acid pairs than in predictable pairs. Thus, the randomly unpredictable amino acid pairs are more sensitive to variants in human ATP7B.

5.
In this study, we analyze the amino acid pairs in human protein C precursor to determine which amino acid pairs are more susceptible to 71 variants from missense mutant human protein C precursor. The results show that 85.92% of 71 variants occur at randomly unpredictable amino acid pairs, which account for 61.96% of amino acid pairs in protein C.

6.
Over the last several years, we have demonstrated that mutations are more likely to occur at randomly unpredictable amino acid pairs in a protein. We can therefore, in principle, predict the amino acid pairs sensitive to future mutations in a protein. However, we still need to predict the positions at which the sensitive amino acid pairs are located in a protein. In this study, we use a probabilistic approach to analyze the effect of 191 mutations in human p53 protein and can approximately estimate the positions sensitive to mutations in human p53 protein.

7.
In this study, we used a random approach to determine which amino acid pairs in human coagulation factor IX precursor are more sensitive to its 99 variants. The results show that the randomly unpredictable amino acid pairs are more sensitive to variants.

8.
Wu G  Yan S 《Peptides》2003,24(12):1837-1845
In this study, we analyzed the amino acid pairs affected by mutations in two spike proteins from human coronavirus strains 229E and OC43 by means of random analysis in order to gain some insight into the possible mutations in the spike protein from SARS-CoV. The results demonstrate that the randomly unpredictable amino acid pairs are more sensitive to the mutations. The larger the difference between actual and predicted frequencies, the higher the chance of a mutation occurring. The effect induced by mutations is to reduce the difference between actual and predicted frequencies. The amino acid pairs whose actual frequencies are larger than their predicted frequencies are more likely to be targeted by mutations, whereas the amino acid pairs whose actual frequencies are smaller than their predicted frequencies are more likely to be formed after mutations. These findings are consistent with those of our several recent studies, i.e. the mutations represent a process of degeneration inducing human diseases.

9.
In this study, we use the random principle to analyse the distributions of amino acids and amino acid pairs in human tumour necrosis factor precursor (TNF-α) and its eight mutations, to compare the measured distribution probability with the theoretical distribution probability and to rank the measured distribution probability against the theoretical distribution probability. In this way, we can suggest that distributions with a high random rank should not be deliberately evolved and conserved and those with a low random rank should be deliberately evolved and conserved in human TNF-α. An increased distribution probability in a mutation means probabilistically that the mutation is more likely to occur spontaneously, whereas a decreased distribution probability in a mutation means probabilistically that the mutation is less likely to occur spontaneously and perhaps is more related to a certain cause. The results, for example, show that the distributions of 30% of the amino acids are identical with their probabilistic simplest distributions, and the distributions of some of the remaining amino acids are very close to their probabilistic simplest distributions. With respect to probabilities of distributions of amino acids in mutations, the results show that mutations lead to an increase in eight probabilities, which are thus more likely to occur. Eight probabilities decrease and are thus less likely to occur. With respect to the random ranks against the theoretical probabilities of distributions of amino acids, the results show that mutations lead to an increase in seven and a decrease in seven probabilities, with two probabilities unchanged.

10.
The repeated amino acid sequences in Citrobacter freundii beta-lactamase may be indispensable for its function, because such repetitions cannot simply be attributed to chance. To fully explore the functional units in Citrobacter freundii beta-lactamase, it may be necessary to analyse all the amino acid pairs, triplets, etc. along the protein from one terminus to the other, to count their frequencies and to calculate their probabilities. The amino acid sequence of Citrobacter freundii beta-lactamase was counted according to two-, three- and four-amino-acid sequences. The counted frequency and probability were compared with the predicted frequency and probability. The amino acid sequences that appear in Citrobacter freundii beta-lactamase and can be predicted from its amino acid composition according to a purely random mechanism should not be deliberately evolved and conserved. By contrast, the amino acid sequences that appear in Citrobacter freundii beta-lactamase but cannot be predicted from its amino acid composition according to a purely random mechanism should be deliberately evolved and conserved. Accordingly, 99 (26.053%) and 33 (8.684%) of 380 two-amino-acid sequences can be predicted by the frequency and probability, respectively, according to a purely random mechanism. Some kinds of amino acid sequences that are absent in Citrobacter freundii beta-lactamase and can be predicted from its amino acid composition according to a purely random mechanism should not be deliberately excluded from Citrobacter freundii beta-lactamase. By contrast, some kinds of amino acid sequences that are absent in Citrobacter freundii beta-lactamase and cannot be predicted from its amino acid composition according to a purely random mechanism should be deliberately excluded from Citrobacter freundii beta-lactamase. Accordingly, 89 (48.370%) and 41 (22.283%) of 184 kinds of absent two-amino-acid sequences can be predicted by the frequency and probability, respectively, according to a purely random mechanism, and 7236 (99.848%) of 7247 kinds of absent three-amino-acid sequences can be predicted by the frequency according to a purely random mechanism. The amino acids whose probabilities of following certain preceding amino acids can be predicted from the Citrobacter freundii beta-lactamase amino acid composition according to a purely random mechanism should not be deliberately evolved and conserved; accordingly, 2 (0.526%) of 380 counted first-order Markov transition probabilities for the second amino acid in two-amino-acid sequences match the predicted conditional probabilities.
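The comparison of counted and predicted pair frequencies described above can be sketched in Python. This is only an illustration under stated assumptions: the abstract does not give the formula, so the sketch assumes that the predicted frequency of a pair is its expected count under a random ordering of the protein's own residues, rounded to the nearest integer, and that a pair is "predictable" when the counted and predicted frequencies agree. The function name `pair_predictability` and the toy sequence are hypothetical.

```python
from collections import Counter


def pair_predictability(seq: str) -> dict:
    """Compare counted and predicted frequencies of adjacent residue pairs.

    Assumption (not stated in the abstract): the predicted frequency of
    pair "ab" is its expected count when the residues of `seq` are
    arranged at random, rounded to the nearest integer.
    """
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two residues")
    composition = Counter(seq)
    counted = Counter(seq[i:i + 2] for i in range(n - 1))

    result = {}
    for pair, observed in counted.items():
        a, b = pair
        # One residue of type a is already used when a == b.
        remaining_b = composition[b] - (1 if a == b else 0)
        # P(a) * P(b | a) summed over the n - 1 adjacent positions.
        expected = (composition[a] / n) * (remaining_b / (n - 1)) * (n - 1)
        predicted = round(expected)
        result[pair] = (observed, predicted, observed == predicted)
    return result


if __name__ == "__main__":
    # Toy sequence, purely illustrative.
    for pair, (obs, pred, ok) in sorted(pair_predictability("AABAB").items()):
        label = "predictable" if ok else "unpredictable"
        print(pair, obs, pred, label)
```

Under these assumptions, the share of pair types flagged as predictable plays the role of figures such as "99 of 380" in the abstract, although the authors' exact procedure may differ.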

11.
Two species of folate binding protein (FBP), an integral membrane-associated form and a soluble secreted form, have been previously purified from cultured human KB cells. The complete nucleotide sequence of the complementary DNA (cDNA) clone for the coding region of the mature membrane-associated FBP has now been determined, and the deduced amino acid sequence has been computer-analyzed for a prediction of the secondary structure of the protein. The clone has 857 nucleotides of which 678 comprise the coding region for 226 amino acids. The deduced amino acid sequence contains the identical sequence of the published 18 NH2-terminal amino acids of the purified FBP from KB cells and the published partial amino acid sequence of the human milk FBP except for 1 residue. There was also over 90% homology with the published amino acid sequence of the bovine milk FBP. A total of 16 cysteine residues has been conserved in the three proteins, indicating that this amino acid may provide a tertiary structure which is required for its ligand binding function. Northern blot analysis using the cDNA probe identified a single band of 1.28-kilobase pair mRNA in KB cells which was 4.7-fold more intense in folate-depleted cells than in normal cells. These results indicate that the membrane FBP and the soluble FBP in the medium are translation products of the same gene. Computer analysis of the deduced amino acid sequence indicates that there is only one stretch of amino acids of sufficient hydrophobicity and length to span the lipid bilayer of the plasma membrane, but it lacked a predictable helical structure. Those regions of the sequence which did have a predictable helical structure lacked the hydrophobicity required for a membrane anchor. Thus, it is likely that the fatty acids previously reported to be present in the membrane-associated FBP from these cells, rather than a peptide sequence, provide an important membrane anchoring function.

12.
Xi T  Jones IM  Mohrenweiser HW 《Genomics》2004,83(6):970-979
Over 520 different amino acid substitution variants have been previously identified in the systematic screening of 91 human DNA repair genes for sequence variation. Two algorithms were employed to predict the impact of these amino acid substitutions on protein activity. Sorting Intolerant from Tolerant (SIFT) classified 226 of 508 variants (44%) as "Intolerant." Polymorphism Phenotyping (PolyPhen) classed 165 of 489 amino acid substitutions (34%) as "Probably or possibly damaging." Another 9-15% of the variants were classed as "Potentially intolerant or damaging." The results from the two algorithms are highly associated, with concordance in predicted impact observed for approximately 62% of the variants. Twenty-one to thirty-one percent of the variant proteins are predicted to exhibit reduced activity by both algorithms. These variants occur at slightly lower individual allele frequency than do the variants classified as "Tolerant" or "Benign." Both algorithms correctly predicted the impact of 26 functionally characterized amino acid substitutions in the APE1 protein on biochemical activity, with one exception. It is concluded that a substantial fraction of the missense variants observed in the general human population are functionally relevant. These variants are expected to be the molecular genetic and biochemical basis for the associations of reduced DNA repair capacity phenotypes with elevated cancer risk.
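The concordance figure quoted above can be read as the fraction of variants on which two predictors give the same call. The Python sketch below shows that calculation on made-up labels; the function name `concordance`, the label strings, and the example data are hypothetical and are not taken from the study, which may define concordance over its own category scheme.

```python
def concordance(calls_a: list, calls_b: list) -> float:
    """Fraction of variants for which two predictors give the same call."""
    if len(calls_a) != len(calls_b):
        raise ValueError("both predictors must score the same variants")
    if not calls_a:
        raise ValueError("at least one variant is required")
    agreements = sum(a == b for a, b in zip(calls_a, calls_b))
    return agreements / len(calls_a)


if __name__ == "__main__":
    # Hypothetical calls for four variants, for illustration only.
    predictor_1 = ["damaging", "tolerated", "tolerated", "damaging"]
    predictor_2 = ["damaging", "tolerated", "damaging", "damaging"]
    print(f"concordance = {concordance(predictor_1, predictor_2):.2f}")
```

In practice the two tools use different category names, so their calls would first need to be mapped onto a shared scheme before being compared this way.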

13.
14.
Genomic DNA sequence for human C-reactive protein   Total citations: 12 (self-citations: 0, citations by others: 12)
The gene for the prototype acute phase reactant, C-reactive protein, has been isolated from two lambda phage libraries containing inserted human DNA fragments using synthetic oligonucleotide probes. Nucleotide sequence analysis indicates that after coding for a signal peptide of 18 amino acids and the first two amino acids of the mature protein, there is an intron of 278 base pairs followed by the nucleotide sequence for the remaining 204 amino acids. The intron is unusual in that it contains on the positive strand a poly(A) stretch 16 nucleotides long and a poly(GT) region 30 nucleotides long which could adopt the Z-form of DNA. The nucleotide sequence reported here confirms the amino acid sequence of mature C-reactive protein as originally reported except that it codes for an additional 19 amino acids beginning at position 62. Thus DNA sequence analysis predicts that the mature protein consists of 206 amino acids rather than 187 as originally reported. The mRNA cap site is located 104 nucleotides from the start of the signal peptide and there is a 3' noncoding region 1.2 kilobase pairs in length. The gene has a typical promoter containing the sequences TATAAAT and CAAT 29 and 81 base pairs upstream, respectively, of the cap site.

15.
Recent advances in DNA sequencing techniques have identified rare single‐nucleotide variants with less than 1% minor allele frequency. Despite the growing interest and physiological importance of rare variants in genome sciences, less attention has been paid to the allele frequency of variants in protein sciences. To elucidate the characteristics of genetic variants on protein interaction sites, from the viewpoints of the allele frequency and the structural position of variants, we mapped about 20,000 human SNVs onto protein complexes. We found that variants are less abundant in protein interfaces, and specifically the core regions of interfaces. The tendency to “avoid” the interfacial core is stronger among common variants than rare variants. As amino acid substitutions, the trend of mutating amino acids among rare variants is consistent in different interfacial regions, reflecting the fact that rare variants result from random mutations in DNA sequences, whereas amino acid changes of common variants vary between the interfacial core and rim regions, possibly due to functional constraints on proteins. This study illustrated how the allele frequency of variants relates to the protein structural regions and the functional sites in general and will lead to deeper understanding of the potential deleteriousness of rare variants at the structural level. Exceptional cases of the observed trends will shed light on the limitations of structural approaches to evaluate the functional impacts of variants.

16.
We describe the nucleotide sequences of several overlapping cDNA clones specific for human glutaminyl-tRNA synthetase. The identified open reading frame indicates that the enzyme is composed of 1440 amino acids. A stretch of about 360 amino acids of the human enzyme is highly conserved in bacterial and yeast glutaminyl-tRNA synthetases. However, the human enzyme is three times larger than the bacterial and twice as large as the yeast enzyme suggesting that a considerable part of human glutaminyl-tRNA synthetase has evolved to perform functions other than the charging of tRNA. The sequence outside of the conserved core region includes three 57-amino acid repeats followed by a consecutive stretch of 11 charged amino acids. A computer assisted search of two protein data banks reveals that the human glutaminyl-tRNA synthetase shares small blocks of amino acid similarities with several other synthetases of different amino acid specificities. Interestingly, the enzyme also possesses some regions of similarities with eukaryotic translation elongation factor EF-1 but not with any other sequence stored in the protein data banks. The coding regions of human and mouse glutaminyl-tRNA synthetase cDNAs are identical at 94% of the codons. However, the 3'-noncoding regions of mouse and human mRNAs are more divergent (approximately 68%) but both possess the potential to form stable secondary structures of similar general architecture.

17.
The various roles that aggregation prone regions (APRs) are capable of playing in proteins are investigated here via comprehensive analyses of multiple non-redundant datasets containing randomly generated amino acid sequences, monomeric proteins, intrinsically disordered proteins (IDPs) and catalytic residues. Results from this study indicate that the aggregation propensities of monomeric protein sequences have been minimized compared to random sequences with uniform and natural amino acid compositions, as observed by a lower average aggregation propensity and fewer APRs that are shorter in length and more often punctuated by gate-keeper residues. However, evidence for evolutionary selective pressure to disrupt these sequence regions among homologous proteins is inconsistent. APRs are less conserved than average sequence identity among closely related homologues (≥80% sequence identity with a parent) but APRs are more conserved than average sequence identity among homologues that have at least 50% sequence identity with a parent. Structural analyses of APRs indicate that APRs are three times more likely to contain ordered versus disordered residues and that APRs frequently contribute more towards stabilizing proteins than equal length segments from the same protein. Catalytic residues and APRs were also found to be in structural contact significantly more often than expected by random chance. Our findings suggest that proteins have evolved by optimizing their risk of aggregation for cellular environments by both minimizing aggregation prone regions and by conserving those that are important for folding and function. In many cases, these sequence optimizations are insufficient to develop recombinant proteins into commercial products. Rational design strategies aimed at improving protein solubility for biotechnological purposes should carefully evaluate the contributions made by candidate APRs, targeted for disruption, towards protein structure and activity.

18.
Indels in the coding regions of a gene can either cause frameshifts or amino acid insertions/deletions. Frameshifting indels are indels that have a length that is not divisible by 3 and subsequently cause frameshifts. Indels that have a length divisible by 3 cause amino acid insertions/deletions or block substitutions; we call these 3n indels. The new amino acid changes resulting from 3n indels could potentially affect protein function. Therefore, we construct a SIFT Indel prediction algorithm for 3n indels which achieves 82% accuracy, 81% sensitivity, 82% specificity, 82% precision, 0.63 MCC, and 0.87 AUC by 10-fold cross-validation. We have previously published a prediction algorithm for frameshifting indels. The rules for the prediction of 3n indels are different from the rules for the prediction of frameshifting indels and reflect the biological differences of these two different types of variations. SIFT Indel was applied to human 3n indels from the 1000 Genomes Project and the Exome Sequencing Project. We found that common variants are less likely to be deleterious than rare variants. The SIFT indel prediction algorithm for 3n indels is available at http://sift-dna.org/
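The length rule that separates frameshifting indels from 3n indels, and the standard definitions of the performance measures listed above, can be sketched in Python. This is not the SIFT Indel algorithm itself. The function names and the confusion-matrix counts in the usage example are hypothetical, chosen only to illustrate the formulas, and are not data from the study.

```python
import math


def classify_indel(length: int) -> str:
    """Label a coding indel by whether its length is divisible by 3."""
    if length <= 0:
        raise ValueError("indel length must be a positive number of bases")
    return "3n" if length % 3 == 0 else "frameshift"


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard metrics for a binary classifier from confusion-matrix counts."""
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
    }


if __name__ == "__main__":
    print(classify_indel(6), classify_indel(4))
    # Hypothetical counts, for illustration only.
    for name, value in binary_metrics(tp=81, fp=18, tn=82, fn=19).items():
        print(f"{name}: {value:.3f}")
```

The AUC reported in the abstract is not covered here, since it depends on ranked prediction scores rather than on a single confusion matrix.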

19.
E Méndez  C F Arias  S López 《Journal of virology》1996,70(2):1218-1222
The infection of target cells by most animal rotavirus strains requires the presence of sialic acids (SAs) on the cell surface. We recently isolated variants from simian rotavirus RRV whose infectivity is no longer dependent on SAs and showed that the mutant phenotype segregates with the gene coding for VP4, one of the two surface proteins of rotaviruses (the other one being VP7). The nucleotide sequence of the VP4 gene of four independently isolated variants showed three amino acid changes, at positions 37 (Leu to Pro), 187 (Lys to Arg), and 267 (Tyr to Cys), in all mutant VP4 proteins compared with RRV VP4. The characterization of revertant viruses from two independent mutants showed that the arginine residue at position 187 changed back to lysine, indicating that this amino acid is involved in the determination of the mutant phenotype. Surprisingly, sequence analysis of reassortant virus DS1XRRV, which depends on SAs to infect the cell, showed that its VP4 gene is identical to the VP4 gene of the variants. Since the only difference between DS1XRRV and the RRV variants is the parental origin of the VP7 gene (human rotavirus DS1 in the reassortant), these findings suggest that the receptor-binding specificity of rotaviruses, via VP4, may be influenced by the associated VP7 protein.

20.
Hagos Y  Bahn A  Asif AR  Krick W  Sendler M  Burckhardt G 《Biochimie》2002,84(12):29-1224
A pig kidney cDNA library was screened for the porcine ortholog of the multispecific organic anion transporter 1 (pOAT1). Several positive clones were isolated resulting in two alternatively spliced cDNA clones of pOAT1 (pOAT1 and pOAT1A). pOAT1-cDNAs consist of 2126 or 1895 base pairs (EMBL Acc. No. AJ308234 and AJ308235) encoding 547 or 533 amino acid residue proteins with 89, 87, 83 and 81% homology to the human, rabbit, rat, and mouse OAT1, respectively. Heterologous expression of pOAT1 in Xenopus laevis oocytes revealed an apparent K(m) for [3H]PAH of 3.75 ± 1.6 µM. [3H]PAH uptake mediated by pOAT1 was abolished by 0.5 mM glutarate or 1 mM probenecid. Functional characterization of pOAT1A did not show any affinity for [3H]PAH. In summary, we cloned two alternative splice variants of the pig ortholog of organic anion transporter 1. One splice form (pOAT1) showed typical functional characteristics of organic anion transporter 1, whereas the second form appears not to transport PAH.
