Similar literature
 Found 20 similar documents (search took 15 ms)
1.
Wu G  Yan S 《Peptides》2003,24(3):347-352
In this data-based theoretical analysis, we use the random approach to analyze the amino acid pairs in collagen α5(IV) chain precursor (CA54) in order to determine which amino acid pairs are more sensitive to 151 variants from missense mutant human CA54 protein. The rationale of this study is based on our hypothesis and previous findings that harmful variance is more likely to occur at randomly unpredictable amino acid pair positions than at randomly predictable positions. This is reasonable to argue, as randomly predictable amino acid pairs are less likely to be deliberately evolved, whereas randomly unpredictable amino acid pairs are probably deliberately evolved in connection with protein function. The results show that all 151 variants occurred at randomly unpredictable amino acid pairs and the chance of a variant occurring is markedly higher in randomly unpredictable amino acid pairs than in predictable pairs. Thus, randomly unpredictable amino acid pairs are more sensitive to variance in human CA54. The results also suggest that the human CA54 protein has a natural tendency towards variants.

2.
This is the continuation of our studies using random approaches to analyse the p53 protein family. In this data-based theoretical analysis, we use the random approach to analyse the amino acid pairs in human p53 protein in order to determine which amino acid pairs are more sensitive to 190 human p53 mutations/variants. The rationale of this study is based on our hypothesis and findings that a harmful mutation is more likely to occur at randomly unpredictable amino acid pairs, and a harmless mutation is more likely to occur at randomly predictable amino acid pairs. This is because we argue that the randomly predictable amino acid pairs should not be deliberately evolved, whereas the randomly unpredictable amino acid pairs should be deliberately evolved with a connection to protein function. The results show, for example, that 93.16% of 190 mutations/variants occur at randomly unpredictable amino acid pairs. Thus, the randomly unpredictable amino acid pairs are more sensitive to mutations/variants in human p53 protein. The results also suggest that the human p53 protein has a tendency for the occurrence of mutation/variants.  相似文献   

3.
Wu G  Yan S 《Peptides》2002,23(12):2085-2090
In this data-based theoretical analysis, we use a random approach to estimate amino acid pairs in human phenylalanine 4-hydroxylase (PAH) protein in order to determine which amino acid pairs are more sensitive to 187 variants in human PAH protein. The rationale of this study is based on our hypothesis and previous findings that the harmful variants are more likely to occur at randomly unpredictable amino acid pairs rather than at randomly predictable pairs. This is reasonable to argue as randomly predictable amino acid pairs are less likely to be deliberately evolved, whereas randomly unpredictable amino acid pairs are probably deliberately evolved in connection with protein function. 94.12% of 187 variants occurred at randomly unpredictable amino acid pairs, which accounted for 71.84% of 451 amino acid pairs in human PAH protein. The chance of a variant occurring is five times higher in randomly unpredictable amino acid pairs than in predictable pairs. Thus, randomly unpredictable amino acid pairs are more sensitive to variance in human PAH protein. The results also suggest that the human PAH protein has a natural tendency towards variants.  相似文献   

4.
In this study, we use our probabilistic approach to analyze the amino acid pairs in human copper-transporting ATPase 2 (ATP7B) in order to determine which amino acid pairs are more sensitive to 125 variants from missense mutant human ATP7B. The results show that 97.6% of the 125 variants occur at randomly unpredictable amino acid pairs, which account for 80.9% of amino acid pairs in ATP7B, and that the chance of a variant occurring is about 9 times higher in randomly unpredictable amino acid pairs than in predictable pairs. Thus, the randomly unpredictable amino acid pairs are more sensitive to variants in human ATP7B.

5.
In this study, we used a random approach to determine which amino acid pairs in human coagulation factor IX precursor are more sensitive to its 99 variants. The results show that the randomly unpredictable amino acid pairs are more sensitive to variants.  相似文献   

6.
In this study, we analyze the amino acid pairs in human protein C precursor to determine which amino acid pairs are more susceptible to 71 variants from missense mutant human protein C precursor. The results show 85.92% of 71 variants occur at randomly unpredictable amino acid pairs accounting for 61.96% of amino acid pairs in protein C.  相似文献   

7.
Over the last several years, we have demonstrated that mutations are more likely to occur at randomly unpredictable amino acid pairs in a protein. We can therefore, in principle, predict the amino acid pairs sensitive to future mutations in a protein. However, we still need to predict the positions at which the sensitive amino acid pairs are located in a protein. In this study, we use a probabilistic approach to analyze the effect of 191 mutations in human p53 protein and can approximately estimate the positions sensitive to mutations in human p53 protein.

8.
In this study, we use the random principle to analyse the distributions of amino acids and amino acid pairs in human tumour necrosis factor precursor (TNF-α) and its eight mutations, to compare the measured distribution probability with the theoretical distribution probability and to rank the measured distribution probability against the theoretical distribution probability. In this way, we can suggest that distributions with a high random rank should not be deliberately evolved and conserved and those with a low random rank should be deliberately evolved and conserved in human TNF-α. An increased distribution probability in a mutation means probabilistically that the mutation is more likely to occur spontaneously, whereas a decreased distribution probability in a mutation means probabilistically that the mutation is less likely to occur spontaneously and perhaps is more related to a certain cause. The results, for example, show that the distributions of 30% of the amino acids are identical with their probabilistic simplest distributions, and the distributions of some of the remaining amino acids are very close to their probabilistic simplest distributions. With respect to probabilities of distributions of amino acids in mutations, the results show that mutations lead to an increase in eight probabilities, which are thus more likely to occur. Eight probabilities decrease and are thus less likely to occur. With respect to the random ranks against the theoretical probabilities of distributions of amino acids, the results show that mutations lead to an increase in seven and a decrease in seven probabilities, with two probabilities unchanged.

9.
Wu G  Yan S 《Peptides》2003,24(12):1837-1845
In this study, we analyzed the amino acid pairs affected by mutations in two spike proteins from human coronavirus strains 229E and OC43 by means of random analysis in order to gain some insight into the possible mutations in the spike protein from SARS-CoV. The results demonstrate that the randomly unpredictable amino acid pairs are more sensitive to the mutations. The larger the difference between actual and predicted frequencies, the higher the chance of a mutation occurring. The effect induced by mutations is to reduce the difference between actual and predicted frequencies. The amino acid pairs whose actual frequencies are larger than their predicted frequencies are more likely to be targeted by mutations, whereas the amino acid pairs whose actual frequencies are smaller than their predicted frequencies are more likely to be formed after mutations. These findings are consistent with several of our recent studies, i.e. the mutations represent a process of degeneration inducing human diseases.

10.
The millions of mutations and polymorphisms that occur in human populations are potential predictors of disease, of our reactions to drugs, of predisposition to microbial infections, and of age-related conditions such as impaired brain and cardiovascular functions. However, predicting the phenotypic consequences and eventual clinical significance of a sequence variant is not an easy task. Computational approaches have found perturbation of conserved amino acids to be a useful criterion for identifying variants likely to have phenotypic consequences. To our knowledge, however, no study to date has explored the potential of variants that occur at homologous positions within paralogous human proteins as a means of identifying polymorphisms with likely phenotypic consequences. In order to investigate the potential of this approach, we have assembled a unique collection of known disease-causing variants from OMIM and the Human Genome Mutation Database (HGMD) and used them to identify and characterize pairs of sequence variants that occur at homologous positions within paralogous human proteins. Our analyses demonstrate that the locations of variants are correlated in paralogous proteins. Moreover, if one member of a variant-pair is disease-causing, its partner is likely to be disease-causing as well. Thus, information about variant-pairs can be used to identify potentially disease-causing variants, extend existing procedures for polymorphism prioritization, and provide a suite of candidates for further diagnostic and therapeutic purposes.  相似文献   

11.
Complementary DNA clones for human glucocerebrosidase were isolated from a human hepatoma library in lambda gt11. The complete nucleotide sequence of the 1805-base pair cDNA insert has been determined. In addition to 5' and 3' untranslated regions (51 and 206 base pairs, respectively), the cDNA insert contains 1548 base pairs that completely encode human glucocerebrosidase. All possible N-linked glycosylation sites are identified. Examination of the 19 amino acids of the leader polypeptide beginning with the ATG at position 52 revealed a hydrophobic core and a carboxyl-terminal glycine at the peptidase cleavage site, features consistent with the leader sequences described for other human translocated proteins. The Mr of 57,000 calculated from the 516 amino acids deduced from cDNA sequence is in good agreement with that identified by immunoprecipitation following in vitro translation of human placental mRNA.  相似文献   

12.
The repeated amino-acid sequences in Citrobacter freundii beta-lactamase may be indispensable for its function, because such repetitions cannot be attributed simply to chance. To fully explore the functional units in Citrobacter freundii beta-lactamase, it may be necessary to analyse all the amino acid pairs, triplets, etc. along the protein from one terminus to the other, to count their frequencies and to calculate their probabilities. The amino-acid sequence of Citrobacter freundii beta-lactamase was counted as two-, three- and four-amino-acid sequences. The counted frequency and probability were compared with the predicted frequency and probability. Amino acid sequences that appear in Citrobacter freundii beta-lactamase and can be predicted from its amino acid composition according to a purely random mechanism should not be deliberately evolved and conserved. By contrast, amino acid sequences that appear in Citrobacter freundii beta-lactamase but cannot be predicted from its amino acid composition according to a purely random mechanism should be deliberately evolved and conserved. Accordingly, 99 (26.053%) and 33 (8.684%) of 380 two-amino-acid sequences can be predicted by the frequency and probability, respectively, according to a purely random mechanism. Some kinds of amino acid sequences that are absent from Citrobacter freundii beta-lactamase and can be predicted from its amino acid composition according to a purely random mechanism should not be deliberately excluded from Citrobacter freundii beta-lactamase. By contrast, some kinds of amino acid sequences that are absent from Citrobacter freundii beta-lactamase and cannot be predicted from its amino acid composition according to a purely random mechanism should be deliberately excluded from Citrobacter freundii beta-lactamase. Accordingly, 89 (48.370%) and 41 (22.283%) of 184 kinds of absent two-amino-acid sequences can be predicted by the frequency and probability, respectively, according to a purely random mechanism, and 7236 (99.848%) of 7247 kinds of absent three-amino-acid sequences can be predicted by the frequency according to a purely random mechanism. The amino acids whose probabilities of following certain preceding amino acids can be predicted from the Citrobacter freundii beta-lactamase amino acid composition according to a purely random mechanism should not be deliberately evolved and conserved; accordingly, 2 (0.526%) of 380 counted first-order Markov transition probabilities for the second amino acid in two-amino-acid sequences match the predicted conditional probabilities.
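
The comparison of counted and predicted two-amino-acid frequencies described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, and both the prediction formula (residues drawn at random, without replacement, from the protein's own composition) and the rule that a pair is "predictable" when its counted frequency equals the rounded predicted frequency are assumptions made for the example.

```python
from collections import Counter


def pair_frequencies(seq):
    """Count adjacent (overlapping) two-residue sequences along seq."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def predicted_pair_frequency(seq, pair):
    """Expected count of `pair` under a purely random arrangement.

    Assumption: P(first) = n_a / n, P(second) = n_b / (n - 1), or
    (n_a - 1) / (n - 1) for a doubled residue, multiplied by the
    n - 1 adjacent positions in the sequence.
    """
    n = len(seq)
    composition = Counter(seq)
    first, second = pair
    n_second = composition[second] - 1 if first == second else composition[second]
    return composition[first] * n_second / n


def classify_pairs(seq):
    """Label each observed pair as predictable or unpredictable.

    Assumption: a pair is predictable when its counted frequency
    equals the predicted frequency rounded to the nearest integer.
    """
    labels = {}
    for pair, counted in pair_frequencies(seq).items():
        predicted = round(predicted_pair_frequency(seq, pair))
        labels[pair] = "predictable" if counted == predicted else "unpredictable"
    return labels


if __name__ == "__main__":
    toy_sequence = "ABAB"  # toy input, not a real protein sequence
    for pair, label in sorted(classify_pairs(toy_sequence).items()):
        print(pair, label)
```

The same counting extends to three- and four-residue sequences by widening the slice; only present pairs are classified here, so absent pairs would need a separate pass over all 400 possible combinations.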

13.
Sequence variants in recombinant biopharmaceuticals may have a relevant and unpredictable impact on clinical safety and efficacy. Hence, their sensitive analysis is important throughout bioprocess development. The two-stage analytical approach presented here provides a quick multi-clone comparison of candidate production cell lines as a first stage, followed by an in-depth analysis including identification and quantitation of aberrant sequence variants of selected clones as a second stage. We show that the differential analysis is a suitable tool for sensitive and fast batch-to-batch comparison of recombinant proteins. The optimized approach allows for detection of not only single amino acid substitutions in unmodified peptides, but also substitutions in post-translationally modified peptides such as glycopeptides, for detection of truncated or elongated sequence variants as well as double amino acid substitutions or substitution with amino acid structural isomers within one peptide. In two case studies we were able to detect sequence variants of different origin down to a sub-percentage level. One of the sequence variants (Thr → Asn) could be correlated to a cytosine to adenine substitution at the DNA (deoxyribonucleic acid) level. In the second case we were able to correlate the sub-percentage substitution (Phe → Tyr) to amino acid limitation in the chemically defined fermentation medium.

14.
The 1000 Genomes Project data provides a natural background dataset for amino acid germline mutations in humans. Since the direction of mutation is known, the amino acid exchange matrix generated from the observed nucleotide variants is asymmetric and the mutabilities of the different amino acids are very different. These differences predominantly reflect preferences for nucleotide mutations in the DNA (especially the high mutation rate of the CpG dinucleotide, which makes arginine mutability very much higher than other amino acids) rather than selection imposed by protein structure constraints, although there is evidence for the latter as well. The variants occur predominantly on the surface of proteins (82%), with a slight preference for sites which are more exposed and less well conserved than random. Mutations to functional residues occur about half as often as expected by chance. The disease-associated amino acid variant distributions in OMIM are radically different from those expected on the basis of the 1000 Genomes dataset. The disease-associated variants preferentially occur in more conserved sites, compared to 1000 Genomes mutations. Many of the amino acid exchange profiles appear to exhibit an anti-correlation, with common exchanges in one dataset being rare in the other. Disease-associated variants exhibit more extreme differences in amino acid size and hydrophobicity. More modelling of the mutational processes at the nucleotide level is needed, but these observations should contribute to an improved prediction of the effects of specific variants in humans.  相似文献   

15.
Summary: The distribution of human hemoglobin variants has previously been studied by Vogel (1972) who concluded that the distribution was random although no statistical analysis was presented. This work points out that there are four biases in the data, one in the manner in which the number of variants is counted, another in the method by which they are detected and which favors charge changes, a third in the fact that for a few codons the same amino acid replacement may be brought about by two or three single nucleotide replacements, and a fourth in the non-random sampling procedure which favors variants producing clinical symptoms. Nevertheless, the distribution of beta hemoglobin variants is confirmed to be random as Vogel suggests. The alpha hemoglobin variants are distinctly non-randomly distributed, the best fit requiring that 69 of the alpha positions be considered invariable. The above biases could account for this result but other considerations combine to suggest the following: 1, about half of all alterations of alpha hemoglobin will not survive to sampling whereas nearly all beta variants can; 2, deleterious mutants that survive to sampling but are destined to be eliminated by selection are more likely to be observed in beta than in alpha hemoglobin; and 3, mutations destined to go to fixation are more likely to occur in beta than in alpha hemoglobin.

16.
Human genetic variation is the incarnation of diverse evolutionary history, which reflects both selectively advantageous and selectively neutral change. In this study, we catalogue structural and functional features of proteins that restrain genetic variation leading to single amino acid substitutions. Our variation dataset is divided into three categories: i) Mendelian disease-related variants, ii) neutral polymorphisms and iii) cancer somatic mutations. We characterize structural environments of the amino acid variants by the following properties: i) side-chain solvent accessibility, ii) main-chain secondary structure, and iii) hydrogen bonds from a side chain to a main chain or other side chains. To address functional restraints, amino acid substitutions in proteins are examined to see whether they are located at functionally important sites involved in protein-protein interactions, protein-ligand interactions or catalytic activity of enzymes. We also measure the likelihood of amino acid substitutions and the degree of residue conservation where variants occur. We show that various types of variants are under different degrees of structural and functional restraints, which affect their occurrence in human proteome.  相似文献   

17.
Xi T  Jones IM  Mohrenweiser HW 《Genomics》2004,83(6):970-979
Over 520 different amino acid substitution variants have been previously identified in the systematic screening of 91 human DNA repair genes for sequence variation. Two algorithms were employed to predict the impact of these amino acid substitutions on protein activity. Sorting Intolerant from Tolerant (SIFT) classified 226 of 508 variants (44%) as "Intolerant." Polymorphism Phenotyping (PolyPhen) classed 165 of 489 amino acid substitutions (34%) as "Probably or possibly damaging." Another 9-15% of the variants were classed as "Potentially intolerant or damaging." The results from the two algorithms are highly associated, with concordance in predicted impact observed for approximately 62% of the variants. Twenty-one to thirty-one percent of the variant proteins are predicted to exhibit reduced activity by both algorithms. These variants occur at slightly lower individual allele frequency than do the variants classified as "Tolerant" or "Benign." Both algorithms correctly predicted the impact of 26 functionally characterized amino acid substitutions in the APE1 protein on biochemical activity, with one exception. It is concluded that a substantial fraction of the missense variants observed in the general human population are functionally relevant. These variants are expected to be the molecular genetic and biochemical basis for the associations of reduced DNA repair capacity phenotypes with elevated cancer risk.  相似文献   

18.
In addition to the lysosomal glucocerebrosidase, a distinct β-glucosidase that is also active towards glucosylceramide could be demonstrated in various human tissues and cell types. Subcellular fractionation analysis revealed that the hitherto undescribed glucocerebrosidase is not located in lysosomes but in compartments with a considerably lower density. The non-lysosomal glucocerebrosidase differed in several respects from lysosomal glucocerebrosidase. The non-lysosomal isoenzyme proved to be tightly membrane-bound, whereas lysosomal glucocerebrosidase is weakly membrane-associated. The pH optimum of the non-lysosomal isoenzyme is less acidic than that of lysosomal glucocerebrosidase. Non-lysosomal glucocerebrosidase, in contrast to the lysosomal isoenzyme, was not inhibited by low concentrations of conduritol B-epoxide, was markedly inhibited by taurocholate, was not stimulated in activity by the lysosomal activator protein saposin C, and was not deficient in patients with Gaucher disease. Non-lysosomal glucocerebrosidase proved to be less sensitive to inhibition by castanospermine or deoxynojirimycin but more sensitive to inhibition by D-gluconolactone than the lysosomal glucocerebrosidase. The physiological function of this second, non-lysosomal, glucocerebrosidase is as yet unknown.  相似文献   

19.
To understand how protein segments are inserted and deleted during divergent evolution, a set of pairwise alignments containing exactly one gap, and therefore arising from the first insertion-deletion (indel) event in the time separating the homologs, was examined. The alignments showed that "structure breaking" amino acids (PGDNS) were preferred within and flanking gapped regions, as are two residues with hydrophilic side-chains (QE) that frequently occur at the surface of protein folds. Conversely, hydrophobic residues (FMILYVW) occur infrequently within and flanking the gapped region. These preferences are modestly different in protein pairs separated by an episode of adaptive evolution than in pairs diverging under strong functional constraints. Surprisingly, regions near an indel have not evolved more rapidly than the sequence pair overall, showing no evidence that an indel event must be compensated by local amino acid replacement. The gap lengths are best approximated by a Zipfian distribution, with the probability of a gap of length L decreasing as a function of L^(-1.8). These features are largely independent of the length of the gap and the extent of divergence (measured by both silent and non-silent sequence changes) separating the two proteins. Surprisingly, amino acid repeats were discovered in more than a third of the polypeptide segments in and around the gap. These correspond to repeats in the DNA sequence. This suggests that a signature of the mechanism by which indels occur in the DNA sequence remains in the encoded protein sequences. These data suggest specific tools to score gap placement in an alignment. They also suggest tools that distinguish true indels from gaps created by mistaken gene finding, including under-predicted and over-predicted introns. By providing mechanisms to identify errors, the tools will enhance the value of genome sequence databases in support of integrated paleogenomics strategies used to extract functional information in a post-genomic environment.
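
The Zipfian gap-length statement in this abstract can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the abstract reports the exponent (1.8) but no normalization, so the distribution is truncated at a hypothetical maximum gap length `max_len`, and the function name is invented for the example.

```python
def zipf_gap_probabilities(max_len, exponent=1.8):
    """Return P(L) for L = 1..max_len with P(L) proportional to L ** -exponent.

    Assumption: gap lengths are capped at max_len so that the weights
    can be normalized to sum to 1.
    """
    weights = [length ** -exponent for length in range(1, max_len + 1)]
    total = sum(weights)
    return [weight / total for weight in weights]


if __name__ == "__main__":
    probabilities = zipf_gap_probabilities(max_len=50)
    for length in (1, 2, 5, 10):
        print(length, round(probabilities[length - 1], 4))
```

Under this form the ratio P(1)/P(2) equals 2^1.8 regardless of the cutoff, which is one way such a distribution could feed a length-dependent gap score.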

20.
Saposin C is a sphingolipid activator protein of 8.5 kDa that activates lysosomal glucocerebrosidase. Previously, we synthesized and characterized a synthetic full-length human saposin C protein that displays 85% of the activity of the native saposin C. In this study we use shorter synthetic peptides derived from the saposin C sequence to map binding and activation sites. By determining the activity and kinetic constant (Kact) values of these peptides, we have identified two functional domains, each comprising a binding site adjacent to or partially overlapping with an activation site. Domains 1 and 2 are located within amino acid positions 6-34 and 41-60, respectively. The activation sites span residues 27-34 and 41-49, whereas binding sites encompass residues 6-27 and 45-60. Peptides containing the sequences of either domain displayed 90% of the activity of the full-length synthetic saposin C. Domain 2, however, bound to glucocerebrosidase by at least an order of magnitude more strongly than domain 1. Binding sites within these domains contain sequences that are excellent candidates for forming amphipathic helical structures. Competition assays demonstrated that the binding of one domain to glucocerebrosidase prevents binding of the other domain, and that saposin A and saposin C bind to the same sites on glucocerebrosidase. A model predicting a saposin C:glucocerebrosidase complex with a stoichiometry of 4:2, respectively, is presented.  相似文献   
