Similar articles
 Found 20 similar articles (search time: 46 ms)
1.
For applications such as comparative modelling one major issue is the reliability of sequence alignments. Reliable regions in alignments can be predicted using sub-optimal alignments of the same pair of sequences. Here we show that reliable regions in alignments can also be predicted from multiple sequence profile information alone. Alignments were created for a set of remotely related pairs of proteins using five different test methods. Structural alignments were used to assess the quality of the alignments and the aligned positions were scored using information from the observed frequencies of amino acid residues in sequence profiles pre-generated for each template structure. High-scoring regions of these profile-derived alignment scores were a good predictor of reliably aligned regions. These profile-derived alignment scores are easy to obtain and are applicable to any alignment method. They can be used to detect those regions of alignments that are reliably aligned and to help predict the quality of an alignment. For those residues within secondary structure elements, the regions predicted as reliably aligned agreed with the structural alignments for between 92% and 97.4% of the residues. In loop regions just under 92% of the residues predicted to be reliable agreed with the structural alignments. The percentage of residues predicted as reliable ranged from 32.1% for helix residues to 52.8% for strand residues. This information could also be used to help predict conserved binding sites from sequence alignments. Residues in the template that were identified as binding sites, that aligned to an identical amino acid residue and where the sequence alignment agreed with the structural alignment were in highly conserved, high scoring regions over 80% of the time.
This suggests that many binding sites that are present in both target and template sequences are in sequence-conserved regions and that there is the possibility of translating reliability to binding site prediction.

2.
3.
Evolution of the cytoskeleton   Cited by: 1 (self-citations: 0, citations by others: 1)
The eukaryotic cytoskeleton appears to have evolved from ancestral precursors related to prokaryotic FtsZ and MreB. FtsZ and MreB show 40-50% sequence identity across different bacterial and archaeal species. Here I suggest that this represents the limit of divergence that is consistent with maintaining their functions for cytokinesis and cell shape. Previous analyses have noted that tubulin and actin are highly conserved across eukaryotic species, but so divergent from their prokaryotic relatives as to be hardly recognizable from sequence comparisons. One suggestion for this extreme divergence of tubulin and actin is that it occurred as they evolved very different functions from FtsZ and MreB. I will present new arguments favoring this suggestion, and speculate on pathways. Moreover, the extreme conservation of tubulin and actin across eukaryotic species is not due to an intrinsic lack of variability, but is attributed to their acquisition of elaborate mechanisms for assembly dynamics and their interactions with multiple motor and binding proteins. A new structure-based sequence alignment identifies amino acids that are conserved from FtsZ to tubulins. The highly conserved amino acids are not those forming the subunit core or protofilament interface, but those involved in binding and hydrolysis of GTP.

4.
5.
Chung JL, Wang W, Bourne PE. Proteins, 2006, 62(3): 630-640
A rapid increase in the number of experimentally derived three-dimensional structures provides an opportunity to better understand and subsequently predict protein-protein interactions. In this study, structurally conserved residues were derived from multiple structure alignments of the individual components of known complexes and the assigned conservation score was weighted based on the crystallographic B factor to account for the structural flexibility that will result in a poor alignment. Sequence profile and accessible surface area information was then combined with the conservation score to predict protein-protein binding sites using a Support Vector Machine (SVM). The incorporation of the conservation score significantly improved the performance of the SVM. About 52% of the binding sites were precisely predicted (greater than 70% of the residues in the site were identified); 77% of the binding sites were correctly predicted (greater than 50% of the residues in the site were identified), and 21% of the binding sites were partially covered by the predicted residues (some residues were identified). The results support the hypothesis that in many cases protein interfaces require some residues to provide rigidity to minimize the entropic cost upon complex formation.

6.
7.
8.
The chaperonin HSP60 (GroEL) proteins are essential in eubacterial genomes and in eukaryotic organelles. Functional regions inferred from mutation studies and the Escherichia coli GroEL 3D crystal complexes are evaluated in a multiple alignment across 43 diverse HSP60 sequences, centering on ATP/ADP and Mg2+ binding sites, on residues interacting with substrate, on GroES contact positions, on interface regions between monomers and domains, and on residues important in allosteric conformational changes. The most evolutionarily conserved residues relate to the ATP/ADP and Mg2+ binding sites. Hydrophobic residues that contribute to substrate binding are also significantly conserved. A large number of charged residues line the central cavity of the GroEL-GroES complex in the substrate-releasing conformation. These span statistically significant intra- and inter-monomer three-dimensional (3D) charge clusters that are highly conserved among sequences and presumably play an important role in interacting with the substrate. Unaligned short segments between blocks of alignment are generally exposed at the outside wall of the Anfinsen cage complex. The multiple alignment reveals regions of divergence common to specific evolutionary groups. For example, rickettsial sequences diverge in the ATP/ADP binding domain and gram-positive sequences diverge in the allosteric transition domain. The evolutionary information of the multiple alignment proffers attractive sites for mutational studies.

9.
D'Amico S, Gerday C, Feller G. Gene, 2000, 253(1): 95-105
The alpha-amylase sequences contained in databanks were screened for the presence of amino acid residues Arg195, Asn298 and Arg/Lys337 forming the chloride-binding site of several specialized alpha-amylases allosterically activated by this anion. This search yielded 38 alpha-amylases potentially binding a chloride ion. They occur in animals, including mammals, birds, insects, acari, nematodes, molluscs and crustaceans, and also in three extremophilic Gram-negative bacteria. An evolutionary distance tree based on complete amino acid sequences was constructed, revealing four distinct clusters of species. On the basis of multiple sequence alignment and homology modeling, invariable structural elements were defined, corresponding to the active site, the substrate binding site, the accessory binding sites, the Ca2+ and Cl- binding sites, a protease-like catalytic triad and disulfide bonds. The sequence variations within functional elements allowed engineering strategies to be proposed, aimed at identifying and modifying the specificity, activity and stability of chloride-dependent alpha-amylases.

10.
Seven highly conserved regions were found in caldesmon molecules from various sources using the multiple sequence alignment method. Their localization coincides with regions where the binding sites to other proteins were postulated. Less conserved and highly divergent regions of the sequences are described as well. These results could refine the planning of caldesmon gene manipulations and accelerate the precise localization of binding sites in the caldesmon molecule and, as a consequence, this could help to elucidate its function in smooth muscle contraction.

11.
We aligned published sequences for the U3 region of 35 type C mammalian retroviruses. The alignment reveals that certain sequence motifs within the U3 region are strikingly conserved. A number of these motifs correspond to previously identified sites. In particular, we found that the enhancer region of most of the viruses examined contains a binding site for leukemia virus factor b, a viral corelike element, the consensus motif for nuclear factor 1, and the glucocorticoid response element. Most viruses containing more than one copy of enhancer sequences include these binding sites in both copies of the repeat. We consider this set of binding sites to constitute a framework for the enhancers of this set of viruses. Other highly conserved motifs in the U3 region include the retrovirus inverted repeat sequence, a negative regulatory element, and the CCAAT and TATA boxes. In addition, we identified two novel motifs in the promoter region that were exceptionally highly conserved but have not been previously described.

12.
13.
14.
Calbindin D28 cDNA clones were isolated from a rat brain library using a chicken intestinal Calbindin D28 cDNA probe. Nucleotide sequence analysis of these clones shows an open reading frame of 783 nucleotides coding for a 261 amino acid, 29,994 dalton protein. The predicted amino acid sequence contains six repeats of a domain with the features of an EF-hand calcium binding site. In domains II and VI, two of the five oxygen-containing amino acids important for the coordination of calcium are absent, suggesting that these two sites have lost their calcium-binding capability. Compared with the amino acid sequence recently reported for chicken Calbindin D28, there is 79% homology. Tolerating conservative differences, the homology increases to 93%. Interestingly, domains II and VI, which have presumably lost their calcium binding ability, are highly conserved between the two species (81% and 78%, respectively). Since an EF-hand calcium binding site requires only certain types of amino acids at certain positions, rather than a specific amino acid sequence, maintaining a calcium binding site is a weak conservation pressure. To explain the high degree of homology of rat and chicken Calbindin D28, and in particular the conservation of the two degenerated domains over the 300 million years since divergence of birds and mammals, additional function(s) of Calbindin D28 are postulated.

15.
The Polycomb Response Element (PRE) is the nucleation site for the Polycomb silencing complexes. The sequences responsible for the recruitment of the components of the Polycomb complex are not well understood. A comparison of the bxd PRE sequences from several different Drosophila species shows that some changes have occurred during phylogeny but large blocks of sequence are conserved after a divergence of some 60 million years. We compare the PRE sequences, the sites of some known PRE binding proteins, and the conservation of DNase I hypersensitive sites, and relate them to the sequence of the Ultrabithorax promoter which these PREs regulate.

16.
Molecular sequences provide a rich source of data for inferring the phylogenetic relationships among species. However, recent work indicates that even an accurate multiple alignment of a large sequence set may yield an incorrect phylogeny and that the quality of the phylogenetic tree improves when the input consists only of the highly conserved, motif regions of the alignment. This work introduces two methods of producing multiple alignments that include only the conserved regions of the initial alignment. The first method retains conserved motifs, whereas the second retains individual conserved sites in the initial alignment. Using parsimony analysis on a mitochondrial data set containing 19 species among which the phylogenetic relationships are widely accepted, both conserved alignment methods produce better phylogenetic trees than the complete alignment. Unlike any of the 19 inference methods used before to analyze these data, both methods produce trees that are completely consistent with the known phylogeny. The motif-based method employs far fewer alignment sites for comparable error rates. For a larger data set containing mitochondrial sequences from 39 species, the site-based method produces a phylogenetic tree that is largely consistent with known phylogenetic relationships and suggests several novel placements. J. Exp. Zool. (Mol. Dev. Evol.) 285:128-139, 1999.
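
The site-based idea in the abstract above (keep only individual conserved sites of an initial alignment) can be illustrated with a minimal sketch. The abstract does not state the conservation criterion, so the `min_fraction` threshold, the rule of skipping gap-containing columns, and the function names `conserved_columns` and `filter_alignment` are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def conserved_columns(alignment, min_fraction=1.0):
    """Return indices of gap-free columns whose most common residue
    reaches at least min_fraction of the sequences (assumed criterion)."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    keep = []
    for i in range(length):
        column = [seq[i] for seq in alignment]
        if "-" in column:
            continue  # assumption: columns containing gaps are never retained
        top_count = Counter(column).most_common(1)[0][1]
        if top_count / len(column) >= min_fraction:
            keep.append(i)
    return keep


def filter_alignment(alignment, columns):
    """Build a reduced alignment containing only the selected columns."""
    return ["".join(seq[i] for i in columns) for seq in alignment]


# Toy alignment (hypothetical data, not from the cited study).
toy = ["ACGTAC", "ACGAAC", "ACG-AC", "ACGTAC"]
cols = conserved_columns(toy)
reduced = filter_alignment(toy, cols)
```

In practice a threshold below 1.0 would be needed, since fully invariant columns carry no information for parsimony analysis; the strict default here only keeps the toy example easy to follow.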

17.
18.
19.
Several studies have demonstrated high levels of sequence conservation in noncoding DNA compared between two species (e.g., human and mouse), and interpreted this conservation as evidence for functional constraints. If this interpretation is correct, it suggests the existence of a hidden class of abundant regulatory elements. However, much of the noncoding sequence conserved between two species may result from chance or from small-scale heterogeneity in mutation rates. Stronger inferences are expected from sequence comparisons using more than two taxa, and by testing for spatial patterns of conservation in addition to primary sequence similarity. We used a Bayesian local alignment method to compare approximately 10 kb of intron sequence from nine genes in a pairwise manner between human, whale, and seal to test whether the degree and pattern of conservation are consistent with neutral divergence. Comparison of the three sets of conserved gapless pairwise blocks revealed the following patterns: The proportion of identical intron nucleotides averaged 47% in pairwise comparisons and 28% across the three taxa. Proportions of conserved sequence were similar in unique sequence and general mammalian repetitive elements. We simulated sequence evolution under a neutral model using published estimates of substitution rate heterogeneity for noncoding DNA and found pairwise identity at 33% and three-taxon identity at 16% of nucleotide sites. Spatial patterns of primary sequence conservation were also nonrandomly distributed within introns. Overall, segments of intron sequence closer to flanking exons were significantly more conserved than interior intron sequence. This level of intron sequence conservation is above that expected by chance and strongly suggests that intron sequences are playing a larger functional role in gene regulation than previously realized.
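
The pairwise and three-taxon identity proportions quoted in the abstract above are fractions of aligned sites at which all compared sequences carry the same nucleotide. A minimal sketch of that calculation follows, assuming gapless, equal-length aligned blocks; the function name `identity_fraction` and the toy sequences are hypothetical and do not reproduce the study's Bayesian local alignment step.

```python
def identity_fraction(*seqs):
    """Fraction of aligned sites at which every given sequence is identical.

    Assumes gapless aligned blocks of equal, non-zero length.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    identical = sum(1 for column in zip(*seqs) if len(set(column)) == 1)
    return identical / length


# Toy gapless blocks (hypothetical data, not from the cited study).
human = "ACGTACGT"
whale = "ACGTTCGA"
seal = "ACCTACGA"

pairwise = {
    "human-whale": identity_fraction(human, whale),
    "human-seal": identity_fraction(human, seal),
    "whale-seal": identity_fraction(whale, seal),
}
three_way = identity_fraction(human, whale, seal)
```

Three-taxon identity can never exceed the lowest pairwise identity, which is why the abstract's three-taxon figures (28% observed, 16% simulated) sit below the pairwise ones (47% and 33%).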

20.
We address the question of whether or not the positions of protein-binding sites on homologous protein structures are conserved irrespective of the identities of their binding partners. First, for each domain family in the Structural Classification of Proteins (SCOP), protein-binding sites are extracted from our comprehensive database of structurally defined binary domain interactions (PIBASE). Second, the binding sites within each family are superposed using a structural alignment of its members. Finally, the degree of localization of binding sites within each family is quantified by comparing it with localization expected by chance. We found that 72% of the 1847 SCOP domain families in PIBASE have binding sites with localization values greater than expected by chance. Moreover, 554 (30%) of these families have localizations that are statistically significant (i.e., more than four standard deviations away from the mean expected by chance). In contrast, only 144 (8%) families have significantly low localization. The absence of a significant correlation of the binding site localization with the average sequence and structural conservations in a family suggests that localization can be helpful for describing the functional diversity of protein-protein interactions, complementing measures of sequence and structural conservation. Consideration of the binding site localization may also result in spatial restraints for the modeling of protein assembly structures.
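
The significance criterion in the abstract above (a localization value more than four standard deviations from the mean expected by chance) amounts to a z-score test against a null distribution. The sketch below assumes the chance expectation is available as a list of localization values from randomized binding sites; how the authors generated that null distribution is not stated here, and the names `localization_z` and `classify_localization` are hypothetical.

```python
from statistics import mean, stdev


def localization_z(observed, null_values):
    """Z-score of an observed localization value against values expected
    by chance (null_values is an assumed, externally generated sample)."""
    sd = stdev(null_values)  # sample standard deviation; needs >= 2 values
    if sd == 0:
        raise ValueError("null distribution has zero variance")
    return (observed - mean(null_values)) / sd


def classify_localization(observed, null_values, threshold=4.0):
    """Label a family using the four-standard-deviation cutoff from the text."""
    z = localization_z(observed, null_values)
    if z > threshold:
        return "significantly high"
    if z < -threshold:
        return "significantly low"
    return "not significant"


# Toy null sample (hypothetical values, not PIBASE data).
null_sample = [1.0, 2.0, 3.0, 4.0, 5.0]
high = classify_localization(11.0, null_sample)
low = classify_localization(-5.0, null_sample)
neutral = classify_localization(3.0, null_sample)
```

A family can have localization above the chance mean without clearing the cutoff, which matches the abstract's distinction between the 72% of families above chance and the 30% that are statistically significant.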
