Similar literature
 Found 20 similar articles; search took 602 ms
1.
The small and large subunits of the Escherichia coli ribosome contain three different rRNAs, the sequences of which are known. However, attempts by three groups to predict the secondary structures of 16S and 23S rRNAs share certain limitations: the structures are predicted assuming no interactions among the various domains of the molecule, and only 40% of residues are involved in base pairing, as against the experimental observation of 60% of residues in the base-paired state. Recent experimental studies have shown that there is a specific interaction between naked 16S and 23S rRNA molecules. This is significant because we have observed that the regions (oligonucleotides of length 9–10 residues) in 16S rRNA which are complementary to those in 23S rRNA do not have internal complementary sequences. Therefore, we have developed a simple graph-theoretical approach to predict secondary structures of 16S and 23S rRNAs. Our method for model building not only uses the complete sequence of the 16S or 23S rRNA molecule along with other experimental observations but also takes into account the observation that specific recognition is possible through the complementary sequences between 16S and 23S rRNA molecules and, therefore, these parts of the molecules are not used for internal base pairing. The method used to predict secondary structures is discussed. A typical secondary structure of the complex between 16S and 23S rRNA molecules, obtained using our method, is presented and compared briefly with earlier model-building studies.
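The complementarity search described in this abstract can be illustrated with a small sketch. It is not the authors' graph-theoretical method: the function name `complementary_kmers` and its parameters are hypothetical, and only strict Watson-Crick pairing (A-U, G-C) on RNA strings is assumed.

```python
def complementary_kmers(seq_a: str, seq_b: str, k: int = 9):
    """Find k-mers in seq_a whose reverse complement occurs in seq_b.

    Hypothetical helper: strict Watson-Crick pairing only (no G-U wobble).
    Returns (position in seq_a, k-mer, matching k-mer in seq_b) tuples.
    """
    complement = str.maketrans("ACGU", "UGCA")
    # Index every k-mer of the second sequence for constant-time lookup.
    b_kmers = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    hits = []
    for i in range(len(seq_a) - k + 1):
        kmer = seq_a[i:i + k]
        # An antiparallel partner reads as the reverse complement.
        partner = kmer.translate(complement)[::-1]
        if partner in b_kmers:
            hits.append((i, kmer, partner))
    return hits
```

For the 9–10 residue regions mentioned above, one would call it with `k=9` or `k=10` on the two rRNA sequences; the toy sequences used for checking are much shorter.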

2.
We present the sequence of the 5'-terminal 585 nucleotides of mouse 28S rRNA as inferred from the DNA sequence of a cloned gene fragment. The comparison of the mouse 28S rRNA sequence with its yeast homolog, the only known complete sequence of a eukaryotic nucleus-encoded large rRNA (see ref. 1, 2), reveals the strong conservation of two large stretches which are interspersed with completely divergent sequences. These two blocks of homology span the two segments which have been recently proposed to participate directly in the 5.8S-large rRNA complex in yeast (see ref. 1) through base-pairing with both termini of 5.8S rRNA. The validity of the proposed structural model for the 5.8S-28S rRNA complex in eukaryotes is strongly supported by comparative analysis of mouse and yeast sequences: despite a number of mutations in 28S and 5.8S rRNA sequences in interacting regions, the secondary structure that can be proposed for the mouse complex is identical to that of yeast, with all 41 base pairings between the two molecules maintained through 11 pairs of compensatory base changes. The other regions of the mouse 28S rRNA 5'-terminal domain, which have extensively diverged in primary sequence, can nevertheless be folded into a secondary structure pattern highly reminiscent of their yeast homolog. A minor revision is proposed for the mouse 5.8S rRNA sequence.

3.
Kurgan LA  Zhang T  Zhang H  Shen S  Ruan J 《Amino acids》2008,35(3):551-564
Structural class categorizes proteins based on the amount and arrangement of the constituent secondary structures. The knowledge of structural classes is applied in numerous important predictive tasks that address structural and functional features of proteins. We propose novel structural class assignment methods that use one-dimensional (1D) secondary structure as the input. The methods are designed based on a large set of low-identity sequences for which secondary structure is predicted from their sequence (PSSAsc model) or assigned based on their tertiary structure (SSAsc). The secondary structure is encoded using a comprehensive set of features describing count, content, and size of secondary structure segments, which are fed into a small decision tree that uses ten features to perform the assignment. The proposed models were compared against seven secondary structure-based and ten sequence-based structural class predictors. Using the 1D secondary structure, SSAsc and PSSAsc can assign proteins to the four main structural classes, while the existing secondary structure-based assignment methods can predict only three classes. Empirical evaluation shows that the proposed models are quite promising. Using the structure-based assignment performed in SCOP (Structural Classification of Proteins) as the gold standard, the accuracies of SSAsc and PSSAsc equal 76% and 75%, respectively. We show that the use of the secondary structure predicted from the sequence as an input does not have a detrimental effect on the quality of structural class assignment when compared with using secondary structure derived from tertiary structure. Therefore, PSSAsc can be used to perform the automated assignment of structural classes based on the sequences.
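The kind of count, content, and segment-size encoding mentioned in this abstract might look like the sketch below. The three-state alphabet (H, E, C), the function name `ss_features`, and the particular features are illustrative assumptions, not the feature set actually used by SSAsc or PSSAsc.

```python
from itertools import groupby

def ss_features(ss: str) -> dict:
    """Summarize a 1D secondary structure string (assumed alphabet: H, E, C)."""
    # Collapse runs of identical states into (state, run length) segments.
    segments = [(state, len(list(run))) for state, run in groupby(ss)]
    features = {}
    for state in "HEC":
        lengths = [n for s, n in segments if s == state]
        # Content: fraction of residues in this state.
        features[f"content_{state}"] = sum(lengths) / len(ss) if ss else 0.0
        # Count: number of separate segments of this state.
        features[f"count_{state}"] = len(lengths)
        # Size: length of the longest segment of this state.
        features[f"max_len_{state}"] = max(lengths, default=0)
    return features
```

A feature dictionary of this form could then be passed to any classifier; the abstract reports using a small decision tree with ten features.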

4.
Accurately predicted protein secondary structure provides useful information for selecting targets, analyzing protein function, and predicting higher-dimensional structure. Existing research shows that more data + refined search = better prediction. We analyze the relation between the prediction accuracy and another crucial factor, the protein size. Empirical tests performed with two secondary structure predictors on a large set of high-resolution, non-redundant proteins show that the average accuracies for small proteins (<100 residues) equal 73% and 54% for alpha-helices and beta-strands, respectively. The alpha-helix/beta-strand accuracies for very large proteins (>300 residues) equal 77%/68%, respectively. Similarly, the tests with three secondary structure content predictors show that the prediction errors for the small/very large proteins equal 0.13/0.09 and 0.09/0.06 for alpha-helix and beta-strand content, respectively. Our tests confirm that the secondary structure/content predictions for the very large proteins are of statistically significantly better quality than the predictions for the small proteins. This is in contrast with tertiary structure predictions, in which higher accuracy is obtained for smaller proteins.

5.
The ribosome is a large molecular complex that consists of at least three ribonucleic acid molecules and a large number of proteins. It translates genetic information from messenger ribonucleic acid and makes protein accordingly. Better understanding ribosomal function and providing information for designing biochemical experiments require knowledge of the complete structure of the ribosome. To expand the structural information on the ribosome, we took on the challenge of developing a detailed Thermus thermophilus ribosomal structure computationally. By combining information derived from the low-resolution x-ray structure of the 70S ribosome (providing the overall fold), high-resolution structures of the ribosomal subunits (providing the local structure), sequences, and secondary structures, we have developed an atomic model of the T. thermophilus ribosome using a homology modeling approach. Our model is stereochemically sound with a consistent single-species sequence. The overall folds of the three ribosomal ribonucleic acids in our model are consistent with those in the low-resolution crystal structure (root mean-square differences are all <1.9 Å). The large overall interface area (~2500 Å²) of intersubunit bridges B2a, B3, and B5, and the inherent flexibility in regions connecting the contact residues are consistent with these bridges serving as anchoring patches for the ratcheting and rolling motions between the two subunits during translocation.

6.
Secondary structures of leukocyte alpha 1- and alpha 2-interferons and of fibroblast beta-interferon are calculated using the molecular theory of protein secondary structures. The common secondary structure calculated for alpha- and beta-interferons is used to predict the three-dimensional structures of fragments 1-110 and 111-166 of the chains (which are supposed to be quasi-independent domains). The predicted structure of the active domain I (1-110) is an "up-and-down" tetrahelical complex (in which the second helix is shorter than the others and may be absent in alpha 1-interferon) similar to the mirror image of myohaemoerythrin. The predicted structure of domain II (111-166) is either a three-stranded beta-sheet screened from one side by two alpha-helices or a three-helical complex (similar to that in the N-domain of papain), the first structure being more consistent with the circular dichroism data of alpha-interferon and its C-end fragment.

7.
The infrared amide bands are sensitive to the conformation of the polypeptide backbone of proteins. Since the backbone of proteins folds in complex spatial arrangements, the amide bands of these proteins result from the superimposition of vibration modes corresponding to the different types of structural motifs (alpha helices, beta sheets, etc.). Initially, band deconvolution techniques were applied to determine the secondary structure of proteins, i.e., the abundance of each structural motif in the polypeptide chain was directly related to the area of the suitable deconvolved vibration modes under the amide I band (1700-1600 cm⁻¹). Recently, several multivariate regression methods have been used to predict the secondary structure of proteins as an alternative to the previous methods. They are based on establishing a relationship between a matrix of infrared protein spectra and another that includes their secondary structure, expressed as the fractions of the different structural motifs, determined from X-ray analysis. In this study, we investigated the use of the local regression method interval partial least-squares (iPLS) to seek improvements to the full-spectrum PLS and other regression methods. The local character of iPLS avoids the use of spectral regions that can introduce noise or that can be irrelevant for prediction and focuses on finding specific spectral ranges related to each secondary structure motif in all the proteins. This study has been applied to a representative protein data set with infrared spectra covering a large wavenumber range, including the amide I-III bands (1700-1200 cm⁻¹). iPLS has revealed new structural mode assignments related to less explored amide bands and has offered a satisfactory predictive ability using a small amount of selected specific spectral information.

8.
J M Kean  D E Draper 《Biochemistry》1985,24(19):5052-5061
A technique for isolating defined fragments of a large RNA has been developed and applied to a ribosomal RNA. A section of the Escherichia coli rrnB cistron corresponding to the S8/S15 protein binding domain of 16S ribosomal RNA was cloned into a single-stranded DNA phage; after hybridization of the phage DNA with 16S RNA and digestion with T1 ribonuclease, the protected RNA was separated from the DNA under denaturing conditions to yield a 345-base RNA fragment with unique ends (bases 525-869 in the 16S sequence). The secondary structure of this fragment was determined by mapping the cleavage sites of enzymes specific for single-stranded or double-helical RNA. The fragment structure is almost identical with that proposed for the corresponding region of intact 16S RNA on the basis of phylogenetic comparisons [Woese, C. R., Gutell, R., Gupta, R., & Noller, H. (1983) Microbiol. Rev. 47, 621-669]. We conclude that this section of RNA constitutes an independently folding domain that may be studied in isolation from the rest of the 16S RNA. The structure mapping experiments have indicated several interesting features in the RNA structure. (i) The largest bulge loop in the molecule (20 bases) contains specific tertiary structure. (ii) A region of long-range secondary structure, pairing bases about 200 residues apart in the sequence, can hydrogen bond in two different mutually exclusive schemes. Both appear to exist simultaneously in the RNA fragment under our conditions. (iii) The long-range secondary structure and one adjacent helix melt between 37 and 60 °C in the absence of Mg2+, while the rest of the structure is quite stable.

9.
Electron microscopy revealed reproducible secondary structure patterns within partially denatured 16S and 23S ribosomal ribonucleic acid (rRNA) from Escherichia coli. When prepared with 50% formamide-100 mM ammonium acetate, 16S rRNA included two small hairpins that appeared in over 50% of all molecules. Three open loops were observed with frequencies of less than 25%. In contrast, 23S rRNA included a terminal open loop and two additional large structures in over 75% of all molecules. These secondary structure patterns were conserved in the 16S and 23S rRNA from Pseudomonas aeruginosa. The secondary structure of the 30S precursor rRNA from the ribonuclease III-deficient E. coli mutant AB105 was mapped after partial denaturation in 70% formamide-100 mM ammonium acetate. Two large open loops were superimposed on the 16S and 23S rRNA secondary structure patterns. These loops were the most frequent structures found on the precursor, and their stems coincided with ribonuclease III cleavage sites. A tentative 5'-3' orientation was determined for the secondary structure patterns of 16S and 23S rRNA from their relative locations within 30S precursor rRNA. The relation of secondary structure to ribosomal protein binding and ribonuclease III cleavage is discussed.

10.
Recently published alignments of available 5 S rRNA sequences have shown that a rigid base pairing pattern, pointing to the existence of a universal five-helix secondary structure for all 5 S RNAs, can be superimposed on such alignments. For a few species, the alignment and the base pairing pattern show distortions with respect to the large majority of sequences. Their 5 S RNAs may form exceptional secondary structures, or there may just be errors in the published sequences. We have examined such a case, Pseudomonas fluorescens, and found the sequence to be in error. The corrected sequence, as well as those of the related species Azotobacter vinelandii and Pseudomonas aeruginosa, fit perfectly in the 5 S RNA sequence alignment and in the five-helix secondary structure model. There exists comparative evidence for the frequent presence of non-standard base pairs at several points of the 5 S RNA secondary structure.

11.
Substitution of Asn for the conserved Ser543 in the thumb subdomain of the Taq DNA polymerase large fragment (Klentaq DNA polymerase) prevents pausing during DNA synthesis and allows the enzyme to circumvent template regions with a complex structure. The mutant enzyme (KlentaqN DNA polymerase) provides specific PCR amplification and sequencing of difficult templates, e.g. those with a high GC content or strong secondary structure.

12.
It has been many years since position-specific residue preference around the ends of a helix was revealed. However, none of the existing secondary structure prediction methods exploits this preference, resulting in low accuracy in predicting the ends of secondary structures. In this study, we collected a relatively large data set consisting of 1860 high-resolution, non-homologous proteins from the PDB, and further analyzed the residue distributions around the ends of regular secondary structures. It was found that there exist position-specific residue preferences (PSRP) around the ends of not only helices but also strands. Based on these unique features, we proposed a novel strategy and developed a tool named E-SSpred that treats the secondary structure as a whole and builds models to predict entire secondary structure segments directly by integrating relevant features. In E-SSpred, the support vector machine (SVM) method is adopted to model and predict the ends of helices and strands according to the unique residue distributions around them. A simple linear discriminant analysis method is applied to model and predict entire secondary structure segments by integrating end-prediction results, tri-peptide composition, and length distribution features of secondary structures, as well as the prediction results of the widely used program PSIPRED. The results of fivefold cross-validation on a widely used data set demonstrate that the accuracy of E-SSpred in predicting ends of secondary structures is about 10% higher than that of PSIPRED, and the overall prediction accuracy (Q(3) value) of E-SSpred (82.2%) is also better than that of PSIPRED (80.3%). The E-SSpred web server is available at http://bioinfo.hust.edu.cn/bio/tools/E-SSpred/index.html.
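The Q(3) value quoted in this abstract is conventionally the percentage of residues whose three-state label (helix, strand, coil) is predicted correctly. A minimal sketch under that assumption follows; the function name `q3` and the H/E/C string encoding are illustrative and not taken from E-SSpred or PSIPRED.

```python
def q3(predicted: str, observed: str) -> float:
    """Percentage of residues whose three-state label matches the observed one.

    Assumes both strings use the same per-residue alphabet (e.g. H, E, C)
    and are aligned position by position.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("cannot score an empty structure")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)
```

Benchmark figures such as those above are typically obtained by evaluating the measure over every protein in a test set; the exact averaging scheme used in the study is not stated in the abstract.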

13.
The defective parvovirus Adeno-associated virus (AAV) is absolutely dependent upon coinfection with either Adenovirus or Herpes Simplex Virus (HSV) for its multiplication. We have compared the terminal repeats of HSV-1F strain DNA with the terminal 200 nucleotides of AAV DNA. Our findings demonstrate similarities between portions of the HSV inverted repeats found at the L/S junction and the termini of AAV. By computer analysis we have determined potential secondary folding patterns for both genomes. The following points can be made about the a, b, and c repeats in HSV: (1) Regions b and c are complementary over a significant portion of their length. (2) The ends of a can fold back on themselves to form large secondary structures. Moreover, when the b and c homology is used to align the ends of a, the b/a and c/a junctions are within 1 base of each other. (3) The short direct repeats within a are essentially a large loop with little secondary structure. The potential implications of this structure are discussed and a model for HSV DNA replication is presented.

14.
15.
We study the secondary structure of RNA determined by Watson–Crick pairing without pseudo-knots using Milnor invariants of links. We focus on the first non-trivial invariant, which we call the Heisenberg invariant. The Heisenberg invariant, which is an integer, can be interpreted in terms of the Heisenberg group as well as in terms of lattice paths. We show that the Heisenberg invariant gives a lower bound on the number of unpaired bases in an RNA secondary structure. We also show that the Heisenberg invariant can predict allosteric structures for RNA. Namely, if the Heisenberg invariant is large, then there are widely separated local maxima (i.e., allosteric structures) for the number of Watson–Crick pairs found. Partially supported by DST (under grant DSTO773) and UGC (under SAP-DSA Phase IV).

16.
17.
Convolutional Neural Networks (CNNs) are statistical models suited for learning complex visual patterns. In the context of Species Distribution Models (SDMs), and in line with predictions of landscape ecology and island biogeography, CNNs could grasp how local landscape structure affects the prediction of species occurrence in SDMs. The prediction can thus reflect the signatures of entangled ecological processes. Although previous machine-learning-based SDMs can learn complex influences of environmental predictors, they cannot account for the influence of environmental structure in local landscapes (hence denoted “punctual models”). In this study, we applied CNNs to a large dataset of plant occurrences in France (GBIF), on a large taxonomic scale, to predict the ranked relative probability of species (by joint learning) at any geographical position. We examined the way local environmental landscapes improve prediction by running alternative CNN models deprived of information on landscape heterogeneity and structure (“ablation experiments”). We found that the landscape structure around a location contributed crucially to the predictive performance of CNN-SDMs. CNN models can classify the predicted distributions of many species, as other joint modelling approaches can, but they further prove efficient in identifying the influence of local environmental landscapes. CNNs can then represent signatures of spatially structured environmental drivers. The prediction gain is noticeable for rare species, which opens promising perspectives for biodiversity monitoring and conservation strategies. Therefore, the approach is of both theoretical and practical interest. We discuss the way to test hypotheses on the patterns learnt by CNNs, which should be essential for further interpretation of the ecological processes at play.

18.
An α/β barrel is predicted for the three-dimensional (3D) structure of Bacillus subtilis ferrochelatase. To arrive at this structure, the THREADER program was used to find possible homologous 3D structures and to predict the secondary structure for the ferrochelatase sequence. The secondary structure was fit by hand to the selected homologous 3D structure, and then the MODELLER program was used to predict the fold of ferrochelatase. Molecular biological information about the conserved residues of ferrochelatase was used as a criterion to help select the homologous 3D structure used to predict the fold of ferrochelatase. Based on the predicted structure, possible ligands binding to the iron and protoporphyrin IX are discussed. The structure has been deposited in the Brookhaven database as ID 1FJI. © 1997 Wiley-Liss Inc.

19.
Given sufficiently large protein families, and using a global statistical inference approach, it is possible to obtain sufficient accuracy in protein residue contact predictions to predict the structure of many proteins. However, these approaches do not consider the fact that the contacts in a protein are neither randomly nor independently distributed, but actually follow precise rules governed by the structure of the protein and thus are interdependent. Here, we present PconsC2, a novel method that uses a deep learning approach to identify protein-like contact patterns to improve contact predictions. A substantial enhancement can be seen for all contacts independently of the number of aligned sequences, residue separation or secondary structure type, but it is largest for β-sheet-containing proteins. In addition to being superior to earlier methods based on statistical inference, PconsC2 is superior to state-of-the-art machine learning methods for families with more than 100 effective sequence homologs. The improved contact prediction enables improved structure prediction.

20.
RNA pseudoknot prediction in energy-based models.   Total citations: 11, self-citations: 0, citations by others: 11
RNA molecules are sequences of nucleotides that serve as more than mere intermediaries between DNA and proteins, e.g., as catalytic molecules. Computational prediction of RNA secondary structure is among the few structure prediction problems that can be solved satisfactorily in polynomial time. Most work has been done to predict structures that do not contain pseudoknots. Allowing pseudoknots introduces modeling and computational problems. In this paper we consider the problem of predicting RNA secondary structures with pseudoknots based on free energy minimization. We first give a brief comparison of energy-based methods for predicting RNA secondary structures with pseudoknots. We then prove that the general problem of predicting RNA secondary structures containing pseudoknots is NP-complete for a large class of reasonable models of pseudoknots.
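To illustrate why the pseudoknot-free case is tractable, here is a sketch of Nussinov-style dynamic programming, which maximizes the number of nested base pairs in O(n³) time. It is a simplified stand-in for, not an implementation of, the free-energy models discussed in the paper; the function name, the allowed pair set (Watson-Crick plus G-U wobble), and the `min_loop` hairpin constraint are assumptions made for the example.

```python
def nussinov_max_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested (pseudoknot-free) base pairs in an RNA sequence.

    Illustrative base-pair maximization, not an energy model.
    min_loop is the assumed minimum number of unpaired bases in a hairpin.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence seq[i..j] inclusive.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some k, splitting the interval in two
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]
```

The recursion relies on pairs being nested, so every pair splits the problem into two independent subintervals. Pseudoknots break that independence, which is the source of the modeling and computational difficulty the abstract describes.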
