Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
2.
Modern sequence alignment algorithms are used routinely to identify homologous proteins, proteins that share a common ancestor. Homologous proteins always share similar structures and often have similar functions. Over the past 20 years, sequence comparison has become both more sensitive, largely because of profile-based methods, and more reliable, because of more accurate statistical estimates. As sequence and structure databases become larger, and comparison methods become more powerful, reliable statistical estimates will become even more important for distinguishing similarities that are due to homology from those that are due to analogy (convergence). The newest sequence alignment methods are more sensitive than older methods, but more accurate statistical estimates are needed for their full power to be realized.
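
The abstract above does not name a specific estimator. As an illustration only, one widely used statistical estimate for local alignment scores is the Karlin-Altschul expectation value, sketched below. The symbols are assumptions introduced here: m and n are the query and database lengths, S is the raw alignment score, S' is the bit score, and K and λ are parameters that depend on the scoring system and residue composition.

```latex
E = K\,m\,n\,e^{-\lambda S},
\qquad
S' = \frac{\lambda S - \ln K}{\ln 2},
\qquad
E = m\,n\,2^{-S'}
```

A smaller E means the observed score is less likely to arise by chance in a search of that size, which is the sense in which such estimates help separate homology from chance similarity.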

3.
To establish the possible function of a newly discovered protein, its sequence must be aligned with other known sequences. When the similarity is marginal, the function remains uncertain. A fundamentally new approach is suggested: to use networks in the protein sequence space. The functionality of the protein is firmly established via networks forming chains of consecutive pair-wise matching fragments. Distant relatives are thus treated as relatives even though, in some cases, there is no sequence match at all between the ends of the chain, while the entire chain belongs to the same functional and structural network.

4.
Identifying protein–protein interactions (PPIs) is critical for understanding the cellular function of proteins and the machinery of a proteome. PPI data derived from high-throughput technologies are often incomplete and noisy. It is therefore important to develop computational methods and high-quality interaction datasets for predicting PPIs. A sequence-based method is proposed that combines correlation coefficient (CC) transformation with a support vector machine (SVM). CC transformation not only adequately considers the neighboring effect of the protein sequence but also describes the level of CC between two protein sequences. A gold standard positives (interacting) dataset, MIPS Core, and a gold standard negatives (non-interacting) dataset, GO-NEG, of the yeast Saccharomyces cerevisiae were mined to objectively evaluate the method and attenuate bias. The SVM model combined with CC transformation yielded the best performance, with an accuracy of 87.94% on the gold standard positives and gold standard negatives datasets. The MATLAB source code and the datasets are available on request from smgsmg@mail.ustc.edu.cn.

5.
6.
Increasing awareness of the importance of protein–RNA interactions has motivated many approaches to predict residue-level RNA binding sites in proteins based on sequence or structural characteristics. Sequence-based predictors are usually high in sensitivity but low in specificity; conversely, structure-based predictors tend to have high specificity but lower sensitivity. Here we quantified the contribution of both sequence- and structure-based features as indicators of RNA-binding propensity using a machine-learning approach. In order to capture structural information for proteins without a known structure, we used homology modeling to extract the relevant structural features. Several novel and modified features enhanced the accuracy of residue-level RNA-binding propensity beyond what has been reported previously, including by meta-prediction servers. These features include: hidden Markov model-based evolutionary conservation, surface deformations based on the Laplacian norm formalism, and relative solvent accessibility partitioned into backbone and side chain contributions. We constructed a web server called aaRNA that implements the proposed method and demonstrate its use in identifying putative RNA binding sites.

7.
How are model protein structures distributed in sequence space?   (Cited 6 times: 0 self-citations, 6 citations by others)
The figure-to-structure maps for all uniquely folding sequences of short hydrophobic polar (HP) model proteins on a square lattice are analyzed to investigate aspects considered relevant to evolution. By ranking structures by their frequencies, few very frequent and many rare structures are found. The distribution can be empirically described by a generalized Zipf's law. All structures are relatively compact, yet the most compact ones are rare. Most sequences folding to the same structure belong to "neutral nets." These graphs in sequence space are connected by point mutations and centered around prototype sequences, which tolerate the largest number (up to 55%) of neutral mutations. Profiles have been derived from these homologous sequences. Frequent structures conserve hydrophobic cores only, while rare ones are sensitive to surface mutations as well. Shape space covering, i.e., the ability to transform any structure into most others with few point mutations, is very unlikely. It is concluded that many characteristic features of the sequence-to-structure map of real proteins, such as the dominance of few folds, can be explained by the simple HP model. In analogy to protein families, nets are dense and well separated in sequence space. Potential implications for better understanding the evolution of proteins and applications to improving database searches are discussed.
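
The HP lattice model mentioned above can be made concrete with a small sketch. It assumes the conventional HP scoring, in which a conformation is scored by the number of H–H pairs that are lattice neighbours but not adjacent in the chain; the abstract does not spell out its energy function, and the function name `hp_contacts` is hypothetical.

```python
from typing import List, Tuple


def hp_contacts(sequence: str, coords: List[Tuple[int, int]]) -> int:
    """Count non-bonded H-H contacts of an HP chain on a square lattice.

    Assumption: conventional HP scoring, where the energy is minus this count.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")

    index = {pos: i for i, pos in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each neighbouring pair is visited once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return contacts


# A four-residue chain folded into a unit square: the two end residues touch.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(hp_contacts("HPPH", square))
```

Enumerating all self-avoiding walks for a sequence and keeping those sequences with a single best-scoring conformation would give the "uniquely folding" set the abstract refers to; that enumeration is omitted here.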

8.
The amino terminal amino acid sequence of the 41,000 dalton subunit of Electrophorus electricus acetylcholine receptor has been determined for 35 cycles by automated sequencing procedures. Comparison of the unique polypeptide sequence obtained for this molecule with that of the major subunit of Torpedo californica acetylcholine receptor reveals extensive primary structural homology between the two proteins.

9.
10.
Protein interfaces are thought to be distinguishable from the rest of the protein surface by their greater degree of residue conservation. We test the validity of this approach on an expanded set of 64 protein-protein interfaces using conservation scores derived from two multiple sequence alignment types, one of close homologs/orthologs and one of diverse homologs/paralogs. Overall, we find that the interface is slightly more conserved than the rest of the protein surface when using either alignment type, with alignments of diverse homologs showing marginally better discrimination. However, using a novel surface-patch definition, we find that the interface is rarely significantly more conserved than other surface patches when using either alignment type. When an interface is among the most conserved surface patches, it tends to be part of an enzyme active site. The most conserved surface patch overlaps with 39% (+/- 28%) and 36% (+/- 28%) of the actual interface for diverse and close homologs, respectively. Contrary to results obtained from smaller data sets, this work indicates that residue conservation is rarely sufficient for complete and accurate prediction of protein interfaces. Finally, we find that obligate interfaces differ from transient interfaces in that the former have significantly fewer alignment gaps at the interface than the rest of the protein surface, as well as having buried interface residues that are more conserved than partially buried interface residues.
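
To illustrate what a per-residue conservation score derived from a multiple sequence alignment can look like, here is a minimal sketch using normalized Shannon entropy per column. This is one common choice and an assumption made for illustration; the abstract does not state which scoring scheme the authors used, and the function names are hypothetical.

```python
import math
from collections import Counter
from typing import List


def column_conservation(column: str, alphabet_size: int = 20) -> float:
    """Return 1 - H/log2(alphabet_size) for one alignment column.

    Gap characters ('-' and '.') are ignored. 1.0 means fully conserved.
    """
    residues = [c for c in column.upper() if c not in "-."]
    if not residues:
        return 0.0  # assumption: an all-gap column carries no conservation signal
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 1.0 - entropy / math.log2(alphabet_size)


def alignment_conservation(alignment: List[str]) -> List[float]:
    """Score every column of an alignment given as equal-length strings."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_conservation("".join(col)) for col in zip(*alignment)]


# Toy alignment: the first column is invariant, the second is mixed.
scores = alignment_conservation(["AG-", "AGC", "AAC"])
print([round(s, 3) for s in scores])
```

Averaging such scores over the residues in a surface patch gives a patch-level value that can be compared between interface and non-interface patches, which is the kind of comparison the abstract describes.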

11.
Transient protein–protein interactions play a vital role in many biological processes, such as cell regulation and signal transduction. A nonredundant dataset of 130 protein chains extracted from transient complexes was used to analyze the features of transient interfaces. It was found that, besides the two well-known features, sequence profile and accessible surface area (ASA), the temperature factor (B-factor) can also reflect the differences between the interface and the rest of the protein surface. These features were used to construct support vector machine (SVM) classifiers to identify interaction sites. The results of threefold cross-validation on the nonredundant dataset show that using the B-factor as an additional feature improves prediction performance significantly. The sensitivity, specificity and correlation coefficient were raised from 54 to 62%, 41 to 45% and 0.20 to 0.29, respectively. To further illustrate the effectiveness of our method, the classifiers were tested with an independent set of 53 nonhomologous protein chains derived from benchmark 2.0. The sensitivity, specificity and correlation coefficient of the classifier based on the three features were 63%, 45% and 0.33, respectively. These results indicate that our classifiers are robust and can be applied to complement experimental techniques in studying transient protein–protein interactions.
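
The three performance measures quoted above can be computed from a confusion matrix as sketched below. The abstract does not define its measures, so the textbook definitions are assumed here: sensitivity as TP/(TP+FN), specificity as TN/(TN+FP), and the correlation coefficient as the Matthews correlation coefficient. Some papers in this area define "specificity" differently, so the numbers in the abstract may not correspond exactly to these formulas. The function name and the example counts are hypothetical.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Sensitivity, specificity and Matthews correlation coefficient."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "mcc": mcc}


# Hypothetical counts, not taken from the paper.
print(binary_metrics(tp=50, fp=10, tn=30, fn=10))
```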

12.
The total amino acid sequence of a lambda Bence-Jones protein has been established. The protein contains 211 residues, which include two methionine residues. Splitting with cyanogen bromide gave three fragments, the largest of which included the C-terminal half, which is common to other Bence-Jones proteins of the same type. The peptides obtained by tryptic, chymotryptic and peptic digestion were isolated and purified by paper-electrophoretic and chromatographic techniques. Reduction followed by carboxymethylation of the cysteine residues with radioactive iodoacetate was found to be a powerful tool in the isolation of some insoluble peptides. Unusual features of the molecule are the fact that it contains six cysteine residues and not five as observed in both kappa and lambda Bence-Jones proteins studied previously, and its size, which seems two residues smaller than the smallest Bence-Jones protein studied hitherto. The similarities and differences between this and other Bence-Jones proteins are discussed.

13.
We have isolated the F0F1-ATP synthase complex from oligomycin-sensitive mitochondria of the green alga Chlamydomonas reinhardtii. A pure and active ATP synthase was obtained by means of sonication, extraction with dodecyl maltoside and ion exchange and gel permeation chromatography in the presence of glycerol, DTT, ATP and-21. The enzyme consists of 14 subunits as judged by SDS-PAGE. A cDNA clone encoding the ATP synthase subunit has been sequenced. The deduced protein sequence contains a presequence of 45 amino acids which is not present in the mature protein. The mature protein is 58–70% identical to corresponding mitochondrial proteins from other organisms. In contrast to the ATP synthase subunit from C. reinhardtii (Franzen and Falk, Plant Mol Biol 19 (1992) 771–780), the protein does not have a C-terminal extension. However, the N-terminal domain of the mature protein is 15–18 residues longer than in ATP synthase subunits from other organisms. Southern blot analysis indicates that the protein is encoded by a single-copy gene. Abbreviations: DM, dodecyl--D-maltoside; OSCP, oligomycin sensitivity conferring protein; PMSF, phenyl-methylsulfonylfluoride; DTT, dithiothreitol; EDTA, ethylenediaminotetraacetic disodium salt.

14.
Widespread multidrug resistance in ESKAPE pathogens demands the identification of novel drug targets to keep their infections at bay. For this purpose, we have identified a novel target, Hpa2 of A. baumannii, a member of the GNAT superfamily of HATs. Because sequence identity is 35% or less, correct sequence alignment and construction of 3D monomeric and dimeric models of Hpa2 with optimal structural parameters is a troublesome task. To circumvent these problems, we have designed an easy and optimized protocol for Hpa2 monomer modeling and for generating a dimeric Hpa2 model using a data-driven protein–protein docking experiment. Improving the structural features of a generated model is an onerous process and is generally achieved at considerable cost in time and computation. Herein, it is achieved by reconciliation of FoldX commands, which take less time to execute. Evaluations performed to validate the structural parameters and stability of monomeric and dimeric Hpa2 attest to its quality. Analysis of interfacial residues, energy terms and RMSD values indicated a clear correlation between experimental and theoretical interface properties of the dimers, corroborating the regime used for Hpa2 dimer generation. Structural information from the refined models was used for virtual screening of a substrate-derived library and polyamines to provide a new platform for developing A. baumannii inhibitory molecules. Molecules showing preferential binding at the dimer interface could be used as allosteric inhibitors. Binding of polyamines to the model showed the same binding pattern as described experimentally for yeast Hpa2.

15.
16.
A cDNA library from tomato planta macho viroid (TPMV)-infected tomato was constructed. The library was screened at low stringency with a tobacco PR-R cDNA probe. An 832 bp cDNA from an mRNA present only in infected tissue was isolated. The nucleotide sequence showed high homology with the osmotin from both tobacco and tomato (NP24). This cDNA probably corresponds to the AP24 and P23 proteins previously described in tomato and induced upon fungal and viroid infection.

17.
The conservation of genomic organization of mammalian species has been of interest for its usefulness in characterizing the genetics of traits and diseases and as one tool for examining evolution. The recent rough draft sequencing of the mouse and human genomes provides the opportunity for more detailed analyses. The current study examines the extent of homology between human chromosome 20 and the mouse genome by comparing putative coding and non-coding sequence to provide insight into organizational and sequence similarities between the species. The relative position of each of 460 putative coding orthologues was the same in both species, except for a single genomic segment rearrangement. The similarity extended to exon/intron structure, the size of introns, as well as strong evidence for the conservation of position of ancient LINE-1, LINE-2 and LTR repetitive sequence and the subtelomeric region of the long arm of human chromosome 20 and that of mouse chromosome 2. There was also evidence for conservation of a limited amount of non-coding single-copy sequence. Together these data provide additional insight into the extent of conservation of mammalian genomic organization and sequence.

18.
Gene, 1996, 172(1): 161-162
A 1170-nucleotide fragment of φLf DNA was sequenced. This fragment contains an open reading frame, ORF367, encoding a protein of 367 amino acids (aa) (36,710 Da). ORF367 is located downstream from the gene encoding the major coat protein (gVIIIp) and a Rho-independent termination signal. Sequence analysis revealed that the gene product has a Gly-rich domain (70 aa) at the center and a hydrophobic region (26 aa) at the C terminus. These structural features suggest that ORF367 may encode the adsorption protein of φLf.

19.
Many protein and peptide sequences are self-assembled into β-sheet-rich fibrous structures called amyloids. Their atomic details provide insights into fundamental knowledge related to amyloid diseases. To study the detailed structure of the amyloid, we have developed a model system that mimics the self-assembling process of the amyloid within a water-soluble protein, termed peptide self-assembly mimic (PSAM). PSAM enables capturing of a peptide sequence within a water-soluble protein, thus making structural and energetics-related studies possible. In this work, we extend our PSAM approach to a naturally occurring chameleon sequence from αB crystallin. We chose “Val–Leu–Gly–Asp–Val (VLGDV)”, a five amino-acid sequence, which forms a β-turn in the native structure and a β-barrel in the amyloid oligomer cylindrin, as a grafting sequence to the PSAM scaffold. The crystal structure revealed that the sequence grafting induced β-sheet bending at the grafted site. We further investigated the role of the central glycine residue and found that its role in the β-sheet bending is dependent on the neighboring residues. The ability of PSAM to observe the structural alterations induced by the grafted sequence provides an opportunity to evaluate the structural impact of a sequence from the peptide self-assembly.

20.
Automated Edman degradation of reduced and carboxymethylated phospholipase A2-α from Crotalus adamanteus venom revealed a single amino acid sequence extending 30 residues into the protein from the amino terminus. The singularity of the sequence and the yields of the phenylthiohydantoin amino acids thus obtained indicate that the subunits comprising the phospholipase dimer are identical. Further chemical evidence in support of subunit identity was obtained by cleavage of phospholipase A2-α with cyanogen bromide. Compositional analysis of the protein revealed one residue of methionine per monomer and the sequence determination placed this amino acid at position 10 in the sequence of 133 amino acids. Cyanogen bromide cleavage of the protein, followed by reduction and carboxymethylation, afforded the expected 2 fragments: an NH2-terminal decapeptide (CNBr-I) and a larger COOH-terminal fragment of 123 residues (CNBr-II). Automated Edman degradation of the latter has extended the sequence analysis to 54 residues in the NH2-terminal segment of the monomer chain. Comparison of this sequence with those derived for phospholipases from other snake venoms, from bee venom, and from porcine pancreas has revealed striking homologies in this region of the molecules. As expected on the basis of their phylogenetic classification, the phospholipases from the pit vipers C. adamanteus and Agkistrodon halys blomhoffii are more similar to one another in sequence than to the enzyme from the more distantly related viper, Bitis gabonica. Furthermore, the very close similarities in sequence observed among all of these phospholipases in regions corresponding to residues 24 through 53 in the C. adamanteus enzyme suggest that this segment of the polypeptide plays an important role in phospholipase function and probably constitutes part of the active site.
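
The fragment sizes reported above follow from simple arithmetic, since cyanogen bromide cleaves the chain on the C-terminal side of methionine. The sketch below reproduces that arithmetic from a chain length and a list of 1-based methionine positions; the function name `cnbr_fragment_lengths` is hypothetical, and complete cleavage at every methionine is assumed.

```python
from typing import Iterable, List


def cnbr_fragment_lengths(chain_length: int, met_positions: Iterable[int]) -> List[int]:
    """Lengths of fragments after cleavage C-terminal to each methionine.

    Positions are 1-based. Assumes complete cleavage at every methionine.
    """
    cuts = sorted(set(met_positions))
    for pos in cuts:
        if not 1 <= pos <= chain_length:
            raise ValueError(f"methionine position {pos} is outside the chain")

    lengths = []
    start = 0  # number of residues already assigned to earlier fragments
    for pos in cuts:
        lengths.append(pos - start)
        start = pos
    if start < chain_length:  # no extra fragment if Met is the last residue
        lengths.append(chain_length - start)
    return lengths


# One methionine at position 10 of a 133-residue chain, as in the abstract.
print(cnbr_fragment_lengths(133, [10]))  # [10, 123]
```

In general, n internal methionines give n + 1 fragments, which is why a single methionine yields exactly the two fragments described.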
