The PIR-International Protein Sequence Database. (Cited by: 1; self-citations: 0; citations by others: 1)
From its origin the Protein Sequence Database has been designed to support research and has focused on comprehensive coverage, quality control and organization of the data in accordance with biological principles. Since 1988 the database has been maintained collaboratively within the framework of PIR-International, an association of macromolecular sequence data collection centers dedicated to fostering international cooperation as an essential element in the development of scientific databases. The database is widely distributed and is available on the World Wide Web, by FTP and email server, and on CD-ROM and magnetic media. It is widely redistributed and incorporated into many other protein sequence data compilations, including SWISS-PROT and the Entrez system of the NCBI.

The PIR-International Protein Sequence Database. (Cited by: 1; self-citations: 1; citations by others: 0)
PIR-International is an association of macromolecular sequence data collection centers dedicated to fostering international cooperation as an essential element in the development of scientific databases. A major objective of PIR-International is to continue the development of the Protein Sequence Database as an essential public resource for protein sequence information. This paper briefly describes the architecture of the Protein Sequence Database and how it and associated data sets are distributed and can be accessed electronically.

The PIR-International databases. (Cited by: 11; self-citations: 8; citations by others: 3)
PIR-International is an association of macromolecular sequence data collection centers dedicated to fostering international cooperation as an essential element in the development of scientific databases. PIR-International is most noted for the Protein Sequence Database. This database originated in the early 1960s with the pioneering work of the late Margaret Dayhoff as a research tool for the study of protein evolution and intersequence relationships; it is maintained as a scientific resource, organized by biological concepts, using sequence homology as a guiding principle. PIR-International also maintains a number of other genomic, protein sequence, and sequence-related databases. The databases of PIR-International are made widely available. This paper briefly describes the architecture of the Protein Sequence Database, a number of other PIR-International databases, and the mechanisms for accessing and distributing these databases.

A universal scale of sequence conservation has recently been introduced, based on the omnipresence of protein sequence motifs across species. A large spectrum of short sequences, up to eight residues long, has been found to reside in all or almost all prokaryotic organisms. This discovery introduces a fundamentally novel quantitative approach to the problem of reconstructing the last universal common ancestor (LUCA). The most conserved elements (protein modules) with defined structures and sequences harboring the omnipresent motifs are outlined in this work by combining sequence and protein crystal structure data. The structurally conserved modules involve 25–30 amino acid residues and have the appearance of closed loops (loop-n-lock structures). This confirms earlier conclusions on the loop-fold structure of globular proteins. Many of the topmost conserved modules represent the primary closed-loop prototypes that have been derived by whole-genome sequence searches. The data presented thus form a basis for further developments toward the earliest stages of protein evolution. [Reviewing Editor: Dr. Martin Kreitman]

The three-dimensional structure of a protein molecule appears to depend on the amino acid sequence of the protein in an as yet incompletely described manner. If the amino acid sequence is replaced by a numerical sequence of values representing a physical or chemical property of amino acids, the resulting numerical sequence is amenable to autocorrelation analysis. Further, if certain geometrical parameters are calculated from the three-dimensional structure of a protein to form a configurational series, pairs of property series and configurational series can be analyzed by cross-correlation techniques. The database for the analysis comprised the three-dimensional structures of ten proteins as determined by X-ray crystallography. Such analysis yields the result that the hydrophobicity of an amino acid residue in a protein influences the orientation angle of the amino acid side chain. This result is consistent with the widely current “oil-drop” model of protein structure. Hydrophobicity also appears to influence the backbone dihedral angle φ, but not ψ. Such a directional effect cannot be explained by a current model of information transfer in protein helices. The magnitude of the cross correlations does not appear to be satisfactory for construction of a transfer function model for the prediction of general features of protein structure from amino acid sequences.
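The property-series idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not name a hydrophobicity scale or a normalization, so the Kyte-Doolittle hydropathy index and a simple normalized lagged correlation are assumed here, and the function names are hypothetical.

```python
from math import sqrt

# Kyte-Doolittle hydropathy index (an assumed scale; the abstract does not name one).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def property_series(sequence):
    """Replace each one-letter residue code by its hydropathy value."""
    return [KD[aa] for aa in sequence]


def cross_correlation(xs, ys, lag):
    """Normalized correlation between xs[i] and ys[i + lag] for a lag >= 0."""
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    norm = sqrt(sum(d * d for d in dx) * sum(d * d for d in dy))
    if norm == 0 or not 0 <= lag < n:
        return 0.0
    return sum(dx[i] * dy[i + lag] for i in range(n - lag)) / norm


def autocorrelation(xs, lag):
    """Autocorrelation is the cross-correlation of a series with itself."""
    return cross_correlation(xs, xs, lag)


if __name__ == "__main__":
    series = property_series("IKIKIKIK")  # toy alternating hydrophobic/charged peptide
    for lag in range(4):
        print(lag, round(autocorrelation(series, lag), 3))
```

For the toy alternating peptide the autocorrelation is negative at odd lags and positive at even lags, which is the kind of periodicity such an analysis is meant to expose. A configurational series (for example, a per-residue angle taken from a crystal structure) could be passed as `ys` to `cross_correlation` in the same way.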

SUMMARY: We present PANAL, an integrated resource for protein sequence analysis. The tool allows the user to simultaneously search a protein sequence for motifs from several databases, and to view the result as an intuitive graphical summary.

The Protein Information Resource (PIR) is an integrated public resource of protein informatics that supports genomic and proteomic research and scientific discovery. PIR maintains the Protein Sequence Database (PSD), an annotated protein database containing over 283 000 sequences covering the entire taxonomic range. Family classification is used for sensitive identification, consistent annotation, and detection of annotation errors. The superfamily curation defines signature domain architecture and categorizes memberships to improve automated classification. To increase the amount of experimental annotation, the PIR has developed a bibliography system for literature searching, mapping, and user submission, and has conducted retrospective attribution of citations for experimental features. PIR also maintains NREF, a non-redundant reference database, and iProClass, an integrated database of protein family, function, and structure information. PIR-NREF provides a timely and comprehensive collection of protein sequences, currently consisting of more than 1 000 000 entries from PIR-PSD, SWISS-PROT, TrEMBL, RefSeq, GenPept, and PDB. The PIR web site (http://pir.georgetown.edu) connects data analysis tools to underlying databases for information retrieval and knowledge discovery, with functionalities for interactive queries, combinations of sequence and text searches, and sorting and visual exploration of search results. The FTP site provides free download for PSD and NREF biweekly releases and auxiliary databases and files.

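Through in vitro manipulation, the cowpea trypsin inhibitor (cpti) gene was modified to produce a fusion protein gene (sck). Starting from the cpti gene, a signal peptide coding sequence was added at the 5' end and an endoplasmic reticulum (ER) retention signal coding sequence at the 3' end, with the aim of directing the translation product into the ER and ultimately retaining it in the ER and in ER-derived protein bodies. Tobacco (Nicotiana tabacum L.) was transformed with the sck gene, and the resulting transgenic plants were assayed by ELISA. The CpTI protein content of transgenic tobacco carrying the modified gene was markedly increased: on average two times higher than in tobacco transformed with the unmodified cpti gene, and more than four times higher in the best individual plant. The insect resistance of the transgenic plants was also significantly improved. These results indicate that targeting a foreign protein to a specific subcellular location can greatly increase its accumulation in transgenic plant cells, a strategy of broad reference value for plant genetic engineering.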

Protein structure alignment (Cited by: 22; self-citations: 0; citations by others: 22)
A new method of comparing protein structures is described, based on distance plot analysis. It is relatively insensitive to insertions and deletions in sequence and is tolerant of the displacement of equivalent substructures between the two molecules being compared. When presented with the co-ordinate sets of two structures, the method will produce automatically an alignment of their sequences based on structural criteria. The method uses the dynamic programming optimization technique, which is widely used in the comparison of protein sequences and thus unifies the techniques of protein structure and sequence comparison. Typical structure comparison problems were examined and the results of the new method compared to the published results obtained using conventional methods. In most examples, the new method produced a result that was equivalent, and in some cases superior, to those reported in the literature.
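The dynamic programming step mentioned in the abstract above can be sketched generically. This is a plain global alignment (Needleman-Wunsch style) over an arbitrary pairwise score, not the paper's method: how residue-to-residue scores are derived from distance plots is not given in the abstract, so the `score` callable, the gap penalty and the toy identity score in the usage example are all assumptions.

```python
def align(n, m, score, gap=-1.0):
    """Globally align items 0..n-1 of A with items 0..m-1 of B.

    score(i, j) is the similarity of item i of A and item j of B
    (hypothetical; a structural method would compute it from residue
    environments). Returns (best total score, list of aligned (i, j) pairs).
    """
    # best[i][j] = best score for aligning the first i items of A
    # with the first j items of B.
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = i * gap
    for j in range(1, m + 1):
        best[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j - 1] + score(i - 1, j - 1),  # pair i-1 with j-1
                best[i - 1][j] + gap,                      # gap in B
                best[i][j - 1] + gap,                      # gap in A
            )
    # Trace back from the bottom-right corner to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + score(i - 1, j - 1):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best[n][m], pairs


if __name__ == "__main__":
    a, b = "ACGT", "AGT"
    identity = lambda i, j: 1.0 if a[i] == b[j] else -1.0  # toy score only
    print(align(len(a), len(b), identity))
```

Because the recursion only sees `score(i, j)`, the same routine serves sequence comparison (identity or substitution scores) and structure comparison (scores computed from coordinates), which is the unification the abstract refers to.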

Starting from experimental data on sequence, structure or biochemical properties of enzymes, protein design seeks to construct enzymes with desired activity, stability, specificity and selectivity. Two strategies are widely used to investigate sequence-structure-function relationships: statistical methods to analyse protein families or mutant libraries, and molecular modelling methods to study proteins and their interaction with ligands or substrates. On the basis of these methods, protein design has been successfully applied to fine-tune bottleneck enzymes in metabolic engineering and to design enzymes with new substrate spectra and new functions. However, constructing efficient metabolic pathways by integrating individual enzymes into a complex system is challenging. The field of synthetic biology is still in its infancy, but promising results have demonstrated the feasibility and usefulness of the concept.

Protein kinases are the most common protein domains implicated in cancer, where somatically acquired mutations are known to be functionally linked to a variety of cancers. Resequencing studies of protein kinase coding regions have emphasized the importance of sequence and structure determinants of cancer-causing kinase mutations in understanding the mutation-dependent activation process. We have developed an integrated bioinformatics resource, which consolidates and maps all currently available information on genetic modifications in protein kinase genes with sequence, structure and functional data. The integration of diverse data types provided a convenient framework for kinome-wide study of sequence-based and structure-based signatures of cancer mutations. The database-driven analysis has revealed a differential enrichment of SNP categories in functional regions of the kinase domain, demonstrating that a significant number of cancer mutations could fall at structurally equivalent positions (mutational hotspots) within the catalytic core. We have also found that structurally conserved mutational hotspots can be shared by multiple kinase genes and are often enriched in cancer driver mutations with high oncogenic activity. Structural modeling and energetic analysis of the mutational hotspots have suggested a common molecular mechanism of kinase activation by cancer mutations, and have allowed the experimental data to be reconciled. According to the proposed mechanism, the structural effect of kinase mutations with a high oncogenic potential may manifest in a significant destabilization of the autoinhibited kinase form, which is likely to drive tumorigenesis at some level. Structure-based functional annotation and prediction of cancer mutation effects in protein kinases can facilitate an understanding of the mutation-dependent activation process and inform experimental studies exploring the molecular pathology of tumorigenesis.

The Human Protein Reference Database (HPRD) is a rich resource of experimentally proven features of human proteins. Protein information in HPRD includes protein-protein interactions, post-translational modifications, enzyme/substrate relationships, disease associations, tissue expression, and subcellular localization of human proteins. Although protein-protein interaction data from HPRD have been widely used by the scientific community, its phosphoproteome data have not been exploited to their full potential. HPRD is one of the largest documentations of human phosphoproteins in the public domain. Currently, phosphorylation data in HPRD comprise 95,016 phosphosites mapped onto 13,041 proteins. Additionally, enzyme-substrate reactions responsible for 5930 phosphorylation events were also documented. Significant improvements in technologies and high-throughput platforms in biomedical investigations have led to an exponential increase of biological data and phosphoproteomic data in recent years. Human Proteinpedia, a community annotation portal developed by us, has also contributed to the significant increase in phosphoproteomic data in HPRD. A large number of phosphorylation events have been mapped onto reference sequences available in HPRD and Human Proteinpedia along with associated protein features. This will provide a platform for systems biology approaches to determine the role of protein phosphorylation in protein function, cell signaling and biological processes, and its implications in human diseases. This review aims to provide a composite view of phosphoproteomic data pertaining to human proteins in HPRD and Human Proteinpedia.

Numerous proteins initiate their folding, localization, and modifications early during translation, and emerging data show that the ribosome actively participates in diverse protein biogenesis pathways. Here we show that the ribosome imposes an additional layer of substrate selection during N-terminal methionine excision (NME), an essential protein modification in bacteria. Biochemical analyses show that cotranslational NME is exquisitely sensitive to a hydrophobic signal sequence or transmembrane domain near the N terminus of the nascent polypeptide. The ability of the nascent chain to access the active site of NME enzymes dictates NME efficiency, which is inhibited by confinement of the nascent chain on the ribosome surface and exacerbated by signal recognition particle. In vivo measurements corroborate the inhibition of NME by an N-terminal hydrophobic sequence, suggesting the retention of formylmethionine on a substantial fraction of the secretory and membrane proteome. Our work demonstrates how molecular features of a protein regulate its cotranslational modification and highlights the active participation of the ribosome in protein biogenesis pathways via interactions of the ribosome surface with the nascent protein.

The Protein Circular Dichroism Data Bank (PCDDB) [https://pcddb.cryst.bbk.ac.uk] is an established resource for the biological, biophysical, chemical, bioinformatics, and molecular biology communities. It is a freely-accessible repository of validated protein circular dichroism (CD) spectra and associated sample and metadata, with entries having links to other bioinformatics resources including, amongst others, structure (PDB), AlphaFold, and sequence (UniProt) databases, as well as to published papers which produced the data and cite the database entries. It includes primary (unprocessed) and final (processed) spectral data, which are available in both text and pictorial formats, as well as detailed sample and validation information produced for each of the entries. Recently the metadata content associated with each of the entries, as well as the number and structural breadth of the protein components included, have been expanded. The PCDDB includes data on both wild-type and mutant proteins, and because CD studies primarily examine proteins in solution, it also contains examples of the effects of different environments on their structures, plus thermal unfolding/folding series. Methods for both sequence and spectral comparisons are included. The data included in the PCDDB complement results from crystal, cryo-electron microscopy, NMR spectroscopy, bioinformatics characterisations and classifications, and other structural information available for the proteins via links to other databases. The entries in the PCDDB have been used for the development of new analytical methodologies, for interpreting spectral and other biophysical data, and for providing insight into structures and functions of individual soluble and membrane proteins and protein complexes.

Whitmore L, Janes RW, Wallace BA. Chirality, 2006, 18(6): 426-429
The Protein Circular Dichroism Data Bank (PCDDB) is a new deposition data bank for validated circular dichroism spectra of biomacromolecules. Its aim is to be a resource for the structural biology and bioinformatics communities, providing open access and archiving facilities for circular dichroism and synchrotron radiation circular dichroism spectra. It is named in parallel with the Protein Data Bank (PDB), a long-existing valuable reference data bank for protein crystal and NMR structures. In this article, we discuss the design of the data bank structure and the deposition website located at http://pcddb.cryst.bbk.ac.uk. Our aim is to produce a flexible and comprehensive archive, which enables user-friendly spectral deposition and searching. In the case of a protein whose crystal structure and sequence are known, the PCDDB entry will be linked to the appropriate PDB and sequence data bank files, respectively. It is anticipated that the PCDDB will provide a readily accessible biophysical catalogue of information on folded proteins that may be of value in structural genomics programs, for quality control and archiving in industrial and academic labs, as a resource for programs developing spectroscopic structural analysis methods, and in bioinformatics studies.

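Xia Binbin (夏彬彬), Wang Jun (王军). Chinese Journal of Biotechnology (生物工程学报), 2021, 37(11): 3863-3879
With the massive accumulation of protein sequence and structure data, and with a large body of descriptive information already in hand, researchers urgently need ways to use these data effectively: to extract information efficiently from existing data and apply it to downstream tasks. Protein design frees the development of new proteins from the constraints of experimental conditions, which is of great significance for drug target prediction, drug discovery, materials design and related fields. Deep learning, as an efficient method of extracting features from data, can be used to model protein data and then to design proteins by incorporating prior information. Deep-learning-based protein design has therefore become a research field with broad prospects. This article reviews deep-learning-based methods for modelling and designing protein sequence and structure data, detailing their strategies, principles, scope of application and application examples. It also discusses the prospects and limitations of deep learning in this field, with the aim of providing a reference for related research.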

Protein sequence databases (Cited by: 2; self-citations: 0; citations by others: 2)
A variety of protein sequence databases exist, ranging from simple sequence repositories, which store data with little or no manual intervention in the creation of the records, to expertly curated universal databases that cover all species and in which the original sequence data are enhanced by the manual addition of further information in each sequence record. As the focus of researchers moves from the genome to the proteins encoded by it, these databases will play an even more important role as central comprehensive resources of protein information. Several of the leading protein sequence databases are discussed here, with special emphasis on the databases now provided by the Universal Protein Knowledgebase (UniProt) consortium.

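Protein structure prediction is important for understanding the structural composition and biological function of proteins, and secondary structure prediction is a key step in it. Since the position-specific scoring matrix (PSSM) came into wide use for encoding protein primary sequences as input samples, each residue can be represented as a two-dimensional data plane, and this work therefore attempts to train a convolutional neural network (CNN) on that representation. A second CNN is also designed, in which long short-term memory (LSTM) networks sense the horizontal and vertical features of the final convolutional feature maps and, together with the CNN's fully connected layer, perform the classification. Finally, the two kinds of CNN model are combined by an ensemble method; the final ensemble contains six models of the two CNN types and achieves a Q3 of 77.2 on the CB513 protein dataset.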
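For readers unfamiliar with the Q3 figure reported above, a minimal sketch of the metric follows. Q3 is the three-state per-residue accuracy: the percentage of residues whose predicted state (conventionally helix H, strand E or coil C) matches the reference assignment. The function name and the toy strings are illustrative assumptions and are not taken from the paper.

```python
def q3(predicted, observed):
    """Three-state accuracy in percent for two equal-length state strings.

    Each string holds one state letter per residue, e.g. H (helix),
    E (strand) or C (coil).
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("state strings must be non-empty and equally long")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    # Toy example: one of eight residues is mispredicted.
    print(q3("HHHEECCC", "HHHEEECC"))
```

On a benchmark set, the score may be computed over all residues pooled together or averaged per protein; the abstract does not say which convention was used.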

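Endoplasmic reticulum (ER) membrane proteins play important roles in physiological processes such as signal sequence recognition, modification of nascent peptide chains and formation of translocation channels. The translocon-associated protein (TRAP) is a membrane protein widely present in higher eukaryotes; it resides in the ER membrane and acts as a receptor for signal sequences. TRAP selectively recognizes signal sequences and interacts with Sec61 to form an elliptical translocation channel with Sec61 at its core and TRAP extending laterally, thereby targeting nascent peptide chains into the ER lumen. Recent studies have found that TRAP is involved in the pathogenesis of protein conformational diseases, neurodegenerative diseases, tumor metastasis and other disorders. This article reviews recent research on the individual TRAP subunits and their functions.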
