Similar literature
 20 similar documents found (search time: 31 ms)
1.
2.
3.
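Identification of specific motifs in bacterial exotoxin sequences and their Gene Ontology annotation analysis   Total citations: 1, self-citations: 0, citations by others: 1  
[Objective] To identify motifs specific to bacterial exotoxin sequences and thereby better understand the pathogenic mechanisms of exotoxins. [Methods] A database of proteins from non-pathogenic bacteria was constructed, and InterProScan was used to search for motifs in the non-pathogenic bacterial protein sequences in this database and in 89 collected, experimentally confirmed bacterial exotoxin protein sequences. [Results] Among the 89 bacterial exotoxin sequences, 39 exotoxin-specific motifs were identified. [Conclusion] The exotoxin-specific motifs are closely related to exotoxin function and lay a foundation for searching for exotoxin sequences in the genomes of pathogenic bacteria; in addition, Gene Ontology (GO) annotation analysis of these motifs further clarifies the pathogenic mechanisms of bacterial exotoxins.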
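
The comparison described in the abstract above (motifs found in exotoxins but absent from non-pathogenic bacterial proteins) can be read as a set difference. The following is a minimal sketch under that assumption; it is not the authors' pipeline. The function name, the input layout (protein ID mapped to a set of motif IDs, as might be parsed from a motif-search report) and the motif IDs are all hypothetical.

```python
from typing import Dict, Set


def specific_motifs(toxin_hits: Dict[str, Set[str]],
                    background_hits: Dict[str, Set[str]]) -> Set[str]:
    """Return motif IDs seen in at least one toxin and in no background protein."""
    # Pool every motif hit across each protein set.
    toxin_motifs = set().union(*toxin_hits.values())
    background_motifs = set().union(*background_hits.values())
    # "Specific" here means: present in toxins, never in the background set.
    return toxin_motifs - background_motifs


# Hypothetical motif hits (placeholder IDs, not real InterPro accessions).
toxin_hits = {
    "toxin_1": {"MOTIF_A", "MOTIF_B"},
    "toxin_2": {"MOTIF_B", "MOTIF_C"},
}
background_hits = {
    "nonpathogen_1": {"MOTIF_B"},
    "nonpathogen_2": {"MOTIF_D"},
}

print(sorted(specific_motifs(toxin_hits, background_hits)))
# prints ['MOTIF_A', 'MOTIF_C']
```

Whether a motif counts as "specific" depends heavily on how complete the background set is, which is why the abstract stresses building a dedicated non-pathogenic database first.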

4.
The TOPDOM database is a collection of domains and sequence motifs located consistently on the same side of the membrane in alpha-helical transmembrane proteins. The database was created by scanning well-annotated transmembrane protein sequences in the UniProt database with domain- and motif-detection algorithms. The identified domains or motifs were added to the database if they were uniformly annotated on the same side of the membrane across the various proteins in the UniProt database. The information about the location of the collected domains and motifs can be incorporated into constrained topology prediction algorithms, such as HMMTOP, increasing the prediction accuracy. AVAILABILITY: The TOPDOM database and the constrained HMMTOP prediction server are available at http://topdom.enzim.hu. CONTACT: tusi@enzim.hu; lkalmar@enzim.hu.
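
The inclusion rule described above (keep a domain only if it is annotated on the same side of the membrane in every protein where it occurs) can be sketched as a simple grouping filter. This is an illustrative sketch, not the TOPDOM implementation. The tuple layout, the side labels, the domain names and the `min_proteins` threshold are assumptions introduced here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def consistent_domains(annotations: Iterable[Tuple[str, str, str]],
                       min_proteins: int = 2) -> Dict[str, str]:
    """Map each domain to its membrane side if that side is the same everywhere.

    annotations: (protein_id, domain_id, side) records, side being e.g.
    "inside" or "outside". min_proteins is a hypothetical support threshold.
    """
    sides = defaultdict(set)     # domain -> set of observed sides
    proteins = defaultdict(set)  # domain -> set of proteins carrying it
    for protein, domain, side in annotations:
        sides[domain].add(side)
        proteins[domain].add(protein)
    return {
        domain: next(iter(observed))
        for domain, observed in sides.items()
        if len(observed) == 1 and len(proteins[domain]) >= min_proteins
    }


# Hypothetical annotation records.
records = [
    ("P1", "domX", "inside"),
    ("P2", "domX", "inside"),
    ("P1", "domY", "inside"),
    ("P3", "domY", "outside"),  # conflicting side, so domY is dropped
    ("P4", "domZ", "outside"),  # only one supporting protein, so dropped
]

print(consistent_domains(records))
# prints {'domX': 'inside'}
```

A domain that passes this filter can then act as a fixed-side constraint for a topology predictor, which is the use the abstract describes for HMMTOP.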

5.
In addition to large domains, many short motifs mediate functional post-translational modification of proteins as well as protein-protein interactions and protein trafficking functions. We have constructed a motif database comprising 312 unique motifs and a web-based tool for identifying motifs in proteins. Functional motifs predicted by MnM can be ranked by several approaches, and we validated these scores by analyzing thousands of confirmed examples and by confirming prediction of previously unidentified 14-3-3 motifs in EFF-1.

6.
The iProClass database is an integrated resource that provides comprehensive family relationships and structural and functional features of proteins, with rich links to various databases. It is extended from ProClass, a protein family database that integrates PIR superfamilies and PROSITE motifs. iProClass currently consists of more than 200,000 non-redundant PIR and SWISS-PROT proteins organized into more than 28,000 superfamilies, 2600 domains, 1300 motifs and 280 post-translational modification sites, with links to more than 30 databases of protein families, structures, functions, genes, genomes, literature and taxonomy. Protein and family summary reports provide rich annotations, including membership information with length, taxonomy and keyword statistics, full family relationships, comprehensive enzyme and PDB cross-references and graphical feature display. The database facilitates classification-driven annotation for protein sequence databases and complete genomes, and supports structural and functional genomic research. iProClass is implemented in the Oracle 8i object-relational system and is available for sequence search and report retrieval at http://pir.georgetown.edu/iproclass/.

7.
F-box proteins constitute a large family in eukaryotes and are characterized by a conserved F-box motif (approximately 40 amino acids). As components of the Skp1p-cullin-F-box complex, F-box proteins are critical for the controlled degradation of cellular proteins. We have identified 687 potential F-box proteins in rice (Oryza sativa), the model monocotyledonous plant, by a reiterative database search. Computational analysis revealed the presence of several other functional domains, including leucine-rich repeats, kelch repeats, F-box associated domain, domain of unknown function, and tubby domain in F-box proteins. Based upon their domain composition, they have been classified into 10 subfamilies. Several putative novel conserved motifs have been identified in F-box proteins that do not contain any other known functional domain. An analysis of a complete set of F-box proteins in rice is presented, including classification, chromosomal location, conserved motifs, and phylogenetic relationship. It appears that the expansion of the F-box family in rice might have occurred, in large part, through localized gene duplications. Furthermore, comprehensive digital expression analysis of F-box protein-encoding genes has been complemented with microarray analysis. The results reveal specific and/or overlapping expression of rice F-box protein-encoding genes during floral transition as well as panicle and seed development. At least 43 F-box protein-encoding genes have been found to be differentially expressed in rice seedlings subjected to different abiotic stress conditions. The expression of several F-box protein-encoding genes is also influenced by light. The structure and function of F-box proteins in plants are discussed in light of these results and the published information. These data will be useful for prioritization of F-box proteins for functional validation in rice.

8.
9.
SUMMARY: The database of structural motifs in proteins (DSMP) contains data relevant to helices, beta-turns, gamma-turns, beta-hairpins, psi-loops, beta-alpha-beta motifs, beta-sheets, beta-strands and disulphide bridges extracted from all proteins in the Protein Data Bank, primarily using the PROMOTIF program, and is implemented as a web-based network service using the SRS. The data corresponding to the structural motifs include: sequence, position in polypeptide chain, geometry, type, unique code, keywords and resolution of crystal structure. These data are available for a representative data set of 1028 protein chains and also for all 10 213 proteins in the Protein Data Bank. The three-dimensional coordinates for all structural motifs (except sheet and disulphide bridge) are also available for the representative data set. Using features in SRS, DSMP can be queried to extract information from one or more structural motifs that may be useful for sequence-structure analysis, prediction, modelling or design. AVAILABILITY: http://www.cdfd.org.in/dsmp.html

10.
We present the first release of a database devoted to the ATP-binding cassette (ABC) protein domains (ABCdb). The ABC proteins are involved in a wide variety of physiological processes in Archaea, Bacteria and Eucaryota, where they are encoded by large families of paralogous genes. The majority of ABC domains energize the transport of compounds across membranes. In bacteria, ABC transporters are involved in the uptake of a wide range of molecules and in mechanisms of virulence and antibiotic resistance. In eukaryotes, most of them are involved in drug resistance, and in human cells many are associated with diseases. Sequence analysis reveals that members of the ABC superfamily can be organized into sub-families and suggests that they have diverged from common ancestral forms. In this release, ABCdb includes the inventory and assembly of the ABC transporter systems of completely sequenced genomes. In addition to the protein entries, the database comprises information on functional domains, sequence motifs, predicted trans-membrane segments, and signal peptides. It also includes a classification of the ABC systems into sub-families as well as a classification of the different partners of the systems. Evolutionary trees and specific sequence patterns are provided for each sub-family. The database is endowed with a powerful query system and has been interfaced with the blastP2 program for similarity searches. ABCdb has been developed in the ACeDB format, a database system developed by Jean Thierry-Mieg and Richard Durbin. ABCdb can be accessed via the World Wide Web (http://ir2lcb.cnrs-mrs.fr/ABCdb/).

11.
12.
Prediction of short linear protein binding regions   Total citations: 1, self-citations: 0, citations by others: 1  
Short linear motifs in proteins (typically 3-12 residues in length) play key roles in protein-protein interactions by frequently binding specifically to peptide binding domains within interacting proteins. Their tendency to be found in disordered segments of proteins has meant that they have often been overlooked. Here we present SLiMPred (short linear motif predictor), the first general de novo method designed to computationally predict such regions in protein primary sequences independent of experimentally defined homologs and interactors. The method applies machine learning techniques to predict new motifs based on annotated instances from the Eukaryotic Linear Motif database, as well as structural, biophysical, and biochemical features derived from the protein primary sequence. We have integrated these data sources and benchmarked the predictive accuracy of the method, and found that it performs equivalently to a predictor of protein binding regions in disordered regions, in addition to having predictive power for other classes of motif sites such as polyproline II helix motifs and short linear motifs lying in ordered regions. It will be useful in predicting peptides involved in potential protein associations and will aid in the functional characterization of proteins, especially of proteins lacking experimental information on structures and interactions. We conclude that, despite the diversity of motif sequences and structures, SLiMPred is a valuable tool for prioritizing potential interaction motifs in proteins.

13.
We applied an automatic and unsupervised system to a nearly complete database of mammalian odor receptor genes. The generated motifs and gene classification were subjected to extensive and systematic downstream analysis to obtain biological insights. Two major results from this analysis were: (1) a map of sequence motifs that may correlate with function and (2) the corresponding receptor classes in which members of each class are likely to share specific functions. We have discovered motifs that have been implicated in structural integrity and posttranslational modification, as well as motifs very likely to be directly involved in ligand binding. We further propose a combinatorial molecular hypothesis, based on unique combinations of the observed motifs, that provides a foundation for understanding the generation of a large number of ligand binding sites.

14.
PRINTS-S: the database formerly known as PRINTS   Total citations: 10, self-citations: 0, citations by others: 10  
The PRINTS database houses a collection of protein family fingerprints. These are groups of motifs that together are diagnostically more potent than single motifs by virtue of the biological context afforded by matching motif neighbours. Around 1200 fingerprints have now been created and stored in the database. The September 1999 release (version 24.0) encodes approximately 7200 motifs, covering a range of globular and membrane proteins, modular polypeptides and so on. In addition to its continued steady growth, we report here several major changes to the resource, including the design of an automated strategy for database maintenance, and implementation of an object-relational schema for more efficient data management. The database is accessible for BLAST, fingerprint and text searches at http://www.bioinf.man.ac.uk/dbbrowser/PRINTS/
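
The idea that a group of motifs is more diagnostic than any single motif can be illustrated with a toy matcher that reports a hit only when every motif of a fingerprint is found, in order, in the same sequence. This is a simplified sketch and not how PRINTS scores fingerprints: regular expressions are substituted here for real motif models, the matching is greedy (leftmost hit for each motif), and the patterns and sequences are invented.

```python
import re
from typing import List, Optional


def match_fingerprint(sequence: str, motifs: List[str]) -> Optional[List[int]]:
    """Return motif start positions if all motifs occur in order, else None."""
    positions = []
    cursor = 0  # the next motif must start at or after this index
    for pattern in motifs:
        hit = re.compile(pattern).search(sequence, cursor)
        if hit is None:
            return None  # one missing neighbour invalidates the fingerprint
        positions.append(hit.start())
        cursor = hit.end()
    return positions


# Hypothetical three-motif fingerprint and test sequences.
fingerprint_motifs = ["GKS", "DEAD", "W.F"]

print(match_fingerprint("MAGKSTTLDEADQQWAFK", fingerprint_motifs))
# prints [2, 8, 14]
print(match_fingerprint("MAGKSTT", fingerprint_motifs))
# prints None
```

The second sequence contains the first motif but is rejected, which is the "context from matching neighbours" effect in miniature: a short pattern that occurs by chance is unlikely to be accompanied by all of its partners in the right order.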

15.
Chromosomal translocations involving genes coding for members of the HMG-I(Y) family of "high mobility group" non-histone chromatin proteins (HMG-I, HMG-Y, and HMG-IC) have been observed in numerous types of human tumors. Many of these gene rearrangements result in the creation of chimeric proteins in which the DNA-binding domains of the HMG-I(Y) proteins, the so-called A.T-hook motifs, have been fused to heterologous peptide sequences. Although little is known about either the structure or biophysical properties of these naturally occurring fusion proteins, the suggestion has been made that such chimeras have probably assumed an altered in vivo DNA-binding specificity due to the presence of the A.T-hook motifs. To investigate this possibility, we performed in vitro "domain-swap" experiments using a model protein fusion system in which a single A.T-hook peptide was exchanged for a corresponding length peptide in the well characterized "B-box" DNA-binding domain of the HMG-1 non-histone chromatin protein. Here we report that chimeric A.T-hook/B-box hybrids exhibit in vitro DNA-binding characteristics resembling those of wild type HMG-I(Y) protein, rather than the HMG-1 protein. These results strongly suggest that the chimeric fusion proteins produced in human tumors as a result of HMG-I(Y) gene chromosomal translocations also retain A.T-hook-imparted DNA-binding properties in vivo.

16.
Domain databases are essential for research on domain properties. Eliminating redundant information from database queries is very important for database quality. Here we report the manual construction of a non-redundant human SH2 domain database. There are 119 human SH2 domains in 110 SH2-containing proteins. Human SH2 domains were aligned with ClustalX, and a homology tree was generated. In this tree, proteins with similar known functions were classified into the same group. Some proteins in the same group have been reported experimentally to have similar binding motifs. The tree might provide clues about possible functions of hypothetical proteins for further experimental verification.

17.
18.
19.
Do LH, Bier E. Bioinformation, 2011, 6(2): 83-85
Redundancy among sequence identifiers is a recurring problem in bioinformatics. Here, we present a rapid and efficient method of fingerprinting identifiers to ascertain whether two or more aliases are identical. A number of tools and approaches have been developed to resolve differing names for the same genes and proteins; however, these methods each have their own limitations associated with their various goals. We have taken a different approach to the aliasing problem by simplifying the way aliases are stored and curated with the objective of simultaneously achieving speed and flexibility. Our approach (Booly-hashing) is to link identifiers with their corresponding hash keys derived from unique fingerprints such as gene or protein sequences. This tool has proven invaluable for designing a new data integration platform known as Booly, and has wide applicability to situations in which a dedicated efficient aliasing system is required. Compared with other aliasing techniques, Booly-hashing methodology provides 1) reduced run time complexity, 2) increased flexibility (aliasing of other data types, e.g. pharmaceutical drugs), 3) no required assumptions regarding gene clusters or hierarchies, and 4) simplicity in data addition, updating, and maintenance. The new Booly-hashing aliasing model has been incorporated as a central component of the Booly data integration platform we have recently developed and should be broadly applicable to other situations in which an efficient streamlined aliasing system is required. This aliasing tool and database, which allows users to quickly group the same genes and proteins together, can be accessed at: http://booly.ucsd.edu/alias. AVAILABILITY: The database is available for free at http://booly.ucsd.edu/alias.
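
The core idea in the abstract above, linking each identifier to a hash key derived from its sequence so that aliases of the same sequence share a key, can be sketched with a pair of dictionaries. This is a minimal illustration and not the Booly implementation. The choice of SHA-256, the whitespace and case normalization, the `AliasIndex` class and the identifiers are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Set


def fingerprint(sequence: str) -> str:
    """Hash a sequence after removing whitespace and upper-casing it."""
    normalized = "".join(sequence.split()).upper()
    # SHA-256 is an assumption; the abstract does not name a hash function.
    return hashlib.sha256(normalized.encode("ascii")).hexdigest()


class AliasIndex:
    """Hypothetical alias store: identifier -> hash key -> identifiers."""

    def __init__(self) -> None:
        self._key_of: Dict[str, str] = {}
        self._ids_of: Dict[str, Set[str]] = defaultdict(set)

    def add(self, identifier: str, sequence: str) -> None:
        key = fingerprint(sequence)
        self._key_of[identifier] = key
        self._ids_of[key].add(identifier)

    def same(self, a: str, b: str) -> bool:
        """True if both identifiers are known and share a hash key."""
        key = self._key_of.get(a)
        return key is not None and key == self._key_of.get(b)

    def aliases(self, identifier: str) -> Set[str]:
        """All identifiers sharing this identifier's key (itself included)."""
        key = self._key_of.get(identifier)
        return set(self._ids_of[key]) if key is not None else set()


# Hypothetical identifiers and sequences.
index = AliasIndex()
index.add("db1:geneX", "MKT LLV")
index.add("db2:GENE_X", "mktllv")   # same sequence, different formatting
index.add("db1:geneY", "MKTAAA")

print(index.same("db1:geneX", "db2:GENE_X"))  # prints True
print(sorted(index.aliases("db1:geneX")))     # prints ['db1:geneX', 'db2:GENE_X']
```

Both lookups are dictionary accesses, so checking whether two names are aliases does not require comparing sequences or traversing cluster hierarchies, which is consistent with the run-time and maintenance advantages the abstract lists. A hash-based key only groups exactly identical fingerprints, so sequence variants would receive different keys in this sketch.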

20.
Rice is not only a major food staple for the world's population but is also a model species for a major group of flowering plants, the monocotyledonous plants. Draft genomic sequences of two subspecies of rice, Oryza sativa ssp. japonica and ssp. indica, are publicly available. To provide the community with a resource to data-mine the rice genome, we have constructed an annotation resource for rice (http://www.tigr.org/tdb/e2k1/osa1/). In this resource, we have annotated the rice genome for gene content, identified motifs/domains within the predicted genes, constructed a rice repeat database, identified related sequences in other plant species, and identified syntenic sequences between rice and maize. All of the data are available through web-based interfaces, FTP downloads, and a Distributed Annotation System.
