Similar literature
 20 similar articles found (search time: 15 ms)
1.
Enhanced genome annotation using structural profiles in the program 3D-PSSM   Total citations: 31, self-citations: 0, citations by others: 31
A method (three-dimensional position-specific scoring matrix, 3D-PSSM) to recognise remote protein sequence homologues is described. The method combines the power of multiple sequence profiles with knowledge of protein structure to provide enhanced recognition and thus functional assignment of newly sequenced genomes. The method uses structural alignments of homologous proteins of similar three-dimensional structure in the structural classification of proteins (SCOP) database to obtain a structural equivalence of residues. These equivalences are used to extend multiply aligned sequences obtained by standard sequence searches. The resulting large superfamily-based multiple alignment is converted into a PSSM. Combined with secondary structure matching and solvation potentials, 3D-PSSM can recognise structural and functional relationships beyond state-of-the-art sequence methods. In a cross-validated benchmark on 136 homologous relationships unambiguously undetectable by position-specific iterated basic local alignment search tool (PSI-Blast), 3D-PSSM can confidently assign 18%. The method was applied to the remaining unassigned regions of the Mycoplasma genitalium genome and an additional 13 regions were assigned with 95% confidence. 3D-PSSM is available to the community as a web server: http://www.bmm.icnet.uk/servers/3dpssm Copyright 2000 Academic Press.
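
The step in which a multiple alignment "is converted into a PSSM" can be illustrated with a minimal sketch. This is not the 3D-PSSM implementation: the uniform background frequency, the pseudocount value, and the function names `build_pssm` and `score_sequence` are assumptions made for illustration, and the structure-based extension of the alignment described in the abstract is not modelled.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Return one dict of log-odds scores (base 2) per alignment column.

    Assumptions (not taken from the abstract): a uniform background
    frequency and a simple additive pseudocount.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        # Gap characters and unknown residues are simply not counted.
        counts = Counter(seq[col] for seq in alignment if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def score_sequence(pssm, sequence):
    """Sum the per-column scores for an ungapped sequence of matching length."""
    if len(sequence) != len(pssm):
        raise ValueError("sequence length must match the number of PSSM columns")
    return sum(column[aa] for column, aa in zip(pssm, sequence))
```

For example, `build_pssm(["ACD", "ACE", "AGD"])` gives a three-column matrix in which `A` scores highest in the first column, and `score_sequence` ranks `"ACD"` above an unrelated sequence such as `"WWW"`.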

2.
Protein interaction in cells can be described at different levels. At a low interaction level, proteins function together in small, stable complexes and at a higher level, in sets of interacting complexes. All interaction levels are crucial for the living organism, and one of the challenges in proteomics is to measure the proteins at their different interaction levels. One common method for such measurements is immunoprecipitation followed by mass spectrometry (IP/MS), which has the potential to probe the different protein interaction forms. However, IP/MS data are complex because proteins, in their diverse interaction forms, manifest themselves in different ways in the data. Numerous bioinformatic tools for finding protein complexes in IP/MS data are currently available, but most tools do not provide information about the interaction level of the discovered complexes, and no tool is geared specifically to unraveling and visualizing these different levels. We present a new bioinformatic tool to explore IP/MS datasets for protein complexes at different interaction levels and show its performance on several real-life datasets. Our tool creates clusters that represent protein complexes, but unlike previous methods, it arranges them in a tree-shaped structure, reporting why specific proteins are predicted to build a complex and where it can be divided into smaller complexes. In every data analysis method, parameters have to be chosen. Our method can suggest values for its parameters and comes with adapted visualization tools that display the effect of the parameters on the result. The tools provide fast graphical feedback and allow the user to interact with the data by changing the parameters and examining the result. The tools also allow for exploring the different organizational levels of the protein complexes in a given dataset. Our method is available as GNU-R source code and includes examples at www.bdagroup.nl.

3.
SUMMARY: SPrCY is a web-accessible database that provides a comparison of structure prediction results for the Saccharomyces cerevisiae genome. This web service offers the ability to search, analyze and compare the yeast structural predictions from sequence-only (Superfamily, PDBAA BLAST and Pfam) and sequence-structure-based (SAM-T02, 3D-PSSM, mGenTHREADER) methods. AVAILABILITY: The service is freely available via web at http://agave.wustl.edu/yeast/

4.
MOTIVATION: Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). RESULTS: We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. CONCLUSION: Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. AVAILABILITY AND IMPLEMENTATION: Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com.

5.
To assess the reliability of fold assignments to protein sequences, we developed a fold recognition method called FROST (Fold Recognition-Oriented Search Tool) based on a series of filters and a database specifically designed as a benchmark for this new method under realistic conditions. This benchmark database consists of proteins for which there exists, at least, another protein with an extensively similar 3D structure in a database of representative 3D structures (i.e., more than 65% of the residues in both proteins can be structurally aligned). Because the testing of our method must be carried out under conditions similar to those of real fold recognition experiments, no protein pair with sequence similarity detectable using standard sequence comparison methods such as FASTA is included in the benchmark database. While using FROST, we achieved a coverage of 60% for a rate of error of 1%. To obtain a baseline for our method, we used PSI-BLAST and 3D-PSSM. Under the same conditions, for a 1% error rate, coverages for PSI-BLAST and 3D-PSSM were 33 and 56%, respectively.
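
Figures such as "a coverage of 60% for a rate of error of 1%" come from ranking predictions by score and choosing a cut-off. The abstract does not give FROST's exact definitions, so the sketch below rests on assumptions: error rate is taken as the fraction of accepted predictions that are wrong, coverage as the fraction of all queries that are correctly assigned, and the function name `coverage_at_error_rate` is hypothetical. Tied scores are not treated specially.

```python
def coverage_at_error_rate(predictions, max_error_rate):
    """Best coverage reachable while keeping the error rate within a limit.

    predictions: list of (score, is_correct) pairs, one per query,
                 where a higher score means a more confident prediction.
    Assumed definitions (not taken from the abstract):
      error rate = wrong accepted / all accepted
      coverage   = correct accepted / all queries
    """
    if not predictions:
        raise ValueError("predictions is empty")

    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    total = len(ranked)
    correct = 0
    wrong = 0
    best_coverage = 0.0
    # Accept predictions one at a time from the most confident downwards
    # and keep the best coverage seen while the error limit still holds.
    for _score, is_correct in ranked:
        if is_correct:
            correct += 1
        else:
            wrong += 1
        if wrong / (correct + wrong) <= max_error_rate:
            best_coverage = max(best_coverage, correct / total)
    return best_coverage
```

With four ranked predictions of which the third is wrong, an error limit of 0 allows only the top two to be accepted (coverage 0.5), while a limit of 0.25 allows all four (coverage 0.75).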

6.
Perry J. Proteins, 2005, 61(4): 699-703
Timeless (Tim) and Period (Per) are coordinately synthesized interacting proteins that in response to positional/environmental cues comigrate to the nucleus as obligate heterodimers where they act to suppress their own gene expression as part of the circadian rhythm network in Drosophila. There has been considerable controversy about the structural nature of Tim because of somewhat conflicting results generated by the automated threading algorithm 3D-PSSM. With use of a computer-assisted but largely manual approach, it is shown here that Tim is composed of repetitive structural elements and that those elements comprise two ARM repeat domains, validating the essence of the original 3D-PSSM prediction. Eleven individual ARM structural units are assigned, and a phylogenetic analysis showing their relatedness to one another is performed. In addition, there appears to be a small domain of prenyltransferase-like repeats C-terminal to the second ARM domain, which went undetected in previous analyses. Although we cannot know the precise conformation it adopts until a structure is solved, Tim emerges here as clearly a member of the helical repeat protein superfamily. Given its role in periodic nuclear translocation, Tim may, therefore, have a functional analogy with the photoperiod-responsive protein Phor1 and other karyopherin-like molecules.

7.
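Objective: To clone in silico and analyze the MnSOD gene of the loach. Methods: The open reading frame of the loach MnSOD gene was obtained by in silico cloning, and its amino acid composition, functional domains, phylogeny, signal peptide, secondary structure and tertiary structure were predicted and analyzed. Results: The cDNA of the loach MnSOD gene was obtained by in silico cloning. The open reading frame of the gene is 675 bp and encodes 224 amino acid residues; the deduced protein has a molecular weight of 25 kDa and an isoelectric point of about 8.41. Its coding sequence is highly conserved compared with other species, and it is evolutionarily closest to the MnSOD of freshwater fishes of the order Cypriniformes. Its N-terminus contains a transit peptide, suggesting that the protein is localized in mitochondria. The predicted secondary and tertiary structures of loach MnSOD contain many random coils and α-helices; the N-terminus consists of two long antiparallel helices, and the C-terminus is an α+β structure containing a twisted three-stranded sheet. Conclusion: This study lays a foundation for further molecular biology and antioxidant research in the loach.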

8.
SUMMARY: StrBioLib is a library of Java classes useful for developing software for computational structural biology research. StrBioLib contains classes to represent and manipulate protein structures, biopolymer sequences, sets of biopolymer sequences, and alignments between biopolymers based on either sequence or structure. Interfaces are provided to interact with commonly used bioinformatics applications, including (psi)-blast, modeller, muscle and Primer3, and tools are provided to read and write many file formats used to represent bioinformatic data. The library includes a general-purpose neural network object with multiple training algorithms, the Hooke and Jeeves non-linear optimization algorithm, and tools for efficient C-style string parsing and formatting. StrBioLib is the basis for the Pred2ary secondary structure prediction program, is used to build the astral compendium for sequence and structure analysis, and has been extensively tested through use in many smaller projects. Examples and documentation are available at the site below. AVAILABILITY: StrBioLib may be obtained under the terms of the GNU LGPL license from http://strbio.sourceforge.net/

9.
SUMMARY: With the continuous growth of the RCSB Protein Data Bank (PDB), providing an up-to-date systematic structure comparison of all protein structures poses an ever growing challenge. Here, we present a comparison tool for calculating both 1D protein sequence and 3D protein structure alignments. This tool supports various applications at the RCSB PDB website. First, a structure alignment web service calculates pairwise alignments. Second, a stand-alone application runs alignments locally and visualizes the results. Third, pre-calculated 3D structure comparisons for the whole PDB are provided and updated on a weekly basis. These three applications allow users to discover novel relationships between proteins available either at the RCSB PDB or provided by the user. AVAILABILITY AND IMPLEMENTATION: A web user interface is available at http://www.rcsb.org/pdb/workbench/workbench.do. The source code is available under the LGPL license from http://www.biojava.org. A source bundle, prepared for local execution, is available from http://source.rcsb.org CONTACT: andreas@sdsc.edu; pbourne@ucsd.edu.

10.
A global view of CK2 function and regulation   Total citations: 7, self-citations: 0, citations by others: 7

11.
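The spatial interaction of chromosomes is regarded as an important factor affecting the regulation of gene expression, and high-throughput chromosome conformation capture (Hi-C) has become one of the main experimental approaches in 3D genomics for exploring chromosomal spatial interactions. With the continuing accumulation of Hi-C sample data and the increasing complexity of analysis pipelines, bioinformatics-based Hi-C data analysis presents both an opportunity and a challenge for investigating the spatiotemporal regulatory mechanisms of gene expression. From a bioinformatics perspective, this article comprehensively reviews the current state and development of Hi-C research in China and abroad, including data normalization, multi-level structure analysis, data visualization and three-dimensional modeling, with a focus on A/B compartments, topologically associating domains (TADs) and chromatin looping within the multi-level structure. On this basis, it analyzes likely future research hotspots and trends in this direction, with the aim of supporting the extension of the study of gene expression regulation from the traditional linear space to three-dimensional structural space.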

12.
Con-Struct Map is a graphical tool for the comparative study of protein structures. The tool detects potential conserved residue contacts shared by multiple protein structures by superimposing their contact maps according to a multiple structure alignment. In general, Con-Struct Map allows the study of structural changes resulting from, e.g. sequence substitutions, or alternatively, the study of conserved components of a structure framework across structurally aligned proteins. Specific applications include the study of sequence-structure relationships in distantly related proteins and the comparisons of wild type and mutant proteins. AVAILABILITY: http://pdbrs3.sdsc.edu/ConStructMap/viewer_argument_generator/singleArguments. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
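
The idea of "superimposing contact maps according to a multiple structure alignment" can be sketched as follows. This is an illustrative reading of the abstract, not the Con-Struct Map code: contact maps are assumed to be sets of residue-index pairs, the alignment is assumed to be given as one residue-to-column dictionary per structure, and the names `contacts_in_alignment_columns`, `conserved_contacts` and `min_fraction` are hypothetical.

```python
from collections import Counter


def contacts_in_alignment_columns(contacts, residue_to_column):
    """Re-express residue-residue contacts as pairs of alignment columns.

    Contacts involving a residue that is not aligned are dropped.
    """
    mapped = set()
    for i, j in contacts:
        if i in residue_to_column and j in residue_to_column:
            a, b = residue_to_column[i], residue_to_column[j]
            mapped.add((min(a, b), max(a, b)))
    return mapped


def conserved_contacts(contact_maps, column_maps, min_fraction=1.0):
    """Return column pairs in contact in at least min_fraction of structures.

    contact_maps: one set of (residue, residue) pairs per structure.
    column_maps:  one {residue index: alignment column} dict per structure.
    """
    if len(contact_maps) != len(column_maps):
        raise ValueError("need one column map per contact map")
    counts = Counter()
    for contacts, mapping in zip(contact_maps, column_maps):
        counts.update(contacts_in_alignment_columns(contacts, mapping))
    needed = min_fraction * len(contact_maps)
    return {pair for pair, n in counts.items() if n >= needed}
```

Working in alignment-column coordinates is what makes contacts from structures with different residue numbering comparable; lowering `min_fraction` relaxes the requirement from "present in every structure" to "present in most".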

13.
A huge number of high-quality predicted protein structures are now publicly available. However, many of these structures contain non-globular regions, which diminish the performance of downstream structural bioinformatic applications. In this study, we develop AlphaCutter for the removal of non-globular regions from predicted protein structures. A large-scale cleaning of 542,380 predicted SwissProt structures highlights that AlphaCutter is able to (1) remove non-globular regions that are undetectable using pLDDT scores and (2) preserve high integrity of the cleaned domain regions. As useful applications, AlphaCutter improved the folding energy scores and sequence recovery rates in the re-design of domain regions. On average, AlphaCutter takes less than 3 s to clean a protein structure, enabling efficient cleaning of the exploding number of predicted protein structures. AlphaCutter is available at https://github.com/johnnytam100/AlphaCutter. AlphaCutter-cleaned SwissProt structures are available for download at https://doi.org/10.5281/zenodo.7944483.

14.
Constructing multiple homologous alignments for protein-coding DNA sequences is crucial for a variety of bioinformatic analyses but remains computationally challenging. With the growing amount of sequence data available and the ongoing efforts largely dependent on protein-coding DNA alignments, there is an increasing demand for a tool that can process a large number of homologous groups and generate multiple protein-coding DNA alignments. Here we present a parallel tool, ParaAT, that is capable of constructing multiple protein-coding DNA alignments in parallel for a large number of homologs. As demonstrated on empirical datasets, ParaAT is well suited for large-scale data analysis in the high-throughput era, providing good scalability and exhibiting high parallel efficiency for computationally demanding tasks. ParaAT is freely available for academic use only at http://cbb.big.ac.cn/software.

15.
PIMWalker™     
This article reports on PIMWalker, a free and interactive tool for visualising protein interaction networks. PIMWalker handles the unified molecular interaction (MI) format defined by members of the Proteomics Standards Initiative (the PSI MI format), and it is thus directly and easily usable by bench biologists. PIMWalker also comes with a documented, open-source Java™ application programming interface allowing the bioinformatic programmer to easily extend the functions. AVAILABILITY: PIMWalker is available under a free license from http://pim.hybrigenics.com/pimwalker.

16.
The RPE65 protein is located in the retinal pigment epithelial cells and plays an important role in the visual cycle. Although numerous experimental results demonstrate that it participates in the visual cycle, its detailed structure and function are not clear yet because of difficulties in isolation and crystallization. This paper describes a computational modeling study to propose a three-dimensional (3D) structure and suggest a possible mechanism for the function of the protein. The 3D-PSSM server is used to obtain the preliminary 3D structural model of the RPE65 protein. The coordinates of the side chains are obtained from the SCWRL program. Finally, two software packages, Jackal and Tinker with the CHARMM force field, are used to fix and refine the preliminary structural model. Based on the obtained 3D structural model, a possible mechanism for the protein function is discussed.

17.
Genomics, 2020, 112(2): 1245-1256
Genetic laboratories use custom-commercial targeted next-generation sequencing (tg-NGS) assays to identify disease-causing variants. Although the high coverage achieved with these tests allows for the detection of copy number variants (CNVs), which account for an important proportion of the genetic burden in human diseases, an easy-to-use tool for automatic CNV detection is still lacking. This article presents a new CNV detection tool optimized for tg-NGS data: PattRec. PattRec was evaluated using a wide range of data, and its performance compared with those of other CNV detection tools. The software includes features for selecting optimal controls, discarding polymorphic CNVs prior to analysis, and filtering out deletions based on SNV zygosity, and automatically creates an in-house CNV database. There is no need for high-level bioinformatic expertise, and users can choose color-coded xlsx output that helps to prioritize potentially pathogenic CNVs. PattRec is presented as a Java-based GUI, freely available online: https://github.com/irotero/PattRec.

18.
During evolution of proteins from a common ancestor, one functional property can be preserved while others can vary, leading to functional diversity. A systematic study of the corresponding adaptive mutations provides a key to one of the most challenging problems of modern structural biology: understanding the impact of amino acid substitutions on protein function. The subfamily-specific positions (SSPs) are conserved within functional subfamilies but are different between them and, therefore, seem to be responsible for functional diversity in protein superfamilies. Consequently, a corresponding method to perform the bioinformatic analysis of sequence and structural data has to be implemented in the common laboratory practice to study the structure–function relationship in proteins and develop novel protein engineering strategies. This paper describes the Zebra web server, a powerful remote platform that implements a novel bioinformatic analysis algorithm to study diverse protein families. It is the first application that provides specificity determinants at different levels of functional classification, therefore addressing complex functional diversity of large superfamilies. Statistical analysis is implemented to automatically select a set of highly significant SSPs to be used as hotspots for directed evolution or rational design experiments and analyzed when studying the structure–function relationship. Zebra results are provided in two ways: (1) as a single all-in-one parsable text file and (2) as PyMol sessions with structural representation of SSPs. Zebra web server is available at http://biokinet.belozersky.msu.ru/zebra.
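
The definition of SSPs as positions "conserved within functional subfamilies but ... different between them" can be made concrete with a deliberately strict sketch. This is not Zebra's algorithm, which the abstract says relies on statistical analysis: here a column qualifies only if every subfamily shows exactly one residue type and at least two subfamilies differ. The function name `subfamily_specific_positions` and the label format are assumptions.

```python
def subfamily_specific_positions(alignment, labels):
    """Return 0-based alignment columns that satisfy a strict SSP criterion.

    alignment: list of equal-length aligned sequences.
    labels:    one subfamily label per sequence, in the same order.
    A column is reported when each subfamily has a single residue type
    in that column and the residue type differs between subfamilies.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    if len(alignment) != len(labels):
        raise ValueError("need exactly one label per sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    # Group the sequences by subfamily label.
    groups = {}
    for seq, label in zip(alignment, labels):
        groups.setdefault(label, []).append(seq)

    positions = []
    for col in range(length):
        residues_per_group = [{seq[col] for seq in seqs} for seqs in groups.values()]
        conserved_within = all(len(residues) == 1 for residues in residues_per_group)
        differs_between = len(set().union(*residues_per_group)) > 1
        if conserved_within and differs_between:
            positions.append(col)
    return positions
```

For the toy alignment `["ACDK", "ACDR", "AGEK", "AGER"]` with labels `["x", "x", "y", "y"]`, columns 1 and 2 are reported: column 0 is conserved across the whole family, and column 3 varies inside each subfamily.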

19.
Force-distance (F-D) curves of single membrane proteins reveal information on inter- and intramolecular interactions occurring within a protein and between proteins. However, the analysis of single-molecule force spectroscopy data is a time-consuming and complex process requiring objective criteria. In most cases the user requires additional information to interpret F-D curves. Therefore we developed a software assistant representing the force or molecular interaction pattern and the topology or the 3D structure of the membrane protein. This representation establishes a basis for detailed interpretation of the protein structure and its underlying molecular interactions. Various integrated bioinformatic features further assist in the interpretation of measured and assigned molecular interactions that determine membrane protein folding, structure, stability and function. Web queries and programs about the topology are directly linked. The software also includes motifs, helix types, Venn diagram representations and the complete functionality of the program Jmol. AVAILABILITY: The program MPTV is freely available from the website at http://www.bioforscher.de/mptv.htm/.

20.
iSPOT (http://cbm.bio.uniroma2.it/ispot) is a web tool developed to infer the recognition specificity of protein module families; it is based on the SPOT procedure that utilizes information from position-specific contacts, derived from the available domain/ligand complexes of known structure, and experimental interaction data to build a database of residue-residue contact frequencies. iSPOT is available to infer the interaction specificity of PDZ, SH3 and WW domains. For each family of protein domains, iSPOT evaluates the probability of interaction between a query domain of the specified families and an input protein/peptide sequence and makes it possible to search for potential binding partners of a given domain within the SWISS-PROT database. The experimentally derived interaction data utilized to build the PDZ, SH3 and WW databases of residue-residue contact frequencies are also accessible. Here we describe the application to the WW family of protein modules.
