1.

Motivation

The precise prediction of protein domains, which are the structural, functional and evolutionary units of proteins, has been a research focus in recent years. Although many methods have been presented for predicting protein domains and boundaries, the accuracy of predictions could be improved.

Results

In this study we present a novel approach, DomHR, which is an accurate predictor of protein domain boundaries based on a creative hinge region strategy. A hinge region was defined as a segment of amino acids that covers part of a domain region and a boundary region. We developed a strategy to construct profiles of domain-hinge-boundary (DHB) features generated by sequence-domain/hinge/boundary alignment against a database of known domain structures. The DHB features had three elements: normalized domain, hinge, and boundary probabilities. The DHB features were used as input to identify domain boundaries in a sequence. DomHR used a nonredundant dataset as the training set, the DHB and predicted shape string as features, and a conditional random field as the classification algorithm. In predicted hinge regions, a residue was determined to be a domain or a boundary according to a decision threshold. After decision thresholds were optimized, DomHR was evaluated by cross-validation, large-scale prediction, independent test and CASP (Critical Assessment of Techniques for Protein Structure Prediction) tests. All results confirmed that DomHR outperformed other well-established, publicly available domain boundary predictors for prediction accuracy.
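The thresholding step described above can be sketched as follows. This is a minimal illustration, not DomHR's implementation: the abstract does not say how the three DHB values are normalized or which score is compared with the threshold, so the sum-to-one normalization, the use of a per-residue boundary score, the label letters, the default threshold of 0.5, and the function names `normalize_dhb` and `resolve_hinge` are all assumptions.

```python
def normalize_dhb(domain, hinge, boundary):
    """Scale raw domain/hinge/boundary counts so they sum to 1 (assumed normalization)."""
    total = domain + hinge + boundary
    if total == 0:
        return (0.0, 0.0, 0.0)
    return (domain / total, hinge / total, boundary / total)


def resolve_hinge(labels, boundary_scores, threshold=0.5):
    """Turn hinge residues into domain or boundary residues.

    labels: per-residue labels, 'D' (domain), 'H' (hinge) or 'B' (boundary).
    boundary_scores: per-residue boundary score (hypothetical input).
    A hinge residue becomes 'B' if its score reaches the threshold, else 'D'.
    """
    resolved = []
    for label, score in zip(labels, boundary_scores):
        if label == "H":
            resolved.append("B" if score >= threshold else "D")
        else:
            resolved.append(label)
    return resolved
```

For example, `resolve_hinge(list("DHHB"), [0.1, 0.7, 0.2, 0.9])` leaves the first and last residues unchanged and assigns the two hinge residues to boundary and domain respectively.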

Availability

DomHR is available at http://cal.tongji.edu.cn/domain/.

2.

Background

Targeting conserved bacterial proteins with antibacterial medications has resulted both in the development of resistant strains and in adverse effects on human health through the destruction of beneficial microbes, which eventually become breeding grounds for the evolution of resistance. Despite the availability of more than 800 genome sequences, 430 pathways, 4743 enzymes, 9257 metabolic reactions and three-dimensional (3D) protein structures in bacteria, no pathogen-specific computational drug target identification tool has been developed.

Methods

A web server, UniDrug-Target, has been designed that combines bacterial biological information and computational methods to stringently identify pathogen-specific proteins as drug targets. In addition to predicting the essentiality, chokepoint property, etc. of pathogen-specific proteins, three new algorithms were developed and implemented using protein sequences, domains, structures, and metabolic reactions: construction of partial metabolic networks (PMNs), determination of conservation in critical residues, and variation analysis of residues forming similar cavities in protein sequences. First, PMNs are constructed to determine the extent to which metabolite production is disturbed when a protein is targeted by a drug. Second, conservation of the critical residues of a pathogen-specific protein that are involved in cavity formation and biological function is determined at the domain level with low-matching sequences. Last, variation analysis of residues forming similar cavities in protein sequences from pathogenic versus non-pathogenic bacteria and humans is performed.

Results

The server can predict drug targets for any sequenced pathogenic bacterium for which FASTA sequences and annotated information are available. The utility of the UniDrug-Target server was demonstrated for Mycobacterium tuberculosis (H37Rv). UniDrug-Target identified 265 mycobacterial pathogen-specific proteins, including 17 essential proteins that may be potential drug targets.

Conclusions/Significance

UniDrug-Target is expected to accelerate pathogen-specific drug target identification, which should increase the success and durability of such targets, as drugs developed against them are less likely to lead to resistance or to have an adverse impact on the environment. The server is freely available at http://117.211.115.67/UDT/main.html. The standalone application (source code) is available at http://www.bioinformatics.org/ftp/pub/bioinfojuit/UDT.rar.

3.

Background

Steroidogenic acute regulatory (StAR) protein related lipid transfer (START) domains are small globular modules that form a cavity where lipids and lipid hormones bind. These domains can transport ligands to facilitate lipid exchange between biological membranes, and they have been postulated to modulate the activity of other domains of the protein in response to ligand binding. More than a dozen human genes encode START domains, and several of them are implicated in disease.

Principal Findings

We report crystal structures of the human STARD1, STARD5, STARD13 and STARD14 lipid transfer domains. These represent four of the six functional classes of START domains.

Significance

Sequence alignments based on these and previously reported crystal structures define the structural determinants of human START domains, both those related to structural framework and those involved in ligand specificity.

Enhanced version

This article can also be viewed as an enhanced version in which the text of the article is integrated with interactive 3D representations and animated transitions. Please note that a web plugin is required to access this enhanced functionality. Instructions for the installation and use of the web plugin are available in Text S1.

4.

Background

Studies on the myotonic dystrophy protein kinase (DMPK) gene and gene products have thus far mainly concentrated on the fate of length mutation in the (CTG)n repeat at the DNA level and consequences of repeat expansion at the RNA level in DM1 patients and disease models. Surprisingly little is known about the function of DMPK protein products.

Methodology/Principal Findings

We demonstrate here that transient expression of one major protein product of the human gene, the hDMPK A isoform with a long tail anchor, results in mitochondrial fragmentation and clustering in the perinuclear region. Clustering occurred in a variety of cell types and was enhanced by an intact tubulin cytoskeleton. In addition to morphomechanical changes, hDMPK A expression induces physiological changes like loss of mitochondrial membrane potential, increased autophagy activity, and leakage of cytochrome c from the mitochondrial intermembrane space accompanied by apoptosis. Truncation analysis using YFP-hDMPK A fusion constructs revealed that the protein's tail domain was necessary and sufficient to evoke mitochondrial clustering behavior.

Conclusion/Significance

Our data suggest that the expression level of the DMPK A isoform needs to be tightly controlled in cells where the hDMPK gene is expressed. We speculate that aberrant splice isoform expression might be a codetermining factor in manifestation of specific DM1 features in patients.

5.

Background

In prokaryotes and some eukaryotes, genetic material can be transferred laterally among unrelated lineages and recombined into new host genomes, providing metabolic and physiological novelty. Although the process is usually framed in terms of gene sharing (e.g. lateral gene transfer, LGT), there is little reason to imagine that the units of transfer and recombination correspond to entire, intact genes. Proteins often consist of one or more spatially compact structural regions (domains) which may fold autonomously and which, singly or in combination, confer the protein's specific functions. As LGT is frequent in strongly selective environments and natural selection is based on function, we hypothesized that domains might also serve as modules of genetic transfer, i.e. that regions of DNA that are transferred and recombined between lineages might encode intact structural domains of proteins.

Methodology/Principal Findings

We selected 1,462 orthologous gene sets representing 144 prokaryotic genomes, and applied a rigorous two-stage approach to identify recombination breakpoints within these sequences. Recombination breakpoints are very significantly over-represented in gene sets within which protein domain-encoding regions have been annotated. Within these gene sets, breakpoints significantly avoid the domain-encoding regions (domons), except where these regions constitute most of the sequence length. Recombination breakpoints that fall within longer domons are distributed uniformly at random, but those that fall within shorter domons may show a slight tendency to avoid the domon midpoint. As we find no evidence for differential selection against nucleotide substitutions following the recombination event, any bias against disruption of domains must be a consequence of the recombination event per se.

Conclusions/Significance

This is the first systematic study relating the units of LGT to structural features at the protein level. Many genes have been interrupted by recombination following inter-lineage genetic transfer, during which the regions within these genes that encode protein domains have not been preferentially preserved intact. Protein domains are units of function, but domons are not modules of transfer and recombination. Our results demonstrate that LGT can remodel even the most functionally conservative modules within genomes.

6.

Background

The function of a protein can be deciphered with higher accuracy from its structure than from its amino acid sequence. Because of the huge gap between the available protein sequence space and structural space, tools that can generate functionally homogeneous clusters using only sequence information are of great importance. For this purpose, traditional alignment-based tools work well in most cases, and clustering is performed on the basis of sequence similarity. However, in the case of multi-domain proteins, the alignment quality might be poor due to varied lengths of the proteins, domain shuffling or circular permutations. Multi-domain proteins are ubiquitous in nature; hence alignment-free tools, which overcome the shortcomings of alignment-based protein comparison methods, are required. Further, existing tools classify proteins using only domain-level information and hence miss the information encoded in the tethered regions or accessory domains. Our method, on the other hand, takes into account the full-length sequence of a protein, consolidating the complete sequence information to understand a given protein better.

Results

Our web server, CLAP (Classification of Proteins), is one such alignment-free tool for automatic classification of protein sequences. It uses a pattern-matching algorithm that assigns local matching scores (LMS) to residues that are part of the matched patterns between the two sequences being compared. CLAP works on full-length sequences and does not require prior domain definitions. Pilot studies undertaken previously on protein kinases and immunoglobulins have shown that CLAP yields clusters that have high functional and domain architectural similarity. Moreover, parsing at a statistically determined cut-off resulted in clusters that agreed with the sub-family level classification of that particular domain family.

Conclusions

CLAP is a useful protein-clustering tool, independent of domain assignment, domain order, sequence length and domain diversity. Our method can be used for any set of protein sequences, yielding functionally relevant clusters with high domain architectural homogeneity. The CLAP web server is freely available for academic use at http://nslab.mbu.iisc.ernet.in/clap/.

7.

Background

Vitamins are typical ligands that play critical roles in various metabolic processes. The accurate identification of the vitamin-binding residues solely based on a protein sequence is of significant importance for the functional annotation of proteins, especially in the post-genomic era, when large volumes of protein sequences are accumulating quickly without being functionally annotated.

Results

In this paper, a new predictor called TargetVita is designed and implemented for predicting protein-vitamin binding residues using protein sequences. In TargetVita, features derived from the position-specific scoring matrix (PSSM), predicted protein secondary structure, and vitamin binding propensity are combined to form the original feature space; then, several feature subspaces are selected by performing different feature selection methods. Finally, based on the selected feature subspaces, heterogeneous SVMs are trained and then ensembled for performing prediction.

Conclusions

The experimental results obtained with four separate vitamin-binding benchmark datasets demonstrate that the proposed TargetVita is superior to the state-of-the-art vitamin-specific predictor, and an average improvement of 10% in terms of the Matthews correlation coefficient (MCC) was achieved over independent validation tests. The TargetVita web server and the datasets used are freely available for academic use at http://csbio.njust.edu.cn/bioinf/TargetVita or http://www.csbio.sjtu.edu.cn/bioinf/TargetVita.
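For reference, the Matthews correlation coefficient used above as the comparison metric can be computed from a binary confusion matrix as in the sketch below. The function name `mcc` and the convention of returning 0.0 when the denominator is zero are choices made for this illustration; they are not taken from the TargetVita paper.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    tp, tn, fp, fn: true positives, true negatives, false positives,
    false negatives. Returns a value in [-1, 1]; 0.0 is returned when
    the denominator is zero (a common convention, assumed here).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator
```

Because binding residues are usually far rarer than non-binding residues, MCC is often preferred over plain accuracy for this kind of task: a predictor that labels every residue as non-binding scores 0 rather than a misleadingly high accuracy.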

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-297) contains supplementary material, which is available to authorized users.

8.

Background

Voids and cavities in the native protein structure determine the pressure unfolding of proteins. In addition, the volume changes due to the interaction of newly exposed atoms with solvent upon protein unfolding also contribute to the pressure unfolding of proteins. Quantitative understanding of these effects is important for predicting and designing proteins with predefined response to changes in hydrostatic pressure using computational approaches. The molecular surface volume is a useful metric that describes contribution of geometrical volume, which includes van der Waals volume and volume of the voids, to the total volume of a protein in solution, thus isolating the effects of hydration for separate calculations.

Results

We developed ProteinVolume, a highly robust and easy-to-use tool to compute geometric volumes of proteins. ProteinVolume generates the molecular surface of a protein and uses an innovative flood-fill algorithm to calculate the individual components of the molecular surface volume, van der Waals and intramolecular void volumes. ProteinVolume is user friendly and is available as a web-server or a platform-independent command-line version.
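To illustrate the general idea of using a flood fill to separate internal voids from the exterior, here is a simplified voxel-grid sketch. It is not ProteinVolume's algorithm, which works on the molecular surface and is not described in enough detail here to reproduce; the boolean grid representation, the six-neighbour connectivity, and the function name `count_void_voxels` are assumptions for illustration only.

```python
from collections import deque


def count_void_voxels(occupied):
    """Count empty voxels that cannot be reached from the grid border.

    occupied[x][y][z] is True where a voxel lies inside an atom.
    Empty voxels connected to the border are treated as exterior solvent;
    the remaining empty voxels are internal voids.
    """
    nx, ny, nz = len(occupied), len(occupied[0]), len(occupied[0][0])
    seen = [[[False] * nz for _ in range(ny)] for _ in range(nx)]
    queue = deque()

    # Seed the fill with every empty voxel on the outer faces of the grid.
    for x in range(nx):
        for y in range(ny):
            for z in range(nz):
                on_border = (x in (0, nx - 1) or y in (0, ny - 1)
                             or z in (0, nz - 1))
                if on_border and not occupied[x][y][z]:
                    seen[x][y][z] = True
                    queue.append((x, y, z))

    # Breadth-first flood fill through face-adjacent empty voxels.
    steps = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in steps:
            a, b, c = x + dx, y + dy, z + dz
            if (0 <= a < nx and 0 <= b < ny and 0 <= c < nz
                    and not occupied[a][b][c] and not seen[a][b][c]):
                seen[a][b][c] = True
                queue.append((a, b, c))

    # Anything empty and never reached is enclosed by occupied voxels.
    return sum(
        1
        for x in range(nx) for y in range(ny) for z in range(nz)
        if not occupied[x][y][z] and not seen[x][y][z]
    )
```

Multiplying the returned count by the cube of the voxel edge length gives an approximate void volume; the count of occupied voxels gives the corresponding approximation of the van der Waals volume on the same grid.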

Conclusions

ProteinVolume is a highly accurate and fast application for interrogating geometric volumes of proteins. ProteinVolume is a free web server available at http://gmlab.bio.rpi.edu. A standalone, platform-independent, Java-based ProteinVolume executable is also freely available at this web site.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0531-2) contains supplementary material, which is available to authorized users.

9.

Background

Various methods have been developed to computationally predict hotspot residues at novel protein-protein interfaces. However, obtaining accurate predictions remains challenging. We have developed a novel method which uses different aspects of protein structure and sequence space at the residue level to highlight interface residues crucial for protein-protein complex formation.

Results

The ECMIS (Energetic Conservation Mass Index and Spatial Clustering) algorithm outperformed existing hotspot identification methods, achieving around 80% accuracy with a marked increase in sensitivity. The method is sensitive even to hotspot residues that contribute only small-scale hydrophobic interactions.

Conclusion

The combination of diverse features of the protein, namely energy contribution, extent of conservation, location and surrounding environment, along with an optimized weight for each feature, was the key to the success of the algorithm. The academic version of the algorithm is available at http://caps.ncbs.res.in/download/ECMIS/ECMIS.zip.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-303) contains supplementary material, which is available to authorized users.

10.

Background

Protein-protein interactions are critical to elucidating the role played by individual proteins in important biological pathways. Of particular interest are hub proteins that can interact with large numbers of partners and often play essential roles in cellular control. Depending on the number of binding sites, protein hubs can be classified at a structural level as singlish-interface hubs (SIH) with one or two binding sites, or multiple-interface hubs (MIH) with three or more binding sites. In terms of kinetics, hub proteins can be classified as date hubs (i.e., interact with different partners at different times or locations) or party hubs (i.e., simultaneously interact with multiple partners).

Methodology

Our approach works in three phases: Phase I classifies whether a protein is likely to bind to another protein. Phase II determines whether a protein-binding (PB) protein is a hub. Phase III classifies PB proteins as singlish-interface versus multiple-interface hubs and as date versus party hubs. At each stage, we use sequence-based predictors trained using several standard machine learning techniques.

Conclusions

Our method is able to predict whether a protein is a protein-binding protein with an accuracy of 94% and a correlation coefficient of 0.87; distinguish hubs from non-hubs with 100% accuracy for 30% of the data; distinguish date hubs from party hubs with 69% accuracy and an area under the ROC curve of 0.68; and distinguish SIH from MIH with 89% accuracy and an area under the ROC curve of 0.84. Because our method is based on sequence information alone, it can be used to obtain useful insights into the functional and evolutionary characteristics of proteins and their interactions even in settings where reliable protein-protein interaction data or structures of protein-protein complexes are unavailable.

Availability

We provide a web server for our three-phase approach: http://hybsvm.gdcb.iastate.edu.

11.
12.

Background

The serine/threonine mammalian Ste-20 like kinases (MSTs) are key regulators of apoptosis, cellular proliferation as well as polarization. Deregulation of MSTs has been associated with disease progression in prostate and colorectal cancer. The four human MSTs are regulated differently by C-terminal regions flanking the catalytic domains.

Principal Findings

We have determined the crystal structure of the kinase domain of MST4 in complex with an ATP-mimetic inhibitor. This is the first structure of an inactive conformation of a member of the MST kinase family. Comparison with active structures of MST3 and MST1 revealed a dimeric association of MST4, suggesting an activation-loop exchange mechanism of MST4 auto-activation. Together with a homology model of MST2, we provide a comparative analysis of the kinase domains of all four members of the human MST family.

Significance

The comparative analysis identified new structural features in the MST ATP binding pocket and has also defined the mechanism for autophosphorylation. Both structural features may be further explored for inhibitor design.

Enhanced version

This article can also be viewed as an enhanced version in which the text of the article is integrated with interactive 3D representations and animated transitions. Please note that a web plugin is required to access this enhanced functionality. Instructions for the installation and use of the web plugin are available in Text S1.

13.

Background

Proteomic studies of respiratory disorders have the potential to identify protein biomarkers for diagnosis and disease monitoring. Utilisation of sensitive quantitative proteomic methods creates opportunities to determine individual patient proteomes. The aim of the current study was to determine if quantitative proteomics of bronchial biopsies from asthmatics can distinguish relevant biological functions and whether inhaled glucocorticoid treatment affects these functions.

Methods

Endobronchial biopsies were taken from untreated asthmatic patients (n = 12) and healthy controls (n = 3). Asthmatic patients were randomised to double blind treatment with either placebo or budesonide (800 μg daily for 3 months) and new biopsies were obtained. Proteins extracted from the biopsies were digested and analysed using isobaric tags for relative and absolute quantitation combined with a nanoLC-LTQ Orbitrap mass spectrometer. Spectra obtained were used to identify and quantify proteins. Pathways analysis was performed using Ingenuity Pathway Analysis to identify significant biological pathways in asthma and determine how the expression of these pathways was changed by treatment.

Results

More than 1800 proteins were identified and quantified in the bronchial biopsies of subjects. The pathway analysis revealed acute phase response signalling, cell-to-cell signalling and tissue development associations with proteins expressed in asthmatics compared to controls. The functions and pathways associated with placebo and budesonide treatment showed distinct differences, including the decreased association with acute phase proteins as a result of budesonide treatment compared to placebo.

Conclusions

Proteomic analysis of bronchial biopsy material can be used to identify and quantify proteins using highly sensitive technologies, without the need for pooling of samples from several patients. Distinct pathophysiological features of asthma can be identified using this approach and the expression of these features is changed by inhaled glucocorticoid treatment. Quantitative proteomics may be applied to identify mechanisms of disease that may assist in the accurate and timely diagnosis of asthma.

Trial registration

ClinicalTrials.gov registration NCT01378039

14.
15.
16.

Background

Pathogenic bacteria infecting animals as well as plants use various mechanisms to transport virulence factors across their cell membranes and channel these proteins into the infected host cell. The type III secretion system represents such a mechanism. Proteins transported via this pathway (“effector proteins”) have to be distinguished from all other proteins that are not exported from the bacterial cell. Although a special targeting signal at the N-terminal end of effector proteins has been proposed in the literature, its exact characteristics remain unknown.

Methodology/Principal Findings

In this study, we demonstrate that the signals encoded in the sequences of type III secretion system effectors can be consistently recognized and predicted by machine learning techniques. Known protein effectors were compiled from the literature and sequence databases, and served as training data for artificial neural networks and support vector machine classifiers. Common sequence features were most pronounced in the first 30 amino acids of the effector sequences. Classification yielded a cross-validated Matthews correlation of 0.63 and allowed for genome-wide prediction of potential type III secretion system effectors in 705 proteobacterial genomes (12% predicted candidate proteins), their chromosomes (11%) and plasmids (13%), as well as 213 Firmicute genomes (7%).

Conclusions/Significance

We present a signal prediction method together with a comprehensive survey of potential type III secretion system effectors extracted from 918 published bacterial genomes. Our study demonstrates that the analyzed signal features are common across a wide range of species, and provides a substantial basis for the identification of exported pathogenic proteins as targets for future therapeutic intervention. The prediction software is publicly accessible from our web server (www.modlab.org).

17.

Background

Computational prediction of protein interactions typically uses protein domains as classifier features because they capture conserved information about interaction surfaces. However, approaches relying on domains as features cannot be applied to proteins without any domain information. In this paper, we explore the contribution of pure amino acid composition (AAC) to protein interaction prediction. This simple feature, which is based on normalized counts of single amino acids or pairs of amino acids, is applicable to proteins from any sequenced organism and can be used to compensate for the lack of domain information.
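The AAC feature described above, normalized counts of single amino acids or adjacent pairs, might be computed as in the following sketch. The details are assumptions for illustration: the abstract does not state how non-standard residues are handled, whether pairs are adjacent, or how the vectors of two proteins are combined for a classifier, and the names `aac_single` and `aac_pairs` are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac_single(sequence):
    """Return 20 normalized single-residue frequencies (order of AMINO_ACIDS)."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    total = 0
    for residue in sequence.upper():
        if residue in counts:  # skip non-standard symbols (assumption)
            counts[residue] += 1
            total += 1
    return [counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS]


def aac_pairs(sequence):
    """Return 400 normalized frequencies of adjacent residue pairs."""
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    total = 0
    sequence = sequence.upper()
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    return [counts[pair] / total if total else 0.0 for pair in counts]
```

Both functions return fixed-length vectors regardless of sequence length, which is what lets the feature be applied to any protein, including those with no annotated domains.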

Results

AAC performed on par with domain-based protein interaction prediction on three yeast protein interaction datasets. Similar behavior was obtained using different classifiers, indicating that our results are a function of features and not of classifiers. In addition to yeast datasets, AAC performed comparably on worm and fly datasets. Prediction of interactions for the entire yeast proteome identified a large number of novel interactions, the majority of which co-localized or participated in the same processes. Our high confidence interaction network included both well-studied and uncharacterized proteins. Proteins with known function were involved in actin assembly and cell budding. Uncharacterized proteins interacted with proteins involved in reproduction and cell budding, thus providing putative biological roles for the uncharacterized proteins.

Conclusion

AAC is a simple yet powerful feature for predicting protein interactions, and can be used alone or in conjunction with protein domains to predict new interactions and validate existing ones. More importantly, AAC alone performs on par with existing, but more complex, features, indicating the presence of sequence-level information that is predictive of interaction but is not necessarily restricted to domains.

18.

Background

Whey proteins have insulinogenic properties and the effect appears to originate from a specific postprandial plasma amino acid pattern. The insulinogenic effect can be mimicked by a specific mixture of the five amino acids iso, leu, lys, thr and val.

Objective

The objective was to evaluate the efficacy of pre-meal boluses of whey or soy protein with or without added amino acids on glycaemia, insulinemia as well as on plasma responses of incretins and amino acids at a subsequent composite meal. Additionally, plasma ghrelin and subjective appetite responses were studied.

Design

In randomized order, fourteen healthy volunteers were served a standardized composite ham sandwich meal either with water (250 ml) provided during the course of the meal, or with one of several pre-meal protein drinks (PMPD) (100 ml provided as a bolus) and additional water (150 ml) served with the meal. The PMPDs contained 9 g protein and were based on either whey or soy protein isolates, with or without addition of the five amino acids (iso, leu, lys, thr and val) or the five amino acids + arg.

Results

All PMPD meals significantly reduced the incremental area under the curve (iAUC) for the plasma glucose response during the first 60 min. All whey-based PMPD meals displayed lower glycemic indices compared to the reference meal. There were no significant differences for the insulinemic indices. The early insulin response (iAUC 0–15 min) correlated positively with plasma amino acids, GIP and GLP-1 as well as with the glycemic profile. Additionally, inverse correlations were found between insulin iAUC 0–15 min and the glucose peak.
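As a rough illustration of the iAUC measure reported above, the sketch below applies the trapezoidal rule to the increments above the fasting (time zero) value. This is a simplification and an assumption: the study's exact iAUC procedure is not given here, and published variants differ in how they treat values that dip below baseline. In this sketch such values are simply clipped to zero, and the function name `incremental_auc` is hypothetical.

```python
def incremental_auc(times, values):
    """Approximate incremental area under the curve above baseline.

    times: sampling times (e.g. minutes), in increasing order.
    values: measurements at those times; values[0] is taken as baseline.
    Increments below baseline are clipped to zero before integration.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two matching time/value points")
    baseline = values[0]
    increments = [max(v - baseline, 0.0) for v in values]
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += width * (increments[i] + increments[i - 1]) / 2.0
    return area
```

For a glucose curve sampled at 0, 30 and 60 min with values 5, 7 and 5 mmol/l, the increments are 0, 2 and 0, so the two trapezoids contribute 30 each and the result is 60 mmol·min/l.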

Conclusion

The data suggest that a pre-meal drink containing specific proteins/amino acids significantly reduces postprandial glycemia following a composite meal, in the absence of elevated insulinemic excursions. An early phase insulinemic response induced by plasma amino acids and incretins appears to mediate the effect.

Trial Registration

ClinicalTrials.gov NCT01586780

19.

Background

The need to create controlled vocabularies such as ontologies for knowledge organization and access has been widely recognized in various domains. Despite the indispensable need of thorough domain knowledge in ontology construction, most software tools for ontology construction are designed for knowledge engineers and not for domain experts to use. The differences in the opinions of different domain experts and in the terminology usages in source literature are rarely addressed by existing software.

Methods

OTO software was developed based on the Agile principles. Through iterations of software release and user feedback, new features are added and existing features modified to make the tool more intuitive and efficient to use for small and large data sets. The software is open source and built in Java.

Results

Ontology Term Organizer (OTO; http://biosemantics.arizona.edu/OTO/) is a user-friendly, web-based, consensus-promoting, open source application for organizing domain terms by dragging and dropping terms to appropriate locations. The application is designed for users with specific domain knowledge, such as biology, but without in-depth ontology construction skills. Specifically, OTO can be used to establish is_a, part_of, synonym, and order relationships among terms in any domain, reflecting the terminology usage in the source literature and the opinions of multiple experts. The organized terms may be fed into formal ontologies to boost their coverage. All datasets organized on OTO are publicly available.

Conclusion

OTO has been used to organize the terms extracted from thirty volumes of Flora of North America and Flora of China combined, in addition to some smaller datasets of different taxon groups. User feedback indicates that the tool is efficient and user friendly. Being open source software, the application can be modified to fit varied term organization needs for different domains.

20.
Liu X  Liu B  Huang Z  Shi T  Chen Y  Zhang J 《PloS one》2012,7(1):e30938

Background

The molecular network sustained by different types of interactions among proteins is widely manifested as the fundamental driving force of cellular operations. Many biological functions are determined by the crosstalk between proteins rather than by the characteristics of their individual components. Thus, the searches for protein partners in global networks are imperative when attempting to address the principles of biology.

Results

We have developed a web-based tool, “Sequence-based Protein Partners Search” (SPPS), to explore interacting partners of proteins by searching over a large repertoire of proteins across many species. SPPS provides a database containing more than 60,000 annotated protein sequences and a protein-partner search engine with two modes (Single Query and Multiple Query). In this study, two proteins that interact with human FBXO6 were found using the service. In addition, users can refine potential protein partner hits by using annotations and the possible interaction network in the SPPS web server.

Conclusions

SPPS provides a new type of tool to facilitate the identification of direct or indirect protein partners which may guide scientists on the investigation of new signaling pathways. The SPPS server is available to the public at http://mdl.shsmu.edu.cn/SPPS/.
