Found 20 similar articles; search took 31 ms
1.
Ines Wagner Michael Volkmer Malvika Sharan Jose M Villaveces Felix Oswald Vineeth Surendranath Bianca H Habermann 《BMC bioinformatics》2014,15(1)
Background
Searching the orthologs of a given protein or DNA sequence is one of the most important and most commonly used Bioinformatics methods in Biology. Programs like BLAST or the orthology search engine Inparanoid can be used to find orthologs when the similarity between two sequences is sufficiently high. They however fail when the level of conservation is low. The detection of remotely conserved proteins oftentimes involves sophisticated manual intervention that is difficult to automate.
Results
Here, we introduce morFeus, a search program to find remotely conserved orthologs. Based on relaxed sequence similarity searches, morFeus selects sequences based on the similarity of their alignments to the query, tests for orthology by iterative reciprocal BLAST searches and calculates a network score for the resulting network of orthologs that is a measure of orthology independent of the E-value. Detecting remotely conserved orthologs of a protein using morFeus thus requires no manual intervention. We demonstrate the performance of morFeus by comparing it to state-of-the-art orthology resources and methods. We provide an example of remotely conserved orthologs, which were experimentally shown to be functionally equivalent in the respective organisms and therefore meet the criteria of the orthology-function conjecture.
Conclusions
Based on our results, we conclude that morFeus is a powerful and specific search method for detecting remotely conserved orthologs. morFeus is freely available at http://bio.biochem.mpg.de/morfeus/. Its source code is available from Sourceforge.net (https://sourceforge.net/p/morfeus/).
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2105-15-263) contains supplementary material, which is available to authorized users.
2.
Background
Popular bioinformatics approaches for studying protein functional dynamics include comparisons of crystallographic structures, molecular dynamics simulations and normal mode analysis. However, determining how observed displacements and predicted motions from these traditionally separate analyses relate to each other, as well as to the evolution of sequence, structure and function within large protein families, remains a considerable challenge. This is in part due to the general lack of tools that integrate information of molecular structure, dynamics and evolution.
Results
Here, we describe the integration of new methodologies for evolutionary sequence, structure and simulation analysis into the Bio3D package. This major update includes unique high-throughput normal mode analysis for examining and contrasting the dynamics of related proteins with non-identical sequences and structures, as well as new methods for quantifying dynamical couplings and their residue-wise dissection from correlation network analysis. These new methodologies are integrated with major biomolecular databases as well as established methods for evolutionary sequence and comparative structural analysis. New functionality for directly comparing results derived from normal modes, molecular dynamics and principal component analysis of heterogeneous experimental structure distributions is also included. We demonstrate these integrated capabilities with example applications to dihydrofolate reductase and heterotrimeric G-protein families along with a discussion of the mechanistic insight provided in each case.
Conclusions
The integration of structural dynamics and evolutionary analysis in Bio3D enables researchers to go beyond a prediction of single protein dynamics to investigate dynamical features across large protein families. The Bio3D package is distributed with full source code and extensive documentation as a platform independent R package under a GPL2 license from http://thegrantlab.org/bio3d/.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-014-0399-6) contains supplementary material, which is available to authorized users.
3.
Lei Li Hin-chung Wong Wenyan Nong Man Kit Cheung Patrick Tik Wan Law Kai Man Kam Hoi Shan Kwan 《BMC genomics》2014,15(1)
Background
Vibrio parahaemolyticus is a Gram-negative halophilic bacterium. Infections with the bacterium could become systemic and can be life-threatening to immunocompromised individuals. Genome sequences of a few clinical isolates of V. parahaemolyticus are currently available, but the genome dynamics across the species and virulence potential of environmental strains on a genome-scale have not been described before.
Results
Here we present genome sequences of four V. parahaemolyticus clinical strains from stool samples of patients and five environmental strains in Hong Kong. Phylogenomics analysis based on single nucleotide polymorphisms revealed a clear distinction between the clinical and environmental isolates. A new gene cluster belonging to the biofilm associated proteins of V. parahaemolyticus was found in clinical strains. In addition, a novel small genomic island frequently found among clinical isolates was reported. A few environmental strains were found harboring virulence genes and prophage elements, indicating their virulence potential. A unique biphenyl degradation pathway was also reported. A database for V. parahaemolyticus (http://kwanlab.bio.cuhk.edu.hk/vp) was constructed here as a platform to access and analyze genome sequences and annotations of the bacterium.
Conclusions
We have performed a comparative genomics analysis of clinical and environmental strains of V. parahaemolyticus. Our analyses could facilitate understanding of the phylogenetic diversity and niche adaptation of this bacterium.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-1135) contains supplementary material, which is available to authorized users.
4.
Motivation
Type III Secretion Systems (T3SSs) play important roles in the interaction between gram-negative bacteria and their hosts. T3SSs function by translocating a group of bacterial effector proteins into the host cytoplasm. The details of specific type III secretion process are yet to be clarified. This research focused on comparing the amino acid composition within the N-terminal 100 amino acids from type III secretion (T3S) signal sequences or non-T3S proteins, specifically whether each residue exerts a constraint on residues found in adjacent positions. We used these comparisons to set up a statistical model to quantitatively model and effectively distinguish T3S effectors.
Results
In this study, the amino acid composition (Aac) probability profiles conditional on its sequentially preceding position and corresponding amino acids were compared between N-terminal sequences of T3S and non-T3S proteins. The profiles are generally different. A Markov model, namely T3_MM, was consequently designed to calculate the total Aac conditional probability difference, i.e., the likelihood ratio of a sequence being a T3S or a non-T3S protein. With T3_MM, known T3S and non-T3S proteins were found to well approximate two distinct normal distributions. The model could distinguish validated T3S and non-T3S proteins with a 5-fold cross-validation sensitivity of 83.9% at a specificity of 90.3%. T3_MM was also shown to be more robust, accurate, simple, and statistically quantitative, when compared with other T3S protein prediction models. The high effectiveness of T3_MM also indicated the overall Aac difference between N-termini of T3S and non-T3S proteins, and the constraint of Aac exerted by its preceding position and corresponding Aac.
Availability
An R package for T3_MM is freely downloadable from: http://biocomputer.bio.cuhk.edu.hk/softwares/T3_MM. T3_MM web server: http://biocomputer.bio.cuhk.edu.hk/T3DB/T3_MM.php.
5.
Kersten Villringer Ulrike Grittner Lars-Arne Schaafs Christian H. Nolte Heinrich Audebert Jochen B. Fiebach 《PloS one》2014,9(10)
Background
There is an ongoing debate whether stroke patients presenting with minor or moderate symptoms benefit from thrombolysis. Up until now, stroke severity on admission is typically measured with the NIHSS, and subsequently used for treatment decision.
Hypothesis
Acute MRI lesion volume assessment can aid in therapy decision for iv-tPA in minor stroke.
Methods
We analysed 164 patients with NIHSS 0–7 from a prospective stroke MRI registry, the 1000+ study (clinicaltrials.org NCT00715533). Patients were examined in a 3 T MRI scanner and either received (n = 62) or did not receive thrombolysis (n = 102). DWI (diffusion weighted imaging) and PI (perfusion imaging) at admission were evaluated for diffusion - perfusion mismatch. Our primary outcome parameter was final lesion volume, defined by lesion volume on day 6 FLAIR images.
Results
The association between t-PA and FLAIR lesion volume on day 6 was significantly different for patients with smaller DWI volume compared to patients with larger DWI volume (interaction between DWI and t-PA: p = 0.021). Baseline DWI lesion volume was dichotomized at the median (0.7 ml): final lesion volume at day 6 was larger in patients with large baseline DWI volumes without t-PA treatment (median difference 3, IQR −0.4–9.3 ml). Conversely, in patients with larger baseline DWI volumes final lesion volumes were smaller after t-PA treatment (median difference 0, IQR −4.1–5 ml). However, this did not translate into a significant difference in the mRS at day 90 (p = 0.577).
Conclusion
Though this study is only hypothesis generating considering the number of cases, we believe that the size of DWI lesion volume may support therapy decision in patients with minor stroke.
Trial Registration
Clinicaltrials.org NCT00715533
6.
7.
Background
Vitamins are typical ligands that play critical roles in various metabolic processes. The accurate identification of the vitamin-binding residues solely based on a protein sequence is of significant importance for the functional annotation of proteins, especially in the post-genomic era, when large volumes of protein sequences are accumulating quickly without being functionally annotated.
Results
In this paper, a new predictor called TargetVita is designed and implemented for predicting protein-vitamin binding residues using protein sequences. In TargetVita, features derived from the position-specific scoring matrix (PSSM), predicted protein secondary structure, and vitamin binding propensity are combined to form the original feature space; then, several feature subspaces are selected by performing different feature selection methods. Finally, based on the selected feature subspaces, heterogeneous SVMs are trained and then ensembled for performing prediction.
Conclusions
The experimental results obtained with four separate vitamin-binding benchmark datasets demonstrate that the proposed TargetVita is superior to the state-of-the-art vitamin-specific predictor, and an average improvement of 10% in terms of the Matthews correlation coefficient (MCC) was achieved over independent validation tests. The TargetVita web server and the datasets used are freely available for academic use at http://csbio.njust.edu.cn/bioinf/TargetVita or http://www.csbio.sjtu.edu.cn/bioinf/TargetVita.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2105-15-297) contains supplementary material, which is available to authorized users.
8.
Background
DAVID is the most popular tool for interpreting large lists of gene/proteins classically produced in high-throughput experiments. However, the use of the DAVID website becomes difficult when analyzing multiple gene lists, for it does not provide an adequate visualization tool to show/compare multiple enrichment results in a concise and informative manner.
Result
We implemented a new R-based graphical tool, BACA (Bubble chArt to Compare Annotations), which uses the DAVID web service for cross-comparing enrichment analysis results derived from multiple large gene lists. BACA is implemented in R and is freely available at the CRAN repository (http://cran.r-project.org/web/packages/BACA/).
Conclusion
The package BACA allows R users to combine multiple annotation charts into one output graph, bypassing the DAVID website.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-015-0477-4) contains supplementary material, which is available to authorized users.
9.
Background
DNA-binding proteins are vital for the study of cellular processes. In recent genome engineering studies, the identification of proteins with certain functions has become increasingly important and needs to be performed rapidly and efficiently. In previous years, several approaches have been developed to improve the identification of DNA-binding proteins. However, the currently available resources are insufficient to accurately identify these proteins. Because of this, the previous research has been limited by the relatively unbalanced accuracy rate and the low identification success of the current methods.
Results
In this paper, we explored the practicality of modelling DNA binding identification and simultaneously employed an ensemble classifier, and a new predictor (nDNA-Prot) was designed. The presented framework is comprised of two stages: a 188-dimension feature extraction method to obtain the protein structure and an ensemble classifier designated as imDC. Experiments using different datasets showed that our method is more successful than the traditional methods in identifying DNA-binding proteins. The identification was conducted using a feature that selected the minimum Redundancy and Maximum Relevance (mRMR). An accuracy rate of 95.80% and an Area Under the Curve (AUC) value of 0.986 were obtained in a cross validation. A test dataset was tested in our method and resulted in an 86% accuracy, versus a 76% using iDNA-Prot and a 68% accuracy using DNA-Prot.
Conclusions
Our method can help to accurately identify DNA-binding proteins, and the web server is accessible at http://datamining.xmu.edu.cn/~songli/nDNA. In addition, we also predicted possible DNA-binding protein sequences in all of the sequences from the UniProtKB/Swiss-Prot database.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2105-15-298) contains supplementary material, which is available to authorized users.
10.
Sandhya P Tiwari Edvin Fuglebakk Siv M Hollup Lars Skjærven Tristan Cragnolini Svenn H Grindhaug Kidane M Tekle Nathalie Reuter 《BMC bioinformatics》2014,15(1)
Background
Normal mode analysis (NMA) using elastic network models is a reliable and cost-effective computational method to characterise protein flexibility and by extension, their dynamics. Further insight into the dynamics–function relationship can be gained by comparing protein motions between protein homologs and functional classifications. This can be achieved by comparing normal modes obtained from sets of evolutionary related proteins.
Results
We have developed an automated tool for comparative NMA of a set of pre-aligned protein structures. The user can submit a sequence alignment in the FASTA format and the corresponding coordinate files in the Protein Data Bank (PDB) format. The computed normalised squared atomic fluctuations and atomic deformation energies of the submitted structures can be easily compared on graphs provided by the web user interface. The web server provides pairwise comparison of the dynamics of all proteins included in the submitted set using two measures: the Root Mean Squared Inner Product and the Bhattacharyya Coefficient. The Comparative Analysis has been implemented on our web server for NMA, WEBnm@, which also provides recently upgraded functionality for NMA of single protein structures. This includes new visualisations of protein motion, visualisation of inter-residue correlations and the analysis of conformational change using the overlap analysis. In addition, programmatic access to WEBnm@ is now available through a SOAP-based web service. WEBnm@ is available at http://apps.cbu.uib.no/webnma.
Conclusion
WEBnm@ v2.0 is an online tool offering unique capability for comparative NMA on multiple protein structures. Along with a convenient web interface, powerful computing resources, and several methods for mode analyses, WEBnm@ facilitates the assessment of protein flexibility within protein families and superfamilies. These analyses can give a good view of how the structures move and how the flexibility is conserved over the different structures.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-014-0427-6) contains supplementary material, which is available to authorized users.
11.
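The Root Mean Squared Inner Product (RMSIP) named in the WEBnm@ abstract (item 10 above) is commonly defined, for two sets of n modes, as the square root of the summed squared dot products between every pair of modes, divided by n. The following is a minimal sketch of that standard definition and is not WEBnm@'s own code. The function names `dot` and `rmsip` and the toy vectors are illustrative assumptions, and the modes are assumed to be orthonormal and already restricted to the aligned positions.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rmsip(modes_a, modes_b):
    """Root Mean Squared Inner Product between two sets of n modes.

    Assumes both sets hold the same number of orthonormal mode vectors
    defined over the same (aligned) coordinates. Returns a value in [0, 1].
    """
    n = len(modes_a)
    if n == 0 or len(modes_b) != n:
        raise ValueError("both mode sets must be non-empty and equally sized")
    total = sum(dot(u, v) ** 2 for u in modes_a for v in modes_b)
    return math.sqrt(total / n)


# Toy three-dimensional unit vectors (hypothetical, for illustration only).
e1 = [1.0, 0.0, 0.0]
e2 = [0.0, 1.0, 0.0]
e3 = [0.0, 0.0, 1.0]

same_subspace = rmsip([e1, e2], [e1, e2])    # identical subspaces
half_overlap = rmsip([e1, e2], [e1, e3])     # one shared direction out of two
no_overlap = rmsip([e1], [e3])               # orthogonal subspaces
```

A value near 1 indicates that the two mode sets span nearly the same subspace, and a value near 0 indicates nearly orthogonal subspaces. The Bhattacharyya Coefficient mentioned alongside it is not sketched here.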
12.
13.
Prashant Shingate Malini Manoharan Anshul Sukhwal Ramanathan Sowdhamini 《BMC bioinformatics》2014,15(1)
Background
Various methods have been developed to computationally predict hotspot residues at novel protein-protein interfaces. However, there are various challenges in obtaining accurate prediction. We have developed a novel method which uses different aspects of protein structure and sequence space at residue level to highlight interface residues crucial for the protein-protein complex formation.
Results
The ECMIS (Energetic Conservation Mass Index and Spatial Clustering) algorithm was able to outperform existing hotspot identification methods. It achieved around 80% accuracy with a large increase in sensitivity. This method is even sensitive towards the hotspot residues contributing only small-scale hydrophobic interactions.
Conclusion
Combination of diverse features of the protein viz. energy contribution, extent of conservation, location and surrounding environment, along with optimized weightage for each feature, was the key for the success of the algorithm. The academic version of the algorithm is available at http://caps.ncbs.res.in/download/ECMIS/ECMIS.zip.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2105-15-303) contains supplementary material, which is available to authorized users.
14.
Patricia A Cassano Kristin A Guertin Alan R Kristal Kathryn E Ritchie Monica L Bertoia Kathryn B Arnold John J Crowley JoAnn Hartline Phyllis J Goodman Catherine M Tangen Lori M Minasian Scott M Lippman Eric Klein 《Respiratory research》2015,16(1)
Background
The intake of nutrients with antioxidant properties is hypothesized to augment antioxidant defenses, decrease oxidant damage to tissues, and attenuate age-related rate of decline in lung function. The objective was to determine whether long-term intervention with selenium and/or vitamin E supplements attenuates the annual rate of decline in lung function, particularly in cigarette smokers.
Methods
The Respiratory Ancillary Study (RAS) tested the single and joint effects of selenium (200 μg/d L-selenomethionine) and vitamin E (400 IU/day all rac-α-tocopheryl acetate) in a randomized double-blind placebo-controlled trial. At the end of the intervention, 1,641 men had repeated pulmonary function tests separated by an average of 3 years. Linear mixed-effects regression models estimated the effect of intervention on annual rate of decline in lung function.
Results
Compared to placebo, intervention had no main effect on either forced expiratory volume in the first second (FEV1) or forced expiratory flow (FEF25–75). There was no evidence for a smoking by treatment interaction for FEV1, but selenium attenuated rate of decline in FEF25–75 in current smokers (P = 0.0219). For current smokers randomized to selenium, annual rate of decline in FEF25–75 was similar to the annual decline experienced by never smokers randomized to placebo, with consistent effects for selenium alone and combined with vitamin E.
Conclusions
Among all men, there was no effect of selenium and/or vitamin E supplementation on rate of lung function decline. However, current smokers randomized to selenium had an attenuated rate of decline in FEF25–75, a marker of airflow.
Trial registration
Clinicaltrials.gov identifier: NCT00241865.
Electronic supplementary material
The online version of this article (doi:10.1186/s12931-015-0195-5) contains supplementary material, which is available to authorized users.
15.
Edward Daniel Goodluck U. Onwukwe Rik K. Wierenga Susan E. Quaggin Seppo J. Vainio Mirja Krause 《BMC bioinformatics》2015,16(1)
Background
Codon usage plays a crucial role when recombinant proteins are expressed in different organisms. This is especially the case if the codon usage frequency of the organism of origin and the target host organism differ significantly, for example when a human gene is expressed in E. coli. Therefore, to enable or enhance efficient gene expression it is of great importance to identify rare codons in any given DNA sequence and subsequently mutate these to codons which are more frequently used in the expression host.
Results
We describe an open-source web-based application, ATGme, which can in a first step identify rare and highly rare codons from most organisms, and secondly gives the user the possibility to optimize the sequence.
Conclusions
This application provides a simple user-friendly interface utilizing three optimization strategies: 1. one-click optimization, 2. bulk optimization (by codon-type), 3. individualized custom (codon-by-codon) optimization. ATGme is an open-source application which is freely available at: http://atgme.org
16.
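The first step described in the ATGme abstract (item 15 above), flagging rare codons in a coding sequence, can be illustrated with a short sketch. This is not ATGme's implementation. The function names, the usage table `TOY_USAGE` and its numbers, and the cut-off of 10 occurrences per thousand codons are all hypothetical placeholders; a real analysis would use a measured codon usage table for the expression host.

```python
def split_codons(dna):
    """Split an in-frame DNA sequence into upper-case codons."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def find_rare_codons(dna, usage_per_thousand, threshold=10.0):
    """Return (codon_index, codon, frequency) for codons used below threshold.

    usage_per_thousand maps codon -> frequency per 1000 codons in the host.
    Codons missing from the table are skipped rather than guessed.
    """
    hits = []
    for index, codon in enumerate(split_codons(dna)):
        freq = usage_per_thousand.get(codon)
        if freq is not None and freq < threshold:
            hits.append((index, codon, freq))
    return hits


# Made-up frequencies for illustration only; not real host data.
TOY_USAGE = {
    "ATG": 25.0,
    "AGG": 2.0,
    "CGT": 20.0,
    "AGA": 3.0,
    "GCG": 30.0,
}

rare = find_rare_codons("ATGAGGCGTAGAGCG", TOY_USAGE)
for index, codon, freq in rare:
    print(f"codon {index}: {codon} ({freq} per thousand)")
```

Once the rare positions are known, an optimization step would swap each flagged codon for a synonymous codon that the host uses more often. That step needs a genetic code table and is left out of this sketch.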
17.
Christopher K Hobbs Michelle Leung Herbert H Tsang H Alexander Ebhardt 《BMC bioinformatics》2014,15(1)
Background
A typical affinity purification coupled to mass spectrometry (AP-MS) experiment includes the purification of a target protein (bait) using an antibody and subsequent mass spectrometry analysis of all proteins co-purifying with the bait (aka prey proteins). Like any other systems biology approach, AP-MS experiments generate a lot of data and visualization has been challenging, especially when integrating AP-MS experiments with orthogonal datasets.
Results
We present Circular Interaction Graph for Proteomics (CIG-P), which generates circular diagrams for visually appealing final representation of AP-MS data. Through a Java based GUI, the user inputs experimental and reference data as a file in CSV format. The resulting circular representation can be manipulated live within the GUI before exporting the diagram as a vector graphic in PDF format. The strength of CIG-P is the ability to integrate orthogonal datasets with each other, e.g. affinity purification data of kinase PRPF4B in relation to the functional components of the spliceosome. Further, various AP-MS experiments can be compared to each other.
Conclusions
CIG-P helps present AP-MS data to a wider audience and we envision that the tool will find other applications too, e.g. kinase – substrate relationships as a function of perturbation. CIG-P is available under: http://sourceforge.net/projects/cig-p/
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2105-15-344) contains supplementary material, which is available to authorized users.
18.
19.
Background
With the advent of low cost, fast sequencing technologies, metagenomic analyses are made possible. The large data volumes gathered by these techniques and the unpredictable diversity captured in them are still, however, a challenge for computational biology.
Results
In this paper we address the problem of rapid taxonomic assignment with small and adaptive data models (< 5 MB) and present the accelerated k-mer explorer (AKE). Acceleration in AKE’s taxonomic assignments is achieved by a special machine learning architecture, which is well suited to model data collections that are intrinsically hierarchical. We report reasonably good classification accuracy for ranks down to order, observed in a study on real-world data (Acid Mine Drainage, Cow Rumen).
Conclusion
We show that the execution time of this approach is orders of magnitude shorter than competitive approaches and that accuracy is comparable. The tool is presented to the public as a web application (url: https://ani.cebitec.uni-bielefeld.de/ake/, username: bmc, password: bmcbioinfo).
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-014-0384-0) contains supplementary material, which is available to authorized users.
20.
Mutharasu Gnanavel Prachi Mehrotra Ramaswamy Rakshambikai Juliette Martin Narayanaswamy Srinivasan Ramachandra M Bhaskara 《BMC bioinformatics》2014,15(1)