Similar literature
1.
Transporters are proteins that are involved in the movement of ions or molecules across biological membranes. Transporters are generally classified into channels/pores, electrochemical transporters, and active transporters. Discriminating the specific classes of transporters and their subfamilies is an essential task in computational biology for the advancement of structural and functional genomics. We have systematically analyzed the amino acid composition, residue pair preference and amino acid properties in six different families of transporters. Utilizing this information, we have developed a radial basis function (RBF) network method based on profiles obtained with position specific scoring matrices for discriminating transporters belonging to three different classes and six families. Our method showed a fivefold cross validation accuracy of 76%, 73%, and 69% for discriminating transporters and nontransporters, three different classes and six different families of transporters, respectively. Further, the method was tested with independent datasets, which showed a similar level of accuracy. A web server has been developed for discriminating transporters based on three classes and six families, and it is available at http://rbf.bioinfo.tw/~sachen/tcrbf.html . We suggest that our method could be effectively used to identify transporters and discriminate them into different classes and families. Proteins 2010. © 2010 Wiley‐Liss, Inc.
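The fivefold cross validation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the contiguous (unshuffled) folds, and the `train_fn`/`predict_fn` callables are all assumptions made for illustration, and a real evaluation would normally shuffle or stratify the data first.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists for k contiguous folds.

    Hypothetical helper: folds are contiguous and unshuffled.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_validated_accuracy(samples, labels, train_fn, predict_fn, k=5):
    """Fraction of samples predicted correctly when each fold is held out once."""
    correct = 0
    for train, test in k_fold_indices(len(samples), k):
        model = train_fn([samples[i] for i in train], [labels[i] for i in train])
        correct += sum(predict_fn(model, samples[i]) == labels[i] for i in test)
    return correct / len(samples)
```

Any classifier can be plugged in through `train_fn` and `predict_fn`; the reported accuracy is the pooled fraction of held-out samples that were classified correctly.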

2.
Negative examples – genes that are known not to carry out a given protein function – are rarely recorded in genome and proteome annotation databases, such as the Gene Ontology database. Negative examples are required, however, for several of the most powerful machine learning methods for integrative protein function prediction. Most protein function prediction efforts have relied on a variety of heuristics for the choice of negative examples. Determining the accuracy of methods for negative example prediction is itself a non-trivial task, given that the Open World Assumption as applied to gene annotations rules out many traditional validation metrics. We present a rigorous comparison of these heuristics, utilizing a temporal holdout, and a novel evaluation strategy for negative examples. We add to this comparison several algorithms adapted from Positive-Unlabeled learning scenarios in text classification, which are the current state-of-the-art methods for generating negative examples in low-density annotation contexts. Lastly, we present two novel algorithms of our own construction, one based on empirical conditional probability, and the other using topic modeling applied to genes and annotations. We demonstrate that our algorithms achieve significantly fewer incorrect negative example predictions than the current state of the art, using multiple benchmarks covering multiple organisms. Our methods may be applied to generate negative examples for any type of method that deals with protein function, and to this end we provide a database of negative examples in several well-studied organisms, for general use (The NoGO database, available at: bonneaulab.bio.nyu.edu/nogo.html).

3.
MOTIVATION: Subcellular localization is a key functional characteristic of proteins. A fully automatic and reliable prediction system for protein subcellular localization is needed, especially for the analysis of large-scale genome sequences. RESULTS: In this paper, Support Vector Machine has been introduced to predict the subcellular localization of proteins from their amino acid compositions. The total prediction accuracies reach 91.4% for three subcellular locations in prokaryotic organisms and 79.4% for four locations in eukaryotic organisms. Predictions by our approach are robust to errors in the protein N-terminal sequences. This new approach provides superior prediction performance compared with existing algorithms based on amino acid composition and can be a complementary method to other existing methods based on sorting signals. AVAILABILITY: A web server implementing the prediction method is available at http://www.bioinfo.tsinghua.edu.cn/SubLoc/. SUPPLEMENTARY INFORMATION: Supplementary material is available at http://www.bioinfo.tsinghua.edu.cn/SubLoc/.
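The amino acid composition used as classifier input here is, in its usual form, a 20-dimensional vector of residue frequencies. The sketch below shows that feature extraction only; the function name, the alphabetical residue ordering, and the choice to skip non-standard residue codes are assumptions, and the SVM itself is not reproduced.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue in a protein sequence.

    Hypothetical helper: characters outside the 20 standard codes
    (for example X or *) are ignored rather than counted.
    """
    counts = {aa: 0 for aa in AMINO_ACIDS}
    total = 0
    for residue in sequence.upper():
        if residue in counts:
            counts[residue] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]
```

Because the vector depends only on residue counts and not on their order, a few erroneous residues at the N-terminus change it very little, which is consistent with the robustness the abstract reports.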

4.
TSdb (http://tsdb.cbi.pku.edu.cn) is the first manually curated central repository that stores formatted information on the substrates of transporters. In total, 37608 transporters with 15075 substrates from 884 organisms were curated from UniProt functional annotation. A unique feature of TSdb is that all the substrates are mapped to identifiers from the KEGG Ligand compound database. Thus, TSdb links current metabolic pathway schema with compound transporter systems via the shared compounds in the pathways. Furthermore, all the transporter substrates in TSdb are classified according to their biochemical properties, biological roles and subcellular localizations. In addition to the functional annotation of transporters, extensive compound annotation that includes inhibitor information from the KEGG Ligand and BRENDA databases has been integrated, making TSdb a useful source for the discovery of potential inhibitory mechanisms linking transporter substrates and metabolic enzymes. User-friendly web interfaces are designed for easy access, query and download of the data. Text and BLAST searches against all transporters in the database are provided. We will regularly update the substrate data with evidence from new publications.

5.
Recent advances in genome sequencing make possible the comparative analysis of essential cellular processes such as transport in organisms across the three domains of life. Membrane transporters play crucial roles in fundamental cellular processes and functions in prokaryotic systems. Between 3 and 16% of open reading frames in prokaryotic genomes were predicted to encode membrane transport proteins, emphasizing the importance of transporters in their lifestyles. Hierarchical clustering of phylogenetic profiles of transporter families, which are derived from the presence or absence of a certain transporter family, showed distinct clustering patterns for obligate intracellular organisms, plant/soil-associated microbes and autotrophs. Obligate intracellular organisms possess the fewest types and number of transporters, presumably due to their relatively stable living environment, while plant/soil-associated organisms generally encode the largest variety and number of transporters. A group of autotrophs are clustered together largely due to their absence of transporters for carbohydrate and organic nutrients and the presence of transporters for inorganic nutrients. Within each group, organisms are further clustered by their phylogenetic properties. These findings strongly suggest that transporter profiles correlate with both the evolutionary history and the overall physiology and lifestyles of the organisms.
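A presence/absence phylogenetic profile can be compared between organisms with a simple set-based distance, which is the quantity a hierarchical clustering would then merge on. The sketch below is illustrative only: the abstract does not name its distance measure, so the Jaccard distance is an assumption, and the organism labels and function names are hypothetical.

```python
from itertools import combinations


def jaccard_distance(profile_a, profile_b):
    """Distance between two presence/absence profiles (lists of 0/1 flags)."""
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return 0.0 if either == 0 else 1.0 - both / either


def closest_pair(profiles):
    """Return the two organism names whose profiles are most similar.

    This is the first merge an agglomerative clustering would perform.
    """
    return min(
        combinations(sorted(profiles), 2),
        key=lambda pair: jaccard_distance(profiles[pair[0]], profiles[pair[1]]),
    )


# Hypothetical profiles: one flag per transporter family.
example_profiles = {
    "intracellular_A": [1, 0, 0, 0],
    "intracellular_B": [1, 0, 0, 0],
    "soil_C": [1, 1, 1, 1],
}
```

In this toy input the two organisms with few transporter families group together before either joins the organism that encodes all four families.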

6.
Ammonium is an excellent nitrogen source, and ammonium transfer is a fundamental process in most organisms. Membrane transport of ammonium is the key component of nitrogen metabolism mediated by the Ammonium Transporter/Methylamine Permease/Rhesus (AMT/MEP/Rh) protein family. Ammonium transporters play different physiological roles in various organisms. Here, we looked at the protein characteristics of ammonium transporters in different organisms to create a link between protein characteristics and the organism. In order to increase the accuracy and precision of the employed models, for the first time, an attempt was made to cover all structural aspects of ammonium transporters in animals, bacteria, fungi, plants, and humans by extracting and calculating 874 protein attributes of primary, secondary, and tertiary structures for each ammonium transporter. Then, various weighting and modeling algorithms were applied to determine how structural protein features change between organisms. Considering a large number of protein attributes made it possible to detect key protein characteristics in the structure of ammonium transporters. The results, for the first time, indicated that His-based features, including count/frequency of His and frequency/count of Ile-His, were the most significant features generating different types of ammonium transporters within organisms. Among the tested models, the C5.0 model was the most efficient and precise model for discrimination of organism type, based on ammonium transporter sequence, with a precision of 94.85%. The determination of protein characteristics of ammonium transporters in different organisms provides a new vista for understanding the evolution of transporters based on the modulation of protein characteristics and facilitates engineering of new transporters.
In our view, dissecting a large number of structural protein characteristics through data mining algorithms provides a novel functional strategy for studying evolution and phylogeny. This research will serve as a basis for future studies on engineering novel ammonium transporters.

7.
8.
As proteomic data sets increase in size and complexity, the necessity for database‐centric software systems able to organize, compare, and visualize all the proteomic experiments in a lab grows. We recently developed an integrated platform called high‐throughput autonomous proteomic pipeline (HTAPP) for the automated acquisition and processing of quantitative proteomic data, and integration of proteomic results with existing external protein information resources within a lab‐based relational database called PeptideDepot. Here, we introduce the peptide validation software component of this system, which combines relational database‐integrated electronic manual spectral annotation in Java with a new software tool in the R programming language for the generation of logistic regression spectral models from user‐supplied validated data sets and flexible application of these user‐generated models in automated proteomic workflows. This logistic regression spectral model uses both variables computed directly from SEQUEST output and deterministic variables based on expert manual validation criteria of spectral quality. In the case of linear quadrupole ion trap (LTQ) or LTQ‐FTICR LC/MS data, our logistic spectral model outperformed both XCorr (242% more peptides identified on average) and the X!Tandem E‐value (87% more peptides identified on average) at a 1% false discovery rate estimated by a decoy database approach.
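The 1% false discovery rate estimated by a decoy database approach can be illustrated with a short sketch. This is not the PeptideDepot implementation: the function name, the `(score, is_decoy)` input format, and the estimator FDR = decoys / targets above the cutoff are assumptions (other conventions exist), and tied scores are not treated specially.

```python
def score_threshold_at_fdr(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    psms: iterable of (score, is_decoy) peptide-spectrum matches, where a
    higher score means a better match. Returns None if no cutoff qualifies.
    Assumed estimator: FDR = (#decoy hits) / (#target hits) at or above the cutoff.
    """
    targets = 0
    decoys = 0
    best_cutoff = None
    for score, is_decoy in sorted(psms, key=lambda p: p[0], reverse=True):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            best_cutoff = score
    return best_cutoff
```

Two scoring functions can then be compared as the abstract does, by counting how many target peptides each retains at its own 1% cutoff.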

9.
MOTIVATION: Biological pathways provide significant insights on the interaction mechanisms of molecules. Presently, many essential pathways still remain unknown or incomplete for newly sequenced organisms. Moreover, experimental validation of enormous numbers of possible pathway candidates in a wet-lab environment is time- and effort-intensive. Thus, there is a need for comparative genomics tools that help scientists predict pathways in an organism's biological network. RESULTS: In this article, we propose a technique to discover unknown pathways in organisms. Our approach makes in-depth use of Gene Ontology (GO)-based functionalities of enzymes involved in metabolic pathways as follows: i. Model each pathway as a biological functionality graph of enzyme GO functions, which we call a pathway functionality template. ii. Locate frequent pathway functionality patterns so as to infer previously unknown pathways through pattern matching in metabolic networks of organisms. We have experimentally evaluated the accuracy of the presented technique for 30 bacterial organisms to predict around 1500 organism-specific versions of 50 reference pathways. Using a cross-validation strategy on known pathways, we have been able to infer pathways with 86% precision and 72% recall for enzymes (i.e. nodes). The accuracy of the predicted enzyme relationships has been measured at 85% precision with 64% recall. AVAILABILITY: Code upon request. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
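Precision and recall over pathway nodes can be computed by comparing the predicted enzyme set with the known one. The sketch below assumes enzymes are represented by arbitrary hashable identifiers; the function name and the convention of returning 0.0 for an empty set are illustrative choices, not taken from the paper.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) of a predicted node set against the known set."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall
```

The same function applies to enzyme relationships if each edge is represented as a pair of enzyme identifiers.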

10.
Improved method for predicting beta-turn using support vector machine
MOTIVATION: Numerous methods for predicting beta-turns in proteins have been developed based on various computational schemes. Here, we introduce a new method of beta-turn prediction that uses the support vector machine (SVM) algorithm together with predicted secondary structure information. Various parameters from the SVM have been adjusted to achieve optimal prediction performance. RESULTS: The SVM method achieved excellent performance as measured by the Matthews correlation coefficient (MCC = 0.45) using a 7-fold cross validation on a database of 426 non-homologous protein chains. To the best of our knowledge, this MCC value is the highest achieved so far for predicting beta-turns. The overall prediction accuracy Qtotal was 77.3%, which is the best among the existing prediction methods. Among its unique attractive features, the present SVM method avoids overtraining, compresses information and provides a predicted reliability index.
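The Matthews correlation coefficient is computed from the four cells of a binary confusion matrix. The sketch below uses the standard formula; the function name and the convention of returning 0.0 when the denominator vanishes are choices made for this example.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (tp*tn - fp*fn) / sqrt((tp+fp)(tp+fn)(tn+fp)(tn+fn))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # convention used here when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator
```

Unlike overall accuracy, the MCC stays near zero for a predictor that labels every residue with the majority class, which is why the two figures are usually reported side by side for unbalanced problems such as beta-turn prediction.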

11.
The notorious difficulty of elucidating structures of membrane transporters by crystallography has long hindered our understanding of active transport mechanisms coupled with ion/proton transport. The determination of the first crystal structure of the drug/H+ antiporter AcrB was a breakthrough for structure-based understanding of drug/H+ antiport. However, although AcrB is a major multidrug exporter in Gram-negative organisms, the majority of bacterial drug exporters are major facilitator superfamily (MFS) drug transporters. As no crystal structures have been solved for MFS transporters, alternative protein-engineering methods are still very useful for estimating structures and functions of drug/H+ antiporters. This review describes this alternative approach for investigating the structure and function of tetracycline/H+ antiporters.

12.
13.

Background

An important task in a metagenomic analysis is the assignment of taxonomic labels to sequences in a sample. Most widely used methods for taxonomy assignment compare a sequence in the sample to a database of known sequences. Many approaches use the best BLAST hit(s) to assign the taxonomic label. However, it is known that the best BLAST hit may not always correspond to the best taxonomic match. An alternative approach involves phylogenetic methods, which take into account alignments and a model of evolution in order to more accurately define the taxonomic origin of sequences. Similarity-search based methods typically run faster than phylogenetic methods and work well when the organisms in the sample are well represented in the database. In contrast, phylogenetic methods have the capability to identify new organisms in a sample but are computationally quite expensive.

Results

We propose a two-step approach for metagenomic taxon identification; i.e., use a rapid method that accurately classifies sequences using a reference database (this is a filtering step) and then use a more complex phylogenetic method for the sequences that were unclassified in the previous step. In this work, we explore whether and when using top BLAST hit(s) yields a correct taxonomic label. We develop a method to detect outliers among BLAST hits in order to separate the phylogenetically most closely related matches from matches to sequences from more distantly related organisms. We used modified BILD (Bayesian Integral Log-Odds) scores, a multiple-alignment scoring function, to define the outliers within a subset of top BLAST hits and assign taxonomic labels. We compared the accuracy of our method to the RDP classifier and show that our method yields fewer misclassifications while properly classifying organisms that are not present in the database. Finally, we evaluated the use of our method as a pre-processing step before more expensive phylogenetic analyses (in our case TIPP) in the context of real 16S rRNA datasets.

Conclusion

Our experiments make a good case for using a two-step approach for accurate taxonomic assignment. We show that our method can be used as a filtering step before using phylogenetic methods and provides a way to interpret BLAST results using more information than provided by E-values and bit-scores alone.

14.
For structure-based drug design, where various ligand structures need to be docked to a target protein structure, a docking method that can handle conformational flexibility of not only the ligand, but also the protein, is indispensable. We have developed a simple and effective approach for dealing with the local induced-fit motion of the target protein, and implemented it in our docking tool, ADAM. Our approach efficiently combines the following two strategies: a vdW-offset grid in which the protein cavity is enlarged uniformly, and structure optimization allowing the motion of ligand and protein atoms. To examine the effectiveness of our approach, we performed docking validation studies, including redocking in 18 test cases and foreign-docking, in which various ligands from foreign crystal structures of complexes are docked into a target protein structure, in 22 cases (on five target proteins). With the original ADAM, the correct docking modes (RMSD < 2.0 Å) were not present among the top 20 models in one case of redocking and four cases of foreign-docking. When the handling of induced-fit motion was implemented, the correct solutions were acquired in all 40 test cases. In foreign-docking on thymidine kinase, the correct docking modes were obtained as the top-ranked solutions for all 10 test ligands by our combinatorial approach, and this appears to be the best result ever reported with any docking tool. The results of docking validation have thus confirmed the effectiveness of our approach, which can provide reliable docking models even in the case of foreign-docking, where conformational change of the target protein cannot be ignored. We expect that this approach will contribute substantially to actual drug design, including virtual screening.
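The criterion for a correct docking mode, an RMSD below 2.0 Å from the crystal pose, can be sketched as follows. The function names are hypothetical, and the sketch assumes both poses list the same atoms in the same order and are already in the same coordinate frame, so no superposition is performed.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def is_correct_docking_mode(docked, crystal, cutoff=2.0):
    """True when the docked ligand pose lies within `cutoff` angstroms RMSD."""
    return rmsd(docked, crystal) < cutoff
```

A docking run would count as successful under this criterion if any of its top-ranked models passes the check.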

15.
Aquatic organisms and, in particular, filter feeders, such as mussels, are continuously exposed to toxicants dissolved in the water and, presumably, require adaptations to avoid the detrimental effects from such chemicals. Previous work indicates that activity of ATP-binding cassette (ABC) transporters protects mussels against toxicants, but the nature of these transporters and the structural basis of protection are not known. Here we meld studies on transporter function, gene expression, and localization of transporter protein in mussel gill tissue and show activity and expression of two xenobiotic transporter types in the gills, where they provide an effective structural barrier against chemicals. Activity of ABCB/MDR/P-glycoprotein and ABCC/MRP-type transporters was indicated by sensitivity of efflux of the test substrate calcein-AM to the ABCB inhibitor PSC-833 and the ABCC inhibitor MK-571. This activity profile is supported by our cloning of the complete sequence of two ABC transporter types from RNA in mussel tissue with a high degree of identity to transporters from the ABCB and ABCC subfamilies. Overall identity of the amino acid sequences with corresponding homologs from other organisms was 38-50% (ABCB) and 27-44% (ABCC). C219 antibody staining specific for ABCB revealed that this transporter was restricted to cells in the gill filaments with direct exposure to water flow. Taken together, our data demonstrate that ABC transporters form an active, physiological barrier at the tissue-environment interface in mussel gills, providing protection against environmental xenotoxicants.

16.
Neurospora crassa has been the model filamentous fungus for the study of many fundamental cellular mechanisms of transport and metabolism. The recently completed genome sequence of N. crassa contains over 10,000 genes, a large number of which (41%) have no significant matches in the sequence databases, and thus presents many challenges for new discoveries. Using transporter database and BLAST searches, a total of 65 open reading frames for putative cation transporter genes have been identified in N. crassa. These were further confirmed by characteristic features of the family such as transmembrane domains (TOPPRED 2), conserved motifs (Clustal W) and phylogenetic analysis (TREETOP). In Neurospora, cation transporter genes constitute nearly 18.3% of the total membrane transport systems, which is higher than in E. coli (8.8%), S. cerevisiae (13.7%), S. pombe (17.2%), A. fumigatus (10.1%), A. thaliana (16.8%) and H. sapiens (15.6%). We refer to the complete complement of metal ion transporter genes as the "Metal Transportome". There are a total of 33 putative transporters for alkali and alkaline earth metals, comprising 18 for calcium (P-ATPase, VIC, CaCA, Mid1), 7 for sodium (P-ATPase, CPA1, CPA2), 4 for potassium (Trk, VIC, KUP), and 4 for magnesium (MIT). Transition metal ion transporters account for 32 transporters including 7 for zinc (ZIP), 6 for copper (Ctr2, Ctr1), 2 each for manganese (Nramp), iron (OFeT), arsenite (ArsAB, ACR3) and other metal ions (ABC and P-ATPase) and 1 each for nickel (NiCoT) and chromate (CHR). N. crassa has 7 linkage groups, of which LGI harbors 21 metal ion transporters whereas LGVII has only 2. Studies on metal transportomes of different organisms will help to unravel the role of metal ion transporters in homeostasis.

17.
SUMMARY: Transporters are proteins that are involved in the movement of ions or molecules across biological membranes. Currently, our knowledge about the functions of transporters is limited due to the paucity of their 3D structures. Hence, computational techniques are necessary to annotate the functions of transporters. In this work, we focused on an important functional aspect of transporters, namely annotation of targets for transport proteins. We have systematically analyzed four major classes of transporters with different transporter targets: (i) electron, (ii) protein/mRNA, (iii) ion and (iv) others, using amino acid properties. We have developed a radial basis function network-based method for predicting transport targets with amino acid properties and position specific scoring matrix profiles. Our method showed a 10-fold cross-validation accuracy of 90.1, 80.1, 70.3 and 82.3% for electron transporters, protein/mRNA transporters, ion transporters and others, respectively, in a dataset of 543 transporters. We have also evaluated the performance of the method with an independent dataset of 108 proteins and we obtained similar accuracy. We suggest that our method could be an effective tool for functional annotation of transport proteins. AVAILABILITY: http://rbf.bioinfo.tw/~sachen/ttrbf.html

18.
The method for virus titer determination of avian infectious bursal disease (IBD) live vaccine, developed long before regulatory validation guidelines, is a cell culture based biological assay intended for use in vaccine release testing. The aim of our study was to perform a validation, based on the fit-for-purpose principle, of an old 50% tissue culture infectious dose (TCID50) method according to Guidelines of the International Cooperation on Harmonization of Technical Requirements for Registration of Veterinary Medicinal Products (VICH). This paper addresses challenges and discusses some key aspects that should be considered when validating biological methods. A different statistical approach and non-parametric statistics were introduced into the validation protocol in order to derive useful information from experimental data. This approach is applicable to a wide range of methods. In conclusion, the existing virus titration method was shown to be precise, accurate, linear, robust and in accordance with current regulatory standards, which indicates that no additional re-development or upgrade of the method is needed for it to be suitable for its intended use.

19.
20.
We consider the problem of similarity queries in biological network databases. Given a database of networks, a similarity query returns all the database networks whose similarity (i.e. alignment score) to a given query network is at least a specified similarity cutoff value. Alignment of two networks is a very costly operation, which makes exhaustive comparison of all the database networks with a query impractical. To tackle this problem, we develop a novel indexing method, named RINQ (Reference-based Indexing for Biological Network Queries). Our method uses a set of reference networks to eliminate a large portion of the database quickly for each query. A reference network is a small biological network. We precompute and store the alignments of all the references with all the database networks. When our database is queried, we align the query network with all the reference networks. Using these alignments, we calculate a lower bound and an approximate upper bound to the alignment score of each database network with the query network. With the help of upper and lower bounds, we eliminate the majority of the database networks without aligning them to the query network. We also quickly identify a small portion of these as guaranteed to be similar to the query. We perform pairwise alignment only for the remaining networks. We also propose a supervised method to pick references that have a large chance of filtering the unpromising database networks. Extensive experimental evaluation suggests that (i) our method reduced the running time of a single query on a database of around 300 networks from over 2 days to only 8 h; (ii) our method outperformed the state-of-the-art methods Closure Tree and SAGA by a factor of three or more; and (iii) our method successfully identified statistically and biologically significant relationships across networks and organisms.
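The filtering step described in this abstract, where score bounds decide which database networks still need a full alignment, can be sketched as below. This is not the RINQ implementation: how the bounds are derived from the reference alignments is not given in the abstract, so the sketch takes precomputed `(lower, upper)` pairs as input, and the function and network names are hypothetical.

```python
def filter_by_bounds(bounds, cutoff):
    """Partition database networks using bounds on their alignment score.

    bounds: dict mapping network name -> (lower_bound, upper_bound) on the
            alignment score with the query (assumed to be precomputed).
    Returns (accepted, rejected, to_align):
      accepted  - lower bound already meets the cutoff, no alignment needed
      rejected  - upper bound is below the cutoff, network is eliminated
      to_align  - bounds are inconclusive, full pairwise alignment required
    """
    accepted, rejected, to_align = [], [], []
    for name, (lower, upper) in bounds.items():
        if upper < cutoff:
            rejected.append(name)
        elif lower >= cutoff:
            accepted.append(name)
        else:
            to_align.append(name)
    return accepted, rejected, to_align
```

Only the networks in `to_align` incur the expensive alignment. Because the abstract describes the upper bound as approximate, the rejection step in such a scheme may occasionally discard a network that would in fact have met the cutoff.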
