Found 20 similar articles (search time: 31 ms)
MOTIVATION: Discovery of binding sites is important in the study of protein-protein interactions. In this paper, we introduce stable and significant motif pairs to model protein-binding sites. Stability is the pattern's resistance to some transformation. Significance is the unexpected frequency of occurrence of the pattern in a sequence dataset comprising known interacting protein pairs. Discovery of stable motif pairs is an iterative process that passes through a chain of changing but converging patterns. Determining the starting point for such a chain is an interesting problem. We use a protein complex dataset extracted from the Protein Data Bank to help identify those starting points, so that the computational complexity of the problem is greatly reduced. RESULTS: We found 913 stable motif pairs, of which 765 are significant. We evaluated these motif pairs by comprehensive comparison against random patterns. Experimentally discovered motifs reported in the literature were also used to confirm the effectiveness of our method. SUPPLEMENTARY INFORMATION: http://sdmc.i2r.a-star.edu.sg/BindingMotifPairs.

SUMMARY: Dragon Promoter Mapper (DPM) is a tool that models the promoter structure of co-regulated genes using Bayesian networks. DPM exploits an exhaustive set of motif features (such as the motif, its strand, the order of motif occurrence and the mutual distance between adjacent motifs) and generates models from the target promoter sequences. These models may be used (1) to detect regions in a genomic sequence that are similar to the target promoters or (2) to classify other promoters as similar or not to the target promoter group. DPM can also be used for modelling enhancers and silencers. AVAILABILITY: http://defiant.i2r.a-star.edu.sg/projects/BayesPromoter/ CONTACT: vlad@sanbi.ac.za SUPPLEMENTARY INFORMATION: A manual for using the DPM web server is provided at http://defiant.i2r.a-star.edu.sg/projects/BayesPromoter/html/manual/manual.htm.

SUMMARY: A high-throughput Basic Local Alignment Search Tool (BLAST) system based on Web services has been implemented. It provides an alternative BLAST service and allows users to perform multiple BLAST queries in one run in a distributed, parallel environment over the Internet. AVAILABILITY: It is available at http://mammoth.bii.a-star.edu.sg/webservices/htblast/index.html and at http://www.bii.a-star.edu.sg/jiren/download.html

Chromatin interaction analysis with paired-end tag sequencing (ChIA-PET) is a new technology to study genome-wide long-range chromatin interactions bound by protein factors. Here we present ChIA-PET Tool, a software package for automatic processing of ChIA-PET sequence data, including linker filtering, mapping tags to reference genomes, identifying protein binding sites and chromatin interactions, and displaying the results on a graphical genome browser. ChIA-PET Tool is fast, accurate, comprehensive, user-friendly, and open source (available at http://chiapet.gis.a-star.edu.sg).

The B-cell Epitope Interaction Database (BEID; http://datam.i2r.a-star.edu.sg/BEID) is an open-access database describing sequence-structure-function information on immunoglobulin (Ig)-antigen interactions. The current version of the database contains 164 antigens, 126 Ig and 189 Ig-antigen complexes extracted from the Protein Data Bank (PDB). Each entry is manually verified, classified, and analyzed for intermolecular interactions between antigens and the corresponding bound Ig molecules. The Ig-antigen interaction information stored in BEID includes solvent accessibility, hydrogen bonds, non-hydrogen bonds, gap volume, gap index, interface area and contact residues. The database can be searched with a user-friendly search tool, and schematic diagrams of Ig-antigen interactions are available for download in PDF format. The ultimate purpose of BEID is to enhance the understanding of the rules of engagement between antigens and the corresponding bound Ig molecules. It is also a valuable data source for developing computational predictors of B-cell epitopes.

Databases and computational tools are increasingly important in the study of allergies, particularly in the assessment of allergenicity and allergic cross-reactivity. The ALLERDB database contains sequences of allergens and information on reported cross-reactivity between allergens. It focuses on analysis of the allergenicity and allergic cross-reactivity of clinically relevant protein allergens. The official IUIS allergen data were extracted from the IUIS Allergen Nomenclature Sub-Committee website, and their sequence information from public databases and reference publications. The analysis tools assist allergen data analysis and retrieval, and include keyword searching, BLAST, prediction of allergenicity, a modification of BLAST that displays cross-reactive allergens, and graphical representation of cross-reactivity data. ALLERDB is a new kind of allergen database with a rich set of tools for sequence comparison, pattern identification, and visualization of results. It is accessible at http://research.i2r.a-star.edu.sg/Templar/DB/Allergen.

SUMMARY: DNAFSMiner (DNA Functional Sites Miner) is a web-based software toolbox for recognizing functional sites in nucleic acid sequences. The toolbox currently provides two programs: TIS Miner and Poly(A) Signal Miner. TIS Miner can be used to predict translation initiation sites in vertebrate DNA/mRNA/cDNA sequences, and Poly(A) Signal Miner can be used to predict polyadenylation [poly(A)] signals in human DNA sequences. The prediction results are better than those of methods from the literature on two benchmark applications. This good performance is mainly attributable to our unique learning method. DNAFSMiner is available free of charge for academic and non-profit organizations. AVAILABILITY: http://research.i2r.a-star.edu.sg/DNAFSMiner/ CONTACT: huiqing@i2r.a-star.edu.sg.

Data on the major histocompatibility complex, T-cell epitopes, B-cell epitopes, antigens and diseases are heterogeneous and scattered among different databases and the literature. Since it has become increasingly difficult to obtain an integrated view of functional immune response components, we have developed and updated over several years the Functional molecular IMMunology (FIMM) database (http://research.i2r.a-star.edu.sg/fimm/). FIMM contains integrated expert-curated data on protein antigens, and on human immunological receptors that recognise and bind them in healthy or disease states. Interfaces with multiple, intuitive query options and query reports provide immunologists with prioritised information that aids data interpretation, vaccine target discovery and immune disease research.

MOTIVATION: Sequence annotations, functional and structural data on snake venom neurotoxins (svNTXs) are scattered across multiple databases and literature sources. Sequence annotations and structural data are available in the public molecular databases, while functional data are almost exclusively available in published articles. There is a need for a specialized svNTX database whose entries are organized, well annotated and systematically classified. RESULTS: We have systematically analyzed svNTXs and classified them into structure-function groups based on their structural, functional and phylogenetic properties. Using conserved motifs in each phylogenetic group, we built an intelligent module for predicting the structural and functional properties of unknown NTXs. We also developed an annotation tool to aid the functional prediction of newly identified NTXs as an additional resource for the venom research community. AVAILABILITY: We created a searchable online database of NTX protein sequences (http://research.i2r.a-star.edu.sg/Templar/DB/snake_neurotoxin). This database can also be found under the Swiss-Prot Toxin Annotation Project website (http://www.expasy.org/sprot/).

WebAllergen is a web server that predicts the potential allergenicity of proteins. The query protein is compared against a set of prebuilt allergenic motifs obtained from 664 known allergen proteins. The query is also compared with known allergens that do not have detectable allergenic motifs. Moreover, users may upload their own allergens as alternative training sequences, from which a new set of allergenic motifs is built. The query sequences can also be compared with these motifs. AVAILABILITY: http://weballergen.bii.a-star.edu.sg/


A modification to Phred and a program to detect heterogeneous positions, particularly useful for identifying mutations and other abnormalities in Phred/Phrap genome assemblies. AVAILABILITY: The package is available at http://glscompute.gis.a-star.edu.sg/~charlie/DHetero.html

Protein complexes are key entities that perform cellular functions. Human diseases have also been shown to be associated with specific human protein complexes. In fact, human protein complexes are widely used for protein function annotation, inference of the human protein interactome, disease gene prediction, and so on. Therefore, an up-to-date catalogue of human complexes is highly desirable to support research in these applications. Protein complexes from different databases are, as expected, highly redundant. In this paper, we designed a set of concise operations to compile these redundant human complexes and built a comprehensive catalogue called CHPC2012 (Catalogue of Human Protein Complexes). CHPC2012 achieves higher coverage of proteins and protein complexes than the individual databases. It has also been verified to be a high-quality set of complexes, as its co-complex protein associations have a high overlap with protein-protein interactions (PPIs) in various existing PPI databases. We demonstrated two distinct applications of CHPC2012: investigating the relationship between protein complexes and drug-related systems, and evaluating the quality of predicted protein complexes. In particular, CHPC2012 provides more insights into drug development. For instance, proteins involved in multiple complexes (the overlapping proteins) are potential drug targets; the drug-complex network is used to investigate multi-target drugs and drug-drug interactions; and the disease-specific complex-drug networks will provide new clues for drug repositioning. With this up-to-date reference set of human protein complexes, we believe that the CHPC2012 catalogue can enhance studies of protein interactions, protein functions, human diseases, drugs, and related fields of research. CHPC2012 complexes can be downloaded from http://www1.i2r.a-star.edu.sg/xlli/CHPC2012/CHPC2012.htm.
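
The quality check mentioned in this abstract (overlap between co-complex protein associations and known PPIs) can be illustrated with a minimal sketch. The abstract does not give the authors' actual procedure; the function names, the toy complexes and the toy PPI list below are all hypothetical, and the overlap is simply the fraction of co-complex pairs that also appear in the PPI set.

```python
from itertools import combinations

def co_complex_pairs(complexes):
    """Return every unordered protein pair that shares at least one complex."""
    pairs = set()
    for members in complexes:
        # sorted(set(...)) drops duplicate members before pairing them up
        pairs.update(frozenset(p) for p in combinations(sorted(set(members)), 2))
    return pairs

def ppi_overlap(complexes, ppis):
    """Fraction of co-complex pairs that are also listed as known PPIs."""
    pairs = co_complex_pairs(complexes)
    known = {frozenset(p) for p in ppis}
    return len(pairs & known) / len(pairs) if pairs else 0.0

# Hypothetical toy data, not taken from CHPC2012.
complexes = [["A", "B", "C"], ["C", "D"]]
ppis = [("A", "B"), ("D", "C"), ("A", "E")]
overlap = ppi_overlap(complexes, ppis)
```

Using `frozenset` for each pair makes the comparison independent of the order in which the two proteins are listed.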

G-PRIMER, a web-based primer design program, has been developed to compute a minimal primer set specifically annealed to all the open reading frames in a given microbial genome. This program has been successfully used in the microarray experiment for analyzing the expression of genes in the Xanthomonas campestris genome. AVAILABILITY: It is available at http://mammoth.bii.a-star.edu.sg/gprimer/. Its source code is available upon request.

Han LY, Cai CZ, Ji ZL, Cao ZW, Cui J, Chen YZ. Nucleic Acids Research 2004, 32(21): 6437-6444
The function of a protein that has no sequence homolog of known function is difficult to assign on the basis of sequence similarity. The same problem may arise for homologous proteins of different functions if one is newly discovered and the other is the only known protein of similar sequence. It is desirable to explore methods that are not based on sequence similarity. One approach is to assign a protein to a functional family, which provides a useful hint about its function. Several groups have employed a statistical learning method, support vector machines (SVMs), for predicting protein functional family directly from sequence irrespective of sequence similarity. These studies showed that SVM prediction accuracy is at a level useful for functional family assignment. However, its capability for assigning distantly related proteins and homologous proteins of different functions has not been critically and adequately assessed. Here SVM is tested for functional family assignment of two groups of enzymes. One consists of 50 enzymes that have no homolog of known function in a PSI-BLAST search of protein databases. The other contains eight pairs of homologous enzymes of different families. SVM correctly assigns 72% of the enzymes in the first group and 62% of the enzyme pairs in the second group, suggesting that it is potentially useful for facilitating functional study of novel proteins. A web version of our software, SVMProt, is accessible at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi.

Zhou Y, Zhou YS, He F, Song J, Zhang Z. Molecular BioSystems 2012, 8(5): 1396-1404
Deciphering functional interactions between proteins is one of the great challenges in biology. Sequence-based homology-free encoding schemes have been increasingly applied to develop promising protein-protein interaction (PPI) predictors by means of statistical or machine learning methods. Here we analyze the relationship between codon pair usage and PPIs in yeast. We show that the codon pair usage of interacting protein pairs differs significantly from random expectation. This motivates the development of a novel approach for predicting PPIs, termed CCPPI, with codon pair frequency difference as input to a Support Vector Machine predictor. 10-fold cross-validation tests based on yeast PPI datasets with balanced positive-to-negative ratios indicate that CCPPI performs better than other sequence-based encoding schemes. Moreover, it ranks the best when tested on an unbalanced large-scale dataset. Although CCPPI is subject to high false positive rates like many PPI predictors, statistical analyses of the predicted true positives confirm that the success of CCPPI is partly ascribed to its capability to capture proteomic co-expression and functional similarities between interacting protein pairs. Our findings suggest that codon pairs of interacting protein pairs evolve in a coordinated manner and consequently provide additional information beyond amino acid-based encoding schemes. CCPPI has been made freely available at: http://protein.cau.edu.cn/ccppi.
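
To make the "codon pair frequency difference" feature more concrete, here is a minimal sketch. The abstract does not specify the exact encoding used by CCPPI, so this assumes adjacent in-frame codon pairs, relative frequencies per sequence, and an absolute difference between the two proteins; the function names are hypothetical.

```python
from collections import Counter

def codon_pair_freqs(cds):
    """Relative frequency of each adjacent in-frame codon pair in a coding sequence."""
    # Ignore any trailing bases that do not form a complete codon.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    pairs = Counter(zip(codons, codons[1:]))
    total = sum(pairs.values())
    return {pair: n / total for pair, n in pairs.items()} if total else {}

def freq_difference(cds_a, cds_b):
    """Absolute per-pair frequency difference between two coding sequences (assumed encoding)."""
    fa, fb = codon_pair_freqs(cds_a), codon_pair_freqs(cds_b)
    return {pair: abs(fa.get(pair, 0.0) - fb.get(pair, 0.0)) for pair in set(fa) | set(fb)}

# Hypothetical toy sequence: codons ATG, GCT, GCT, TAA give three codon pairs.
freqs = codon_pair_freqs("ATGGCTGCTTAA")
```

A vector built from such differences, over a fixed ordering of all codon pairs, is the kind of input a Support Vector Machine could be trained on.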

SUMMARY: Eukaryotes have both 'intron containing' and 'intron less' genes. Several databases are available for 'intron containing' genes in eukaryotes. In this note, we describe a database for 'intron less' genes from eukaryotes. Because 'intron less' eukaryotic genes have a prokaryotic architecture, they make gene evolution much simpler to study than 'intron containing' genes do. AVAILABILITY: SEGE is available at http://intron.bic.nus.edu.sg/seg/ CONTACT: mmeena@ntu.edu.sg

MOTIVATION: Contrasts are useful conceptual vehicles for learning processes and exploratory research of the unknown. For example, contrastive information between two proteins can reveal their similarities, divergences and relations, leading to invaluable insights into the proteins. Such contrastive information is reported in the biomedical literature. However, there have been no reported attempts in current biomedical text mining work to systematically extract and present such contrastive information from the literature for exploitation. RESULTS: Our BioContrasts system extracts protein-protein contrastive information from MEDLINE abstracts and presents it to biologists in a web application. Contrastive information is identified in the text abstracts with contrastive negation patterns such as 'A but not B'. A total of 799 169 pairs of contrastive expressions were successfully extracted from 2.5 million MEDLINE abstracts. By grounding contrastive protein names to Swiss-Prot entries, we were able to produce 41 471 contrasts between Swiss-Prot protein entries. These contrastive pieces of information are then presented via a user-friendly interactive web portal that can be exploited for applications such as the refinement of biological pathways. AVAILABILITY: BioContrasts can be accessed at http://biocontrasts.i2r.a-star.edu.sg. It is also mirrored at http://biocontrasts.biopathway.org. SUPPLEMENTARY INFORMATION: Supplementary materials are available at Bioinformatics online.
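
The 'A but not B' negation pattern named in this abstract can be sketched with a regular expression. This is only an illustration, not the BioContrasts implementation: it assumes both names are single tokens made of letters, digits and hyphens, and the example sentence is made up.

```python
import re

# Assumed simplification: each name is one token of letters, digits and hyphens.
CONTRAST = re.compile(r"\b([A-Za-z0-9-]+)\s+but\s+not\s+([A-Za-z0-9-]+)\b")

def extract_contrasts(text):
    """Return an (A, B) tuple for every 'A but not B' match in the text."""
    return [(m.group(1), m.group(2)) for m in CONTRAST.finditer(text)]

# Hypothetical sentence used only to exercise the pattern.
pairs = extract_contrasts("Binding requires PKC-alpha but not PKC-beta in these cells.")
```

A real system would still need to recognise multi-word protein names and ground them to database entries, which the abstract describes as a separate step.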

Cellware--a multi-algorithmic software for computational systems biology   Cited by: 3 (self-citations: 0; citations by others: 3)
The intracellular environment of a cell hosts a wide variety of enzymatic reactions, diffusion events, molecular binding, polymerization and metabolic channeling. To transform these biological events into a computational framework, distinct modeling strategies are required. While currently no tool is capable of capturing all these events, progress is being made to create an integrated environment for the modeling community. To address this niche requirement, Cellware has been developed to offer a multi-algorithmic environment for modeling and simulating both deterministic and stochastic events in the cell. AVAILABILITY: The software is available for free and can be downloaded from http://www.bii.a-star.edu.sg/sbg/cellware

Rich information on point mutation studies is scattered across heterogeneous data sources. This paper presents an automated workflow for mining mutation annotations from full-text biomedical literature using natural language processing (NLP) techniques, as well as for their subsequent reuse in protein structure annotation and visualization. This system, called mSTRAP (Mutation extraction and STRucture Annotation Pipeline), is designed for both information aggregation and subsequent brokerage of the mutation annotations. It facilitates the coordination of semantically related information from a series of text mining and sequence analysis steps into a formal OWL-DL ontology. The ontology is designed to support application-specific data management of sequence, structure, and literature annotations that are populated as instances of object and data type properties. mSTRAPviz is a subsystem that facilitates the brokerage of structure information and the associated mutations for visualization. For mutated sequences without any corresponding structure available in the Protein Data Bank (PDB), an automated pipeline for homology modeling was developed to generate a theoretical model. With mSTRAP, we demonstrate a workable system that can facilitate automation of the workflow for the retrieval, extraction, processing, and visualization of mutation annotations, tasks that are well known to be tedious, time-consuming, complex, and error-prone. The ontology and visualization tool are available at http://datam.i2r.a-star.edu.sg/mstrap.
