1.

3MOTIF: visualizing conserved protein sequence motifs in the protein structure database

Bennett SP Nevill-Manning CG Brutlag DL 《Bioinformatics (Oxford, England)》2003,19(4):541-542

SUMMARY: 3MOTIF is a web application that visually maps conserved sequence motifs onto three-dimensional protein structures in the Protein Data Bank (PDB; Berman et al., Nucleic Acids Res., 28, 235-242, 2000). Important properties of motifs such as conservation strength and solvent accessible surface area at each position are visually represented on the structure using a variety of color shading schemes. Users can manipulate the displayed motifs using the freely available Chime plugin. AVAILABILITY: http://motif.stanford.edu/3motif/

2.

The EMOTIF database

Huang JY Brutlag DL 《Nucleic acids research》2001,29(1):202-204

The EMOTIF database is a collection of more than 170 000 highly specific and sensitive protein sequence motifs representing conserved biochemical properties and biological functions. These protein motifs are derived from 7697 sequence alignments in the BLOCKS+ database (released on June 23, 2000) and all 8244 protein sequence alignments in the PRINTS database (version 27.0) using the emotif-maker algorithm developed by Nevill-Manning et al. (Nevill-Manning,C.G., Wu,T.D. and Brutlag,D.L. (1998) Proc. Natl Acad. Sci. USA, 95, 5865-5871; Nevill-Manning,C.G., Sethi,K.S., Wu,T.D. and Brutlag,D.L. (1997) ISMB-97, 5, 202-209). Since the amino acids and the groups of amino acids in these sequence motifs represent critical positions conserved in evolution, search algorithms employing the EMOTIF patterns can identify and classify more widely divergent sequences than methods based on global sequence similarity. The emotif protein pattern database is available at http://motif.stanford.edu/emotif/.
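As a rough illustration of how a motif made of single amino acids and amino-acid groups can be matched against a sequence, the following minimal Python sketch treats the motif as a regular expression. The pattern, the sequence and the function name `find_motif` are hypothetical, and the bracket syntax is an assumption made for illustration; it is not claimed to be the format produced by emotif-maker.

```python
import re

def find_motif(pattern: str, sequence: str) -> list[tuple[int, str]]:
    """Return (start, matched_text) for every occurrence, overlapping ones included."""
    # A zero-width lookahead lets the scan report overlapping occurrences.
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]

# Hypothetical motif: G, any residue, one of [ST], any residue, K, one of [ILV].
pattern = "G.[ST].K[ILV]"
# Hypothetical sequence, invented for this example.
sequence = "MAGASGKVLLGQSAKI"

hits = find_motif(pattern, sequence)
for start, text in hits:
    print(start, text)
```

Because each position constrains only the residues that matter, such a pattern can match sequences that share little overall similarity, which is the property the abstract attributes to EMOTIF patterns.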

3.

RNAMotifScanX: a graph alignment approach for RNA structural motif identification

Cuncong Zhong Shaojie Zhang 《RNA (New York, N.Y.)》2015,21(3):333-346

RNA structural motifs are recurrent three-dimensional (3D) components found in the RNA architecture. These RNA structural motifs play important structural or functional roles and usually exhibit highly conserved 3D geometries and base-interaction patterns. Analysis of the RNA 3D structures and elucidation of their molecular functions heavily rely on efficient and accurate identification of these motifs. However, efficient RNA structural motif search tools are lacking due to the high complexity of these motifs. In this work, we present RNAMotifScanX, a motif search tool based on a base-interaction graph alignment algorithm. This novel algorithm enables automatic identification of both partially and fully matched motif instances. RNAMotifScanX considers noncanonical base-pairing interactions, base-stacking interactions, and sequence conservation of the motifs, which leads to significantly improved sensitivity and specificity as compared with other state-of-the-art search tools. RNAMotifScanX also adopts a carefully designed branch-and-bound technique, which enables ultra-fast search of large kink-turn motifs against a 23S rRNA. The software package RNAMotifScanX is implemented using GNU C++, and is freely available from http://genome.ucf.edu/RNAMotifScanX.

4.

A profile-based deterministic sequential Monte Carlo algorithm for motif discovery

Liang KC Wang X Anastassiou D 《Bioinformatics (Oxford, England)》2008,24(1):46-55

5.

Phylomat: an automated protein motif analysis tool for phylogenomics (total citations: 2; self-citations: 0; citations by others: 2)

Graham WV Tcheng DK Shirk AL Attene-Ramos MS Welge ME Gaskins HR 《Journal of proteome research》2004,3(6):1289-1291

Recent progress in genomics, proteomics, and bioinformatics enables unprecedented opportunities to examine the evolutionary history of molecular, cellular, and developmental pathways through phylogenomics. Accordingly, we have developed a motif analysis tool for phylogenomics (Phylomat, http://alg.ncsa.uiuc.edu/pmat) that scans predicted proteome sets for proteins containing highly conserved amino acid motifs or domains for in silico analysis of the evolutionary history of these motifs/domains. Phylomat enables the user to download results as full protein or extracted motif/domain sequences from each protein. Tables containing the percent distribution of a motif/domain in organisms normalized to proteome size are displayed. Phylomat can also align the set of full protein or extracted motif/domain sequences and predict a neighbor-joining tree from relative sequence similarity. Together, Phylomat serves as a user-friendly data-mining tool for the phylogenomic analysis of conserved sequence motifs/domains in annotated proteomes from the three domains of life.
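One plausible reading of "percent distribution normalized to proteome size" is the percentage of proteins in each proteome that contain the motif. The sketch below computes that quantity; the interpretation, the function name `percent_with_motif`, the organism labels and all counts are assumptions made for illustration and are not taken from Phylomat.

```python
def percent_with_motif(hit_counts: dict[str, int],
                       proteome_sizes: dict[str, int]) -> dict[str, float]:
    """Percentage of proteins per organism that contain the motif."""
    result = {}
    for organism, size in proteome_sizes.items():
        if size <= 0:
            raise ValueError(f"proteome size must be positive: {organism}")
        # Organisms without any hit are reported as 0 percent.
        result[organism] = 100.0 * hit_counts.get(organism, 0) / size
    return result

# Hypothetical counts, invented for this example.
hit_counts = {"organism_a": 20, "organism_b": 5}
proteome_sizes = {"organism_a": 1000, "organism_b": 250, "organism_c": 400}

percentages = percent_with_motif(hit_counts, proteome_sizes)
print(percentages)
```

Dividing by proteome size keeps organisms with large proteomes from dominating the comparison simply because they have more proteins.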

6.

ARCS: an aggregated related column scoring scheme for aligned sequences

Song B Choi JH Chen G Szymanski J Zhang GQ Tung AK Kang J Kim S Yang J 《Bioinformatics (Oxford, England)》2006,22(19):2326-2332

MOTIVATION: Biologists frequently align multiple biological sequences to determine consensus sequences and/or search for predominant residues and conserved regions. In particular, determining conserved regions in an alignment is one of the most important activities. Since protein sequences are often several hundred residues or longer, it is difficult to distinguish biologically important conserved regions (motifs or domains) from others. The widely used tools, Logos, Al2co, Confind, and the entropy-based method, often fail to highlight such regions. Thus a computational tool that can highlight biologically important regions accurately is highly desirable. RESULTS: This paper presents a new scoring scheme, ARCS (Aggregated Related Column Score), for aligned biological sequences. The ARCS method considers not only the traditional character similarity measure but also column correlation. In an extensive experimental evaluation using 533 PROSITE patterns, ARCS is able to highlight the motif regions with up to 77.7% accuracy corresponding to the top three peaks. AVAILABILITY: The source code is available on http://bio.informatics.indiana.edu/projects/arcs and http://goldengate.case.edu/projects/arcs
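For context, the entropy-based baseline mentioned in the abstract can be sketched as a per-column Shannon entropy over an alignment. This is not the ARCS score, which additionally uses column correlation; the function names, the toy alignment and the choice to treat every character (including gaps) as an ordinary symbol are assumptions made for illustration.

```python
import math
from collections import Counter

def column_entropy(column: str) -> float:
    """Shannon entropy (in bits) of the symbols in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def entropy_profile(alignment: list[str]) -> list[float]:
    """Entropy of every column; lower values indicate stronger conservation."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*alignment) transposes rows (sequences) into columns.
    return [column_entropy("".join(col)) for col in zip(*alignment)]

# Toy alignment, invented for this example.
alignment = ["ACDG", "ACEG", "ACFG", "ACDG"]
profile = entropy_profile(alignment)
print(profile)
```

Each column is scored independently here, which is the limitation the abstract points to: a scheme of this kind cannot take relationships between columns into account.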

7.

D-MATRIX: A web tool for constructing weight matrix of conserved DNA motifs

Naresh Sen Manoj Mishra Feroz Khan Abha Meena Ashok Sharma 《Bioinformation》2009,3(10):415-418

8.

DAtA: database of Arabidopsis thaliana annotation (total citations: 1; self-citations: 0; citations by others: 1)

Palm CJ Federspiel NA Davis RW 《Nucleic acids research》2000,28(1):102-103

The Database of Arabidopsis thaliana Annotation (DAtA) was created to enable easy access to and analysis of all the Arabidopsis genome project annotation. The database was constructed using the completed A. thaliana genomic sequence data currently in GenBank. An automated annotation process was used to predict coding sequences for GenBank records that do not include annotation. DAtA also contains protein motifs and protein similarities derived from searches of the proteins in DAtA with motif databases and the non-redundant protein database. The database is routinely updated to include new GenBank submissions for Arabidopsis genomic sequences and new Blast and protein motif search results. A web interface to DAtA allows coding sequences to be searched by name, comment, blast similarity or motif field. In addition, browse options present lists of either all the protein names or identified motifs present in the sequenced A. thaliana genome. The database can be accessed at http://baggage.stanford.edu/group/arabprotein/

9.

A protein short motif search tool using amino acid sequence and their secondary structure assignment

Venkataraman A Chew TH Hussein ZA Shamsir MS 《Bioinformation》2011,7(6):304-306

We present the development of a web server, a protein short motif search tool that allows users to simultaneously search for a protein sequence motif and its secondary structure assignments. The web server is able to run very short motif searches against PDB structural data from the RCSB Protein Data Bank, with the users defining the type of secondary structures of the amino acids in the sequence motif. The output utilises 3D visualisation that highlights the position of the motif in the structure and on the corresponding sequence. Researchers can easily observe the locations and conformation of multiple motifs among the results. Protein short motif search also has an application programming interface (API) for interfacing with other bioinformatics tools. AVAILABILITY: The database is available for free at http://birg3.fbb.utm.my/proteinsms.
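The idea of searching for a sequence motif and its secondary structure at the same time can be sketched as a two-step filter: find the sequence motif, then check the secondary structure string over the same positions. The sketch below is not the web server's API; the function name, the one-letter H/E/C secondary structure codes and the example data are assumptions made for illustration.

```python
import re

def search_motif_with_ss(sequence: str, ss: str,
                         seq_pattern: str, ss_pattern: str) -> list[tuple[int, str, str]]:
    """Return (start, residues, structure) for sequence hits whose structure also matches."""
    if len(sequence) != len(ss):
        raise ValueError("sequence and secondary structure must have equal length")
    hits = []
    # The lookahead allows overlapping sequence hits to be examined.
    for m in re.finditer(f"(?=({seq_pattern}))", sequence):
        start = m.start()
        end = start + len(m.group(1))
        # Keep the hit only if the structure over the same span matches entirely.
        if re.fullmatch(ss_pattern, ss[start:end]):
            hits.append((start, m.group(1), ss[start:end]))
    return hits

# Hypothetical data: H = helix, E = strand, C = coil (assumed coding).
sequence = "MKTGDSLAAGDTLK"
ss       = "CCCHHHHHHCCCEE"

hits = search_motif_with_ss(sequence, ss, "GD[ST]", "H{3}")
print(hits)
```

In this example the sequence motif occurs twice, but only the occurrence that lies entirely in a helix is kept.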

10.

Minimal-risk scoring matrices for sequence analysis.

T D Wu C G Nevill-Manning D L Brutlag 《Journal of computational biology》1999,6(2):219-235

We introduce a minimal-risk method for estimating the frequencies of amino acids at conserved positions in a protein family. Our method, called minimal-risk estimation, finds the optimal weighting between a set of observed amino acid counts and a set of pseudofrequencies, which represent prior information about the frequencies. We compute the optimal weighting by minimizing the expected distance between the estimated frequencies and the true population frequencies, measured by either a squared-error or a relative-entropy metric. Our method accounts for the source of the pseudofrequencies, which arise either from the background distribution of amino acids or from applying a substitution matrix to the observed data. Our frequency estimates therefore depend on the size and composition of the observed data as well as the source of the pseudofrequencies. We convert our frequency estimates into minimal-risk scoring matrices for sequence analysis. A large-scale cross-validation study, involving 48 variants of seven methods, shows that the best performing method is minimal-risk estimation using the squared-error metric. Our method is implemented in the package EMATRIX, which is available on the Internet at http://motif.stanford.edu/ematrix.
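The abstract does not give the estimator's formula, so the sketch below only shows the general shape of weighting observed counts against pseudofrequencies: a convex combination controlled by a weight. The weight is supplied by the caller here; choosing it by minimizing expected risk, which is the paper's contribution, is not implemented. The function names and all numbers are assumptions made for illustration.

```python
def mix_frequencies(counts: dict[str, int],
                    pseudofreqs: dict[str, float],
                    weight: float) -> dict[str, float]:
    """Blend observed frequencies with pseudofrequencies.

    estimate = (1 - weight) * observed_frequency + weight * pseudofrequency
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one observation is required")
    residues = set(counts) | set(pseudofreqs)
    return {
        aa: (1.0 - weight) * counts.get(aa, 0) / total
            + weight * pseudofreqs.get(aa, 0.0)
        for aa in residues
    }

def squared_error(p: dict[str, float], q: dict[str, float]) -> float:
    """Squared-error distance between two frequency vectors."""
    return sum((p.get(aa, 0.0) - q.get(aa, 0.0)) ** 2 for aa in set(p) | set(q))

# Hypothetical column: three A and one G observed; S comes only from the prior.
counts = {"A": 3, "G": 1}
pseudofreqs = {"A": 0.25, "G": 0.25, "S": 0.5}

estimate = mix_frequencies(counts, pseudofreqs, weight=0.2)
print(estimate)
```

With few observations a larger weight on the pseudofrequencies is generally appropriate, and with many observations a smaller one, which is consistent with the abstract's statement that the estimates depend on the size and composition of the observed data.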

11.

An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments (total citations: 19; self-citations: 0; citations by others: 19)

Liu XS Brutlag DL Liu JS 《Nature biotechnology》2002,20(8):835-839

12.

Identifying property based sequence motifs in protein families and superfamilies: application to DNase-1 related endonucleases

Mathura VS Schein CH Braun W 《Bioinformatics (Oxford, England)》2003,19(11):1381-1390

MOTIVATION: Identification of short conserved sequence motifs common to a protein family or superfamily can be more useful than overall sequence similarity in suggesting the function of novel gene products. Locating motifs still requires expert knowledge, as automated methods using stringent criteria may not differentiate subtle similarities from statistical noise. RESULTS: We have developed a novel automatic method, based on patterns of conservation of 237 physical-chemical properties of amino acids in aligned protein sequences, to find related motifs in proteins with little or no overall sequence similarity. As an application, our web-server MASIA identified 12 property-based motifs in the apurinic/apyrimidinic endonuclease (APE) family of DNA-repair enzymes of the DNase-I superfamily. Searching with these motifs located distantly related representatives of the DNase-I superfamily, such as Inositol 5'-polyphosphate phosphatases in the ASTRAL40 database, using a Bayesian scoring function. Other proteins containing APE motifs had no overall sequence or structural similarity. However, all were phosphatases and/or had a metal ion binding active site. Thus our automated method can identify discrete elements in distantly related proteins that define local structure and aspects of function. We anticipate that our method will complement existing ones to functionally annotate novel protein sequences from genomic projects. AVAILABILITY: MASIA WEB site: http://www.scsb.utmb.edu/masia/masia.html SUPPLEMENTARY INFORMATION: The dendrogram of 42 APE sequences used to derive motifs is available on http://www.scsb.utmb.edu/comp_biol.html/DNA_repair/publication.html

13.

A generic motif discovery algorithm for sequential data

Jensen KL Styczynski MP Rigoutsos I Stephanopoulos GN 《Bioinformatics (Oxford, England)》2006,22(1):21-28

MOTIVATION: Motif discovery in sequential data is a problem of great interest and with many applications. However, previous methods have been unable to combine exhaustive search with complex motif representations and are each typically only applicable to a certain class of problems. RESULTS: Here we present a generic motif discovery algorithm (Gemoda) for sequential data. Gemoda can be applied to any dataset with a sequential character, including both categorical and real-valued data. As we show, Gemoda deterministically discovers motifs that are maximal in composition and length. As well, the algorithm allows any choice of similarity metric for finding motifs. Finally, Gemoda's output motifs are representation-agnostic: they can be represented using regular expressions, position weight matrices or any number of other models for any type of sequential data. We demonstrate a number of applications of the algorithm, including the discovery of motifs in amino acids sequences, a new solution to the (l,d)-motif problem in DNA sequences and the discovery of conserved protein substructures. AVAILABILITY: Gemoda is freely available at http://web.mit.edu/bamel/gemoda

14.

POAVIZ: a Partial order multiple sequence alignment visualizer

Grasso C Quist M Ke K Lee C 《Bioinformatics (Oxford, England)》2003,19(11):1446-1448

SUMMARY: POAVIZ creates a visualization of a multiple sequence alignment that makes clear the overall structure of how sequences match and diverge in the alignment. POAVIZ can construct visualizations from any multiple sequence alignment source (e.g. PIR and CLUSTAL formats), and is valuable for revealing complex branching structure (such as domains, large-scale insertions/deletions or recombinations), especially in partnership with the Partial Order Alignment (POA) multiple sequence alignment program. AVAILABILITY: The Partial Order multiple sequence Alignment Visualizer (POAVIZ) program is available at http://www.bioinformatics.ucla.edu/poa

15.

Structure-aided prediction of mammalian transcription factor complexes in conserved non-coding elements

Harendra Guturu Andrew C. Doxey Aaron M. Wenger Gill Bejerano 《Philosophical transactions of the Royal Society of London. Series B, Biological sciences》2013,368(1632)

16.

ParSeq: searching motifs with structural and biochemical properties

Schmollinger M Fischer I Nerz C Pinkenburg S Götz F Kaufmann M Lange KJ Reuter R Rosenstiel W Zell A 《Bioinformatics (Oxford, England)》2004,20(9):1459-1461

17.

Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas

Anton I. Petrov Craig L. Zirbel Neocles B. Leontis 《RNA (New York, N.Y.)》2013,19(10):1327-1340

The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson–Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access.

18.

Protein multiple alignment incorporating primary and secondary structure information.

Nak-Kyeong Kim Jun Xie 《Journal of computational biology》2006,13(10):1735-1748

Identifying common local segments, also called motifs, in multiple protein sequences plays an important role for establishing homology between proteins. Homology is easy to establish when sequences are similar (sharing an identity > 25%). However, for distant proteins, it is much more difficult to align motifs that are not similar in sequences but still share common structures or functions. This paper is a first attempt to align multiple protein sequences using both primary and secondary structure information. A new sequence model is proposed so that the model assigns high probabilities not only to motifs that contain conserved amino acids but also to motifs that present common secondary structures. The proposed method is tested in a structural alignment database BAliBASE. We show that information brought by the predicted secondary structures greatly improves motif identification. A website of this program is available at www.stat.purdue.edu/~junxie/2ndmodel/sov.html.

19.

qPMS7: A Fast Algorithm for Finding (ℓ, d)-Motifs in DNA and Protein Sequences

H Dinh S Rajasekaran J Davila 《PloS one》2012,7(7):e41425

20.

Dragon Promoter Mapper (DPM): a Bayesian framework for modelling promoter structures

Chowdhary R Tan SL Ali RA Boerlage B Wong L Bajic VB 《Bioinformatics (Oxford, England)》2006,22(18):2310-2312

SUMMARY: Dragon Promoter Mapper (DPM) is a tool to model the promoter structure of co-regulated genes using the methodology of Bayesian networks. DPM exploits an exhaustive set of motif features (such as the motif, its strand, the order of motif occurrence and the mutual distance between adjacent motifs) and generates models from the target promoter sequences, which may be used to (1) detect regions in a genomic sequence which are similar to the target promoters or (2) classify other promoters as similar or not to the target promoter group. DPM can also be used for modelling of enhancers and silencers. AVAILABILITY: http://defiant.i2r.a-star.edu.sg/projects/BayesPromoter/ CONTACT: vlad@sanbi.ac.za SUPPLEMENTARY INFORMATION: Manual for using DPM web server is provided at http://defiant.i2r.a-star.edu.sg/projects/BayesPromoter/html/manual/manual.htm.