1.

Distill: a suite of web servers for the prediction of one-, two- and three-dimensional structural features of proteins

Davide Baú Alberto JM Martin Catherine Mooney Alessandro Vullo Ian Walsh Gianluca Pollastri 《BMC bioinformatics》2006,7(1):402

Background

We describe Distill, a suite of servers for the prediction of protein structural features: secondary structure; relative solvent accessibility; contact density; backbone structural motifs; residue contact maps at 6, 8 and 12 Angstrom; coarse protein topology. The servers are based on large-scale ensembles of recursive neural networks and trained on large, up-to-date, non-redundant subsets of the Protein Data Bank. Together with structural feature predictions, Distill includes a server for prediction of Cα traces for short proteins (up to 200 amino acids).

2.

Ab initio and template-based prediction of multi-class distance maps by two-dimensional recursive neural networks

Ian Walsh Davide Baù Alberto JM Martin Catherine Mooney Alessandro Vullo Gianluca Pollastri 《BMC structural biology》2009,9(1):5-20

Background

Prediction of protein structures from their sequences is still one of the open grand challenges of computational biology. Some approaches to protein structure prediction, especially ab initio ones, rely to some extent on the prediction of residue contact maps. Residue contact map predictions have been assessed at the CASP competition for several years now. Although it has been shown that exact contact maps generally yield correct three-dimensional structures, this is true only at a relatively low resolution (3–4 Å from the native structure). Another known weakness of contact maps is that they are generally predicted ab initio, that is, without exploiting information about potential homologues of known structure.

Results

We introduce a new class of distance restraints for protein structures: multi-class distance maps. We show that Cα trace reconstructions based on 4-class native maps are significantly better than those from residue contact maps. We then build two predictors of 4-class maps based on recursive neural networks: one ab initio, relying on the sequence and on evolutionary information; one template-based, in which homology information to known structures is provided as a further input. We show that virtually any level of sequence similarity to structural templates (down to less than 10%) yields more accurate 4-class maps than the ab initio predictor. We show that template-based predictions by recursive neural networks are consistently better than the best template and than a number of combinations of the best available templates. We also extract binary residue contact maps at an 8 Å threshold (as per CASP assessment) from the 4-class predictors and show that the template-based version is also more accurate than the best template and consistently better than the ab initio one, down to very low levels of sequence identity to structural templates. Furthermore, we test both ab initio and template-based 8 Å predictions on the CASP7 targets using a pre-CASP7 PDB, and find that both predictors are state-of-the-art, with the template-based one far outperforming the best CASP7 systems if templates with sequence identity to the query of 10% or better are available. Although this is not the main focus of this paper, we also report on reconstructions of Cα traces based on both ab initio and template-based 4-class map predictions, showing that the latter are generally more accurate even when homology is dubious.

Conclusion

Accurate predictions of multi-class maps may provide valuable constraints for improved ab initio and template-based prediction of protein structures, naturally incorporate multiple templates, and yield state-of-the-art binary maps. Predictions of protein structures and 8 Å contact maps based on the multi-class distance map predictors described in this paper are freely available to academic users at http://distill.ucd.ie/.
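The relationship between a distance map, a binary contact map at an 8 Å threshold, and a 4-class distance map can be illustrated with a minimal sketch. The class boundaries in `BOUNDS` are an illustrative assumption (the abstract states only that there are four classes), the coordinates are made up, and the function names are hypothetical; this is not the Distill predictor, only the map definitions applied to known coordinates.

```python
import math
from bisect import bisect_right

# Hypothetical class boundaries in Angstrom; the abstract does not give them.
BOUNDS = (8.0, 13.0, 19.0)

def distance_map(ca_coords):
    """Pairwise Euclidean distances between C-alpha coordinates."""
    return [[math.dist(a, b) for b in ca_coords] for a in ca_coords]

def contact_map(dmap, threshold=8.0):
    """Binary map: 1 where two residues are closer than the threshold."""
    return [[1 if d < threshold else 0 for d in row] for row in dmap]

def multiclass_map(dmap, bounds=BOUNDS):
    """Map each distance to a class index 0..len(bounds)."""
    return [[bisect_right(bounds, d) for d in row] for row in dmap]

# Toy coordinates on a line (made-up data, not a real protein).
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (25.0, 0.0, 0.0)]
dmap = distance_map(coords)
binary = contact_map(dmap)
classes = multiclass_map(dmap)
```

With these boundaries, class 0 coincides with the 8 Å contacts, which is how a binary map can be read off a multi-class one.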

3.

Better prediction of protein contact number using a support vector regression analysis of amino acid sequence

Zheng Yuan 《BMC bioinformatics》2005,6(1):248

Background

Protein tertiary structure can be partly characterized via each amino acid's contact number measuring how residues are spatially arranged. The contact number of a residue in a folded protein is a measure of its exposure to the local environment, and is defined as the number of Cβ atoms in other residues within a sphere around the Cβ atom of the residue of interest. Contact number is partly conserved between protein folds and thus is useful for protein fold and structure prediction. In turn, each residue's contact number can be partially predicted from primary amino acid sequence, assisting tertiary fold analysis from sequence data. In this study, we provide a more accurate contact number prediction method from protein primary sequence.
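The contact number definition above can be computed directly from known coordinates. The following is a minimal sketch under stated assumptions: the sphere radius is a free parameter (the abstract does not give a value, so the default of 10 Å is hypothetical), the coordinates are made up, and `contact_numbers` is an illustrative name. It computes the quantity from a structure; it is not the support vector regression predictor of the paper.

```python
import math

def contact_numbers(cb_coords, radius=10.0):
    """For each residue, count the C-beta atoms of other residues
    lying within `radius` of its own C-beta atom."""
    counts = []
    for i, a in enumerate(cb_coords):
        n = 0
        for j, b in enumerate(cb_coords):
            if i != j and math.dist(a, b) <= radius:
                n += 1
        counts.append(n)
    return counts

# Toy C-beta coordinates on a line (made-up data, not a real protein).
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
print(contact_numbers(coords, radius=5.0))  # [1, 2, 1, 0]
```

The isolated fourth point gets a contact number of zero, matching the reading of contact number as a measure of how buried or exposed a residue is.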

4.

A molecular recombination map of Antirrhinum majus

Zsuzsanna Schwarz-Sommer Thomas Gübitz Julia Weiss Perla Gómez-di-Marco Luciana Delgado-Benarroch Andrew Hudson Marcos Egea-Cortines 《BMC plant biology》2010,10(1):275

Background

Genetic recombination maps provide important frameworks for comparative genomics, identifying gene functions, assembling genome sequences and for breeding. The molecular recombination map currently available for the model eudicot Antirrhinum majus is the result of a cross with Antirrhinum molle, limiting its usefulness within A. majus.

5.

Exploiting structural and topological information to improve prediction of RNA-protein binding sites

Stefan R Maetschke Zheng Yuan 《BMC bioinformatics》2009,10(1):341

Background

RNA-protein interactions are important for a wide range of biological processes. Current computational methods to predict interacting residues in RNA-protein interfaces predominantly rely on sequence data. It is, however, known that interface residue propensity is closely correlated with structural properties. In this paper we systematically study information obtained from sequences and structures and compare their contributions in this prediction problem. In particular, different geometrical and network topological properties of protein structures are evaluated to improve interface residue prediction accuracy.

6.

PhyME: A probabilistic algorithm for finding motifs in sets of orthologous sequences

Saurabh Sinha Mathieu Blanchette Martin Tompa 《BMC bioinformatics》2004,5(1):170

7.

Ab initio modeling of small proteins by iterative TASSER simulations (total citations: 1; self-citations: 0; citations by others: 1)

Sitao Wu Jeffrey Skolnick Yang Zhang 《BMC biology》2007,5(1):17-10

Background

Predicting 3-dimensional protein structures from amino-acid sequences is an important unsolved problem in computational structural biology. The problem becomes relatively easier if close homologous proteins have been solved, as high-resolution models can be built by aligning target sequences to the solved homologous structures. However, for sequences without similar folds in the Protein Data Bank (PDB) library, the models have to be predicted from scratch. Progress in ab initio structure modeling is slow. The aim of this study was to extend the TASSER (threading/assembly/refinement) method to ab initio modeling and examine systematically its ability to fold small single-domain proteins.

8.

Amino acid empirical contact energy definitions for fold recognition in the space of contact maps

Marco Berrera Henriette Molinari Federico Fogolari 《BMC bioinformatics》2003,4(1):8

Background

Contradicting evidence has been presented in the literature concerning the effectiveness of empirical contact energies for fold recognition. Empirical contact energies are calculated on the basis of information available from selected protein structures, with respect to a defined reference state, according to the quasi-chemical approximation. Protein-solvent interactions are estimated from residue solvent accessibility.

9.

Genetic analysis of hybridization and introgression between wild mongoose and brown lemurs

Jennifer Pastorini Alphonse Zaramody Deborah J Curtis Caroline M Nievergelt Nicholas I Mundy 《BMC evolutionary biology》2009,9(1):32

Background

Hybrid zones generally represent areas of secondary contact after speciation. The nature of the interaction between genes of individuals in a hybrid zone is of interest in the study of evolutionary processes. In this study, data from nuclear microsatellites and mitochondrial DNA sequences were used to genetically characterize hybridization between wild mongoose lemurs (Eulemur mongoz) and brown lemurs (E. fulvus) at Anjamena in west Madagascar.

10.

Interference with histidyl-tRNA synthetase by a CRISPR spacer sequence as a factor in the evolution of Pelobacter carbinolicus

Muktak Aklujkar Derek R Lovley 《BMC evolutionary biology》2010,10(1):230

Background

Pelobacter carbinolicus, a bacterium of the family Geobacteraceae, cannot reduce Fe(III) directly or produce electricity like its relatives. How P. carbinolicus evolved is an intriguing problem. The genome of P. carbinolicus contains clustered regularly interspaced short palindromic repeats (CRISPR) separated by unique spacer sequences, which recent studies have shown to produce RNA molecules that interfere with genes containing identical sequences.

11.

Subfamily specific conservation profiles for proteins based on n-gram patterns

John K Vries Xiong Liu 《BMC bioinformatics》2008,9(1):72

Background

A new algorithm has been developed for generating conservation profiles that reflect the evolutionary history of the subfamily associated with a query sequence. It is based on n-gram patterns (NP{n,m}) which are sets of n residues and m wildcards in windows of size n+m. The generation of conservation profiles is treated as a signal-to-noise problem where the signal is the count of n-gram patterns in target sequences that are similar to the query sequence and the noise is the count over all target sequences. The signal is differentiated from the noise by applying singular value decomposition to sets of target sequences rank ordered by similarity with respect to the query.
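To make the NP{n,m} notation concrete, the sketch below enumerates and counts patterns with n residues and m wildcards in every window of size n+m. It assumes that wildcards may occupy any position in the window, which the abstract does not specify, and `ngram_patterns` is a hypothetical name. It covers only the pattern-counting step, not the singular value decomposition or the ranking of target sequences.

```python
from collections import Counter
from itertools import combinations

def ngram_patterns(seq, n, m, wildcard="."):
    """Count NP{n,m}-style patterns: each window of size n+m with
    m positions replaced by a wildcard (assumed: any positions)."""
    width = n + m
    counts = Counter()
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        for positions in combinations(range(width), m):
            masked = set(positions)
            pattern = "".join(
                wildcard if k in masked else residue
                for k, residue in enumerate(window)
            )
            counts[pattern] += 1
    return counts

# Toy sequence (made up): two windows of size 3, three patterns each.
counts = ngram_patterns("ACDA", n=2, m=1)
print(sorted(counts))  # ['.CD', '.DA', 'A.D', 'AC.', 'C.A', 'CD.']
```

Setting m=0 reduces this to plain n-gram counting, for example `ngram_patterns("AAAA", 2, 0)` counts the 2-gram "AA" three times.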

12.

EXMOTIF: efficient structured motif extraction

Yongqiang Zhang Mohammed J Zaki 《Algorithms for molecular biology : AMB》2006,1(1):21-18

Background

Extracting motifs from sequences is a mainstay of bioinformatics. We look at the problem of mining structured motifs, which allow variable length gaps between simple motif components. We propose an efficient algorithm, called EXMOTIF, that, given one or more sequences and a structured motif template, extracts all frequent structured motifs that have quorum q. Potential applications of our method include the extraction of single/composite regulatory binding sites in DNA sequences.

13.

Transmission ratio distortion results in asymmetric introgression in Louisiana Iris

Shunxue Tang Rebecca A Okashah Steven J Knapp Michael L Arnold Noland H Martin 《BMC plant biology》2010,10(1):48

Background

Linkage maps are useful tools for examining both the genetic architecture of quantitative traits and the evolution of reproductive incompatibilities. We describe the generation of two genetic maps using reciprocal interspecific backcross 1 (BC₁) mapping populations from crosses between Iris brevicaulis and Iris fulva. These maps were constructed using expressed sequence tag (EST)-derived codominant microsatellite markers. Such a codominant marker system made it possible to link the two reciprocal maps and compare the patterns of transmission ratio distortion observed between them.

14.

The contrasting properties of conservation and correlated phylogeny in protein functional residue prediction

Jonathan R Manning Emily R Jefferson Geoffrey J Barton 《BMC bioinformatics》2008,9(1):51

Background

Amino acids responsible for structure, core function or specificity may be inferred from multiple protein sequence alignments where a limited set of residue types are tolerated. The rise in available protein sequences continues to increase the power of techniques based on this principle.

15.

A SSR-based composite genetic linkage map for the cultivated peanut (Arachis hypogaea L.) genome

Yanbin Hong Xiaoping Chen Xuanqiang Liang Haiyan Liu Guiyuan Zhou Shaoxiong Li Shijie Wen C Corley Holbrook Baozhu Guo 《BMC plant biology》2010,10(1):17

Background

The construction of genetic linkage maps for cultivated peanut (Arachis hypogaea L.) has been and continues to be an important research goal to facilitate quantitative trait locus (QTL) analysis and gene tagging for use in marker-assisted selection in breeding. Even though a few maps have been developed, they were constructed using diploid or interspecific tetraploid populations. The most recently published intra-specific map was constructed from a cross of cultivated peanuts, in which only 135 simple sequence repeat (SSR) markers were sparsely distributed across 22 linkage groups. A more detailed linkage map with sufficient markers is necessary for QTL identification and marker-assisted selection. The objective of this study was to construct a genetic linkage map of cultivated peanut using SSR markers derived primarily from peanut genomic sequences, expressed sequence tags (ESTs), and by "data mining" sequences released in GenBank.

16.

SMOTIF: efficient structured pattern and profile motif search

Yongqiang Zhang Mohammed J Zaki 《Algorithms for molecular biology : AMB》2006,1(1):22-24

Background

A structured motif allows variable length gaps between several components, where each component is a simple motif, which allows either no gaps or only fixed length gaps. The motif can either be represented as a pattern or a profile (also called positional weight matrix). We propose an efficient algorithm, called SMOTIF, to solve the structured motif search problem, i.e., given one or more sequences and a structured motif, SMOTIF searches the sequences for all occurrences of the motif. Potential applications include searching for long terminal repeat (LTR) retrotransposons and composite regulatory binding sites in DNA sequences.
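The structured motif search problem can be illustrated with a naive exhaustive search. This sketch is not the SMOTIF algorithm: it assumes exact-match string components (no profiles, no fixed gaps inside a component), uses a brute-force recursion, and all names and data are hypothetical.

```python
def find_structured_motif(seq, components, gaps):
    """Return every occurrence of a structured motif in `seq`.

    components: list of exact-match strings (simple motifs).
    gaps: list of (min_gap, max_gap) pairs, one between each pair of
          consecutive components.
    Each occurrence is a tuple of component start positions.
    """
    if len(gaps) != len(components) - 1:
        raise ValueError("need one gap range between consecutive components")
    hits = []

    def extend(idx, pos, starts):
        comp = components[idx]
        if not seq.startswith(comp, pos):
            return
        starts = starts + [pos]
        if idx == len(components) - 1:
            hits.append(tuple(starts))
            return
        lo, hi = gaps[idx]
        end = pos + len(comp)
        # Try every allowed gap length before the next component.
        for gap in range(lo, hi + 1):
            extend(idx + 1, end + gap, starts)

    for start in range(len(seq)):
        extend(0, start, [])
    return hits

# Toy DNA sequence (made up): "ACG", then a gap of 0 to 4 bases, then "TAC".
hits = find_structured_motif("ACGTTTACGTAC", ["ACG", "TAC"], [(0, 4)])
print(hits)  # [(0, 5), (6, 9)]
```

Because every allowed gap length is tried, the search reports all occurrences rather than one per start position, at the cost of running time that grows with the gap ranges.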

17.

A comparative map viewer integrating genetic maps for Brassica and Arabidopsis

Geraldine AC Lim Erica G Jewell Xi Li Timothy A Erwin Christopher Love Jacqueline Batley German Spangenberg David Edwards 《BMC plant biology》2007,7(1):40

Background

Molecular genetic maps provide a means to link heritable traits with underlying genome sequence variation. Several genetic maps have been constructed for Brassica species, yet to date, there has been no simple means to compare this information or to associate mapped traits with the genome sequence of the related model plant, Arabidopsis.

18.

Universal partitioning of the hierarchical fold network of 50-residue segments in proteins

Jun-ichi Ito Yuki Sonobe Kazuyoshi Ikeda Kentaro Tomii Junichi Higo 《BMC structural biology》2009,9(1):34

Background

Several studies have demonstrated that protein fold space is structured hierarchically and that power-law statistics are satisfied in the relation between the numbers of protein families and protein folds (or superfamilies). We examined the internal structure and statistics in the fold space of 50 amino-acid residue segments taken from various protein folds. We used inter-residue contact patterns to measure the tertiary structural similarity among segments. Using this similarity measure, the segments were classified into a number (K_c) of clusters. We examined various K_c values for the clustering. The spatial resolution to differentiate the segment tertiary structures increases with increasing K_c. Furthermore, we constructed networks by linking structurally similar clusters.

19.

JACOP: A simple and robust method for the automated classification of protein sequences with modular architecture

Peter Sperisen Marco Pagni 《BMC bioinformatics》2005,6(1):216

Background

Whole-genome sequencing projects are rapidly producing an enormous number of new sequences. Consequently, almost every family of proteins now contains hundreds of members. It has thus become necessary to develop tools that classify protein sequences automatically, quickly and reliably. The difficulty of this task is intimately linked to the mechanism by which protein sequences diverge, i.e. by simultaneous residue substitutions, insertions and/or deletions and whole domain reorganisations (duplications/swapping/fusion).

20.

A simple dependence between protein evolution rate and the number of protein-protein interactions

Hunter B Fraser Dennis P Wall Aaron E Hirsh 《BMC evolutionary biology》2003,3(1):11

Background

It has been shown for an evolutionarily distant genomic comparison that the number of protein-protein interactions a protein has correlates negatively with its rate of evolution. However, the generality of this observation has recently been challenged. Here we examine the problem using protein-protein interaction data from the yeast Saccharomyces cerevisiae and genome sequences from two other yeast species.