Similar articles
 Found 20 similar articles (search time: 0 ms)
1.
Uncertainty and inconsistency of gene structure annotation remain limitations on research in the genome era, frustrating both biologists and bioinformaticians, who have to sort out annotation errors for their genes of interest or generate trustworthy datasets for algorithmic development. It is unrealistic to hope that better software will solve all of these problems in the near future. The issue becomes more urgent as more species are sequenced and analyzed by comparative genomics: erroneous annotations could easily propagate, whereas correct annotations in one species will greatly facilitate annotation of novel genomes. We propose a dynamic, economically feasible solution to the annotation predicament: broad-based, web-technology-enabled community annotation, a prototype of which is now in use for Arabidopsis.

2.
Since the first application of context-free grammars to RNA secondary structures in 1988, many researchers have used both ad hoc and formal methods from computational linguistics to model RNA and protein structure. We show how nearly all of these methods are based on the same core principles and can be converted into equivalent approaches in the framework of tree-adjoining grammars and related formalisms. We also propose some new approaches that extend these core principles in novel ways.

3.
4.
A large amount of raw biological sequence data has been generated by the human genome project and related efforts. Understanding the structural information encoded by biological sequences is important for learning their biochemical functions but remains a fundamental challenge. Recent interest in RNA regulation has resulted in rapid growth in the number of RNA secondary structures deposited in various databases. However, functional classification and characterization of RNA structures have been only partially addressed. This article introduces a novel interval-based distance metric for structure-based RNA function assignment. The characterization of RNA structures relies on distance vectors learned from a collection of predicted structures. The distance measure considers the intersection, disjointness, and inclusion relations between intervals. A set of RNA pseudoknotted structures with known function is used, and the function of a query structure is determined by measuring structure similarity. This not only offers sequence distance criteria to measure the similarity of secondary structures but also aids the functional classification of RNA structures with pseudoknots.
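
The abstract does not give the metric itself, so the following is only a minimal sketch of the underlying idea, assuming each base pair is treated as an interval and a structure is summarized by how many interval pairs are disjoint, nested (inclusion), or crossing (intersection). The function names, the dot-bracket input with square brackets for a pseudoknot layer, and the L1 distance between count vectors are all illustrative assumptions, not the authors' method.

```python
from collections import Counter
from itertools import combinations

def base_pairs(dot_bracket):
    """Return base pairs (i, j) with i < j from a dot-bracket string.

    Assumption: round brackets are the main layer and square brackets
    are a second layer used to write pseudoknots.
    """
    closers = {")": "(", "]": "["}
    stacks = {"(": [], "[": []}
    pairs = []
    for pos, ch in enumerate(dot_bracket):
        if ch in stacks:
            stacks[ch].append(pos)
        elif ch in closers:
            pairs.append((stacks[closers[ch]].pop(), pos))
    return sorted(pairs)

def relation(a, b):
    """Classify two base-pair intervals as disjoint, inclusion, or intersection."""
    (i, j), (k, l) = sorted([a, b])
    if j < k:
        return "disjoint"       # a ends before b starts
    if l < j:
        return "inclusion"      # b lies entirely inside a
    return "intersection"       # the intervals cross (pseudoknot-like)

def relation_profile(dot_bracket):
    """Count each relation type over all pairs of base-pair intervals."""
    counts = Counter(
        relation(a, b) for a, b in combinations(base_pairs(dot_bracket), 2)
    )
    return [counts[r] for r in ("disjoint", "inclusion", "intersection")]

def profile_distance(s1, s2):
    """Hypothetical L1 distance between two relation profiles."""
    return sum(abs(x - y) for x, y in zip(relation_profile(s1), relation_profile(s2)))
```

For example, `relation_profile("((..))")` counts one inclusion, while `relation_profile("([)]")` counts one intersection, which is the crossing pattern characteristic of a pseudoknot.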

5.
6.
Rich information on point mutation studies is scattered across heterogeneous data sources. This paper presents an automated workflow for mining mutation annotations from full-text biomedical literature using natural language processing (NLP) techniques, and for their subsequent reuse in protein structure annotation and visualization. This system, called mSTRAP (Mutation extraction and STRucture Annotation Pipeline), is designed for both information aggregation and subsequent brokerage of the mutation annotations. It facilitates the coordination of semantically related information from a series of text mining and sequence analysis steps into a formal OWL-DL ontology. The ontology is designed to support application-specific data management of sequence, structure, and literature annotations that are populated as instances of object and data type properties. mSTRAPviz is a subsystem that facilitates the brokerage of structure information and the associated mutations for visualization. For mutated sequences without any corresponding structure available in the Protein Data Bank (PDB), an automated pipeline for homology modeling is developed to generate the theoretical model. With mSTRAP, we demonstrate a workable system that can facilitate automation of the workflow for the retrieval, extraction, processing, and visualization of mutation annotations, tasks that are well known to be tedious, time-consuming, complex, and error-prone. The ontology and visualization tool are available at (http://datam.i2r.a-star.edu.sg/mstrap).
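
The abstract does not describe how mutation mentions are recognized, so the sketch below is not mSTRAP's extractor. It only illustrates the first step of such a workflow, assuming a simple regular expression over one-letter amino acid codes in the common wild-type/position/mutant form (for example `A123T`). The pattern and the function name are hypothetical.

```python
import re

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumed mention format: wild-type residue, position, mutant residue.
POINT_MUTATION = re.compile(
    rf"\b([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])\b"
)

def find_point_mutations(text):
    """Return (wild_type, position, mutant) tuples found in free text."""
    return [
        (m.group(1), int(m.group(2)), m.group(3))
        for m in POINT_MUTATION.finditer(text)
    ]
```

A pattern this simple also matches unrelated tokens of the same shape, such as the histone name H2A, so a real pipeline would need to check each hit against the protein sequence before using it.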

7.
Three-dimensional structure determination of macromolecules and macromolecular complexes is an integral part of understanding biological functions. For large protein and macromolecular complexes, structure determination is often performed using electron cryomicroscopy, where projection images of individual macromolecular complexes are combined to produce a three-dimensional reconstruction. Single particle methods have been devised to perform this structure determination for macromolecular complexes with little or no underlying symmetry. These computational methods generally involve an iterative process of aligning unique views of the macromolecular images followed by determination of the angular components that define those views. In this review, this structure determination process is described with the aim of clarifying a seemingly complex structural method.

8.
9.
10.
MOTIVATION: Detecting genes in viral genomes is a complex task. Because they are biologically constrained in length, RNA viruses in particular tend to code in overlapping reading frames. Since one amino acid is encoded by a triplet of nucleotides, up to three genes may be coded for simultaneously in one direction. Conventional hidden Markov model (HMM)-based gene-finding algorithms typically find it difficult to identify multiple coding regions, since in general their topologies do not allow for the presence of overlapping or nested genes. Comparative methods have therefore been restricted to likelihood ratio tests on whether potential regions are double or single coding, using the fact that the constraints imposed on multiple-coding nucleotides will result in atypical sequence evolution. Exploiting these same constraints, we present an HMM-based gene-finding program, which allows for coding in unidirectional nested and overlapping reading frames, to annotate two homologous aligned viral genomes. Our method does not insist on conserved gene structure between the two sequences, thus making it applicable to the pairwise comparison of more distantly related sequences. RESULTS: We apply our method to 15 pairwise alignments of six different HIV2 genomes. Given sufficient evolutionary distance between the two sequences, we achieve sensitivity of approximately 84-89% and specificity of approximately 97-99.9%. We additionally annotate three pairwise alignments of the more distantly related HIV1 and HIV2, as well as of two different hepatitis viruses, attaining results of approximately 87% sensitivity and approximately 98.5% specificity. We subsequently incorporate prior knowledge by 'knowing' the gene structure of one sequence and annotating the other conditional on it. Boosting accuracy to close to perfect, we demonstrate that conservation of gene structure on top of nucleotide sequence is a valuable source of information, especially in distantly related genomes. AVAILABILITY: The Java code is available from the authors.
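
To make the point about three simultaneous forward reading frames concrete, here is a minimal sketch that scans each forward frame for ATG-to-stop open reading frames and reports pairs from different frames that overlap. It is a plain ORF scan, not the HMM described in the abstract; the function names, the `min_codons` threshold, and the example sequence are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def forward_orfs(seq, min_codons=2):
    """Return (frame, start, end) for ATG...stop ORFs in the 3 forward frames.

    `start` is the index of the A in ATG; `end` is exclusive and includes
    the stop codon. `min_codons` counts codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos                      # open a new ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((frame, start, pos + 3))
                start = None                     # close the ORF
    return orfs

def overlapping_pairs(orfs):
    """Return pairs of ORFs from different frames that share nucleotides."""
    return [
        (a, b)
        for idx, a in enumerate(orfs)
        for b in orfs[idx + 1:]
        if a[0] != b[0] and a[1] < b[2] and b[1] < a[2]
    ]
```

On the toy sequence `ATGAAATGCCCTAAGTGA`, frame 0 contains an ORF spanning the whole sequence and frame 2 contains a shorter ORF nested inside it, so the same nucleotides are read in two frames at once.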

11.
12.
13.
14.
15.
An increasing number of ion channel structures are being determined. This generates a need for computational tools to enable functional annotation of channel structures. However, several studies of ion channel and model pores have indicated that the physical dimensions of a pore are not always a reliable indicator of its conductive status. This is due to the unusual behavior of water within nano-confined spaces, resulting in a phenomenon referred to as “hydrophobic gating”. We have recently demonstrated how simulating the behavior of water within an ion channel pore can be used to predict its conductive status. In this addendum to our study, we apply this method to compare the recently solved structure of a mutant of the bestrophin chloride channel BEST1 with that of the wild-type channel. Our results support the hypothesis of a hydrophobic gate within the narrow neck of BEST1. This provides further validation that this simulation approach provides the basis for an accurate and computationally efficient tool for the functional annotation of ion channel structures.

16.
New methods, essentially based on hidden Markov models (HMM) and neural networks (NN), can predict the topography of both beta-barrel and all-alpha membrane proteins with high accuracy and a low rate of false positives and false negatives. These methods have been integrated in a suite of programs to filter proteomes of Gram-negative bacteria, searching for new membrane proteins.

17.
While glycoproteins are abundant in nature, and changes in glycosylation occur in cancer and other diseases, glycoprotein characterization remains a challenge due to the structural complexity of the biopolymers. This paper presents a general strategy, termed GlyDB, for glycan structure annotation of N-linked glycopeptides from tandem mass spectra in the LC-MS analysis of proteolytic digests of glycoproteins. The GlyDB approach takes advantage of low-energy collision-induced dissociation of N-linked glycopeptides, which preferentially cleaves the glycosidic bonds while the peptide backbone remains intact. A theoretical glycan structure database derived from biosynthetic rules for N-linked glycans was constructed employing a novel representation of branched glycan structures consisting of multiple linear sequences. The commonly used peptide identification program, Sequest, could then be utilized to assign experimental tandem mass spectra to individual glycoforms. Analysis of synthetic glycopeptides and well-characterized glycoproteins demonstrates that the GlyDB approach can be a useful tool for annotation of glycan structures and for selection of a limited number of potential glycan structure candidates for targeted validation.

18.
The confinement of macromolecules within enclosures or "pores" of comparable dimensions results in significant size- and shape-dependent alterations of macromolecular chemical potential and reactivity. Calculations of the magnitude of this effect for model particles of different shapes in model enclosures of different shapes were carried out using hard particle partition theory developed by Giddings et al. (J. Phys. Chem. 1968. 72:4397-4408). Results obtained indicate that the equilibrium constants of reactions, such as isomerization, self-association, and site binding, that result in significant change in macromolecular size, shape, and/or mobility may be altered within pores by as much as several orders of magnitude relative to the value in the unbounded or bulk phase. Confinement also produces a substantial size-dependent outward force on the walls of an enclosure. These results are likely to be important within the fluid phase of biological media, such as the cytoplasm of eukaryotic cells, containing significant volume fractions of large fibrous structures (e.g., the cytomatrix).
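
The size dependence can be illustrated with the simplest case of hard-particle partitioning, a hard sphere in an idealized pore. This is only a sketch under that assumption, not a reproduction of the calculations in the abstract, which cover particles of several shapes. The partition coefficient is taken as the fraction of pore volume accessible to the sphere's center: `(1 - r/R)` raised to 1 for a slit of half-width `R`, to 2 for a cylinder of radius `R`, and to 3 for a spherical cavity of radius `R`. Function names are illustrative.

```python
# Exponent of (1 - r/R) for each idealized pore geometry.
PORE_EXPONENT = {"slit": 1, "cylinder": 2, "sphere": 3}

def partition_coefficient(r, R, pore="cylinder"):
    """Fraction of pore volume accessible to the center of a hard sphere.

    r: particle radius; R: slit half-width, cylinder radius, or cavity radius.
    """
    if r >= R:
        return 0.0  # the particle does not fit
    return (1.0 - r / R) ** PORE_EXPONENT[pore]

def isomerization_shift(r_a, r_b, R, pore="cylinder"):
    """Ratio K_pore / K_bulk for A <-> B, modeling both states as hard spheres.

    Follows from a thermodynamic cycle: the shift equals the ratio of the
    partition coefficients of product and reactant.
    """
    return partition_coefficient(r_b, R, pore) / partition_coefficient(r_a, R, pore)
```

For instance, a species of radius 1.0 that compacts to radius 0.5 inside a spherical cavity of radius 2.0 has its isomerization constant raised by a factor of 3.375 in this model, and the factor grows rapidly as the particle size approaches the pore size.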

19.
For lintners with negligible amylose retrogradation, crystallinity was inversely related to starch amylose content and, irrespective of starch source, incomplete removal of amorphous material was shown. The latter was more pronounced for B-type than for A-type starches. The two predominant lintner populations, with modal degrees of polymerization (DP) of 13-15 and 23-27, were best resolved for amylose-deficient and A-type starches. Results indicate a more specific hydrolysis of amorphous lamellae in such starches. Small-angle X-ray scattering showed a more intense 9-nm scattering peak for native amylose-deficient A-type starches than for their regular or B-type analogues. The experimental evidence indicates a lower contrasting density within the "crystalline" shells of the latter starches. A higher density in the amorphous lamellae, envisaged by the lamellar helical model, explains the relative acid resistance of linear amylopectin chains with DP > 20, observed in lintners of B-type starches. Because amylopectin chain length distributions were similar for regular and amylose-deficient starches of the same crystal type, we deduce that the denser (and more ordered) packing of double helices into lamellar structures in amylose-deficient starches is due to a different amylopectin branching pattern.

20.
SUMMARY: An object metamodel based on a standard scientific ontology has been developed and used to generate a CORBA interface, an SQL schema and an XML representation for macromolecular structure (MMS) data. In addition to the interface and schema definitions, the metamodel was also used to generate the core elements of a CORBA reference server and a JDBC database loader. The Java source code which implements this metamodel, the CORBA server, database loader and XML converter along with detailed documentation and code examples are available as part of the OpenMMS toolkit. AVAILABILITY: http://openmms.sdsc.edu CONTACT: dsg@sdsc.edu
