20 similar documents retrieved.
1.
2.
A novel bacterial gene-finding system with improved accuracy in locating start codons. Cited by 3 (self-citations: 0, other citations: 3)
Although a number of bacterial gene-finding programs have been developed, there is still room for improvement, especially in the area of correctly detecting translation start sites. We developed a novel bacterial gene-finding program named GeneHacker Plus. Like many others, it is based on a hidden Markov model (HMM) with duration. However, it is a 'local' model in the sense that the model starts from the translation control region and ends at the stop codon of a coding region. Multiple coding regions are identified as partial paths, like local alignments in the Smith-Waterman algorithm, regardless of how they overlap. Moreover, our semiautomatic procedure for constructing the model of the translation control region allows the inclusion of an additional conserved element as well as the ribosome-binding site. We confirmed that GeneHacker Plus is one of the most accurate programs in terms of both finding potential coding regions and precisely locating translation start sites. GeneHacker Plus is also equipped with an option whereby the results from database homology searches are embedded directly in the HMM. Although this option does not raise the overall predictability, labeled similarity information can be of practical use. GeneHacker Plus can be accessed freely at http://elmo.ims.u-tokyo.ac.jp/GH/.
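As a rough illustration of scoring translation start sites against a model of the translation control region (the paper's semiautomatic model construction is beyond an abstract-level sketch), the code below rates candidate start codons with a toy position weight matrix over a ribosome-binding-site window. All PWM values, offsets, and sequences are invented, not GeneHacker Plus parameters:

```python
# Toy position weight matrix for a 6-nt ribosome-binding-site window
# (log-odds scores; values are illustrative, not GeneHacker Plus parameters).
PWM = [
    {"A": 1.2, "C": -0.8, "G": 0.5, "T": -0.6},
    {"A": 0.9, "C": -0.7, "G": 1.0, "T": -0.9},
    {"A": -0.5, "C": -0.9, "G": 1.6, "T": -0.8},  # G of an AGGAGG-like motif
    {"A": 1.0, "C": -0.6, "G": 0.8, "T": -0.7},
    {"A": -0.4, "C": -0.8, "G": 1.5, "T": -0.9},
    {"A": 0.7, "C": -0.5, "G": 0.9, "T": -0.6},
]

def rbs_score(seq, start_pos, spacer=7, width=6):
    """Score the putative RBS window ending `spacer` nt upstream of a start codon."""
    begin = start_pos - spacer - width
    if begin < 0:
        return float("-inf")
    window = seq[begin:begin + width]
    return sum(PWM[i].get(b, -2.0) for i, b in enumerate(window))

def candidate_starts(seq):
    """Yield (position, codon, rbs_score) for every ATG/GTG/TTG in the sequence."""
    for i in range(len(seq) - 2):
        codon = seq[i:i + 3]
        if codon in ("ATG", "GTG", "TTG"):
            yield i, codon, rbs_score(seq, i)

if __name__ == "__main__":
    dna = "TTTAGGAGGTTACAATGAAACGT"  # toy sequence with an AGGAGG-like RBS
    for pos, codon, score in candidate_starts(dna):
        print(f"start {codon} at {pos}: RBS score {score:.2f}")
```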
3.
Gene recognition by combination of several gene-finding programs. Cited by 7 (self-citations: 1, other citations: 7)
MOTIVATION: A number of programs have been developed to predict eukaryotic gene structures in DNA sequences. However, gene finding is still a challenging problem. RESULTS: We have explored the effectiveness of re-analyzing and combining the results of several gene-finding programs. We studied several methods with four programs (FEXH, GeneParser3, GENSCAN and GRAIL2). With the HIGHEST-policy combination method or the BOUNDARY method, approximate correlation (AC) improved by 3-5% in comparison with the best single gene-finding program. From another viewpoint, the OR-based combination of the four programs is the most reliable way to determine whether a candidate exon overlaps with a real exon, although it is less sensitive than GENSCAN for exon-intron boundaries. Our methods can easily be extended to combine other programs. AVAILABILITY: We have developed a server program (Shirokane System) and a client program (GeneScope) to use the methods. GeneScope is available through a WWW site (http://gf.genome.ad.jp/). CONTACT: katsu,takagi@ims.u-tokyo.ac.jp
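The OR-based combination described can be pictured as taking the union of the exon intervals predicted by the individual programs, flagging any region predicted by at least one of them. A minimal sketch under that reading, with hypothetical intervals:

```python
def or_combine(predictions):
    """Union of predicted exon intervals from several gene finders.

    `predictions` maps program name -> list of (start, end) exon intervals.
    Returns merged intervals flagged by at least one program.
    """
    intervals = sorted(iv for ivs in predictions.values() for iv in ivs)
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

if __name__ == "__main__":
    # Hypothetical exon predictions; not real output of these programs.
    preds = {
        "FEXH": [(100, 250), (400, 520)],
        "GeneParser3": [(110, 260)],
        "GENSCAN": [(95, 255), (700, 810)],
        "GRAIL2": [(405, 515)],
    }
    print(or_combine(preds))  # [(95, 260), (400, 520), (700, 810)]
```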
4.
Francisco M. Ortuño Olga Valenzuela Hector Pomares Fernando Rojas Javier P. Florido Jose M. Urquiza Ignacio Rojas 《Nucleic acids research》2013,41(1):e26
Multiple sequence alignments (MSAs) have become one of the most studied approaches in bioinformatics, underpinning tasks such as structure prediction, biological function analysis and next-generation sequencing. However, current MSA algorithms do not always provide consistent solutions, since alignments become increasingly difficult for sequences of low similarity. These algorithms depend directly on specific features of the sequences, which strongly influence alignment accuracy. Many MSA tools have been designed recently, but it is not possible to know in advance which one is the most suitable for a particular set of sequences. In this work, we analyze some of the most widely used algorithms in the literature and their dependence on several sequence features. A novel intelligent algorithm based on least-squares support vector machines is then developed to predict how accurate each alignment could be, given its analyzed features. The algorithm is evaluated on a dataset of 2180 MSAs. The proposed system first estimates the accuracy of the possible alignments and then selects the most promising methodologies to align each set of sequences. Since only the selected algorithm is run, the computational time is not excessively increased.
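The abstract gives no implementation details, so the following is only a generic sketch of an LS-SVM regressor of the kind described: it solves the standard LS-SVM linear system with an RBF kernel and predicts an accuracy score from per-alignment features. The feature set, targets, and hyperparameters are invented placeholders.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between row-vector sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    """Least-squares SVM regression: solve the standard LS-SVM linear system."""
    n = len(y)
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]          # bias b, dual weights alpha

def lssvm_predict(X_train, b, alpha, X_new, sigma=1.0):
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical per-alignment features (e.g. mean identity, gap fraction, length).
    X = rng.random((40, 3))
    y = 0.6 * X[:, 0] - 0.3 * X[:, 1] + 0.1 * rng.standard_normal(40)  # "accuracy"
    b, alpha = lssvm_fit(X, y)
    print(np.round(lssvm_predict(X, b, alpha, X[:5]), 3))
```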
5.
6.
In this practice, with a family practitioner committee list of 9726 patients, we use a computer register for recall, screening, morbidity data, audit, and repeat prescribing. The computing techniques used to achieve accuracy in maintaining the register are described. After one year of full use the register was validated by using the computer to select a random sample of 200 patients from patients' computer records that had not been updated recently. Two patients were untraceable, and in only 11 records were errors of information found, none of which was important. We think that it is feasible and valuable to have a household index.
7.
MOTIVATION: Computational gene identification plays an important role in genome projects. The approaches used in gene identification programs are often tuned to one particular organism, and accuracy for one organism or class of organism does not necessarily translate to accurate predictions for other organisms. In this paper we evaluate five computer programs on their ability to locate coding regions and to predict gene structure in Neurospora crassa. One of these programs (FFG) was designed specifically for gene-finding in N.crassa, but the model parameters have not yet been fully 'tuned', and the program should thus be viewed as an initial prototype. The other four programs were neither designed nor tuned for N.crassa. RESULTS: We describe the data sets on which the experiments were performed, the approaches employed by the five algorithms (GenScan, HMMGene, GeneMark, Pombe and FFG), the methodology of our evaluation, and the results of the experiments. Our results show that, while none of the programs consistently performs well, overall the GenScan program has the best performance on sensitivity and Missing Exons (ME), while the HMMGene and FFG programs perform well at locating exons approximately. Additional work motivated by this study includes the creation of a tool for the automated evaluation of gene-finding programs, the collection of larger and more reliable data sets for N.crassa, parameterization of the model used in FFG to produce a more accurate gene-finding program for this species, and a more in-depth evaluation of the reasons that existing programs generally fail for N.crassa. AVAILABILITY: Data sets, the FFG program source code, and links to the other programs analyzed are available at http://jerry.cs.uga.edu/~wang/genefind.html. CONTACT: eileen@cs.uga.edu.
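Exon-level evaluation of the kind reported here (sensitivity, Missing Exons) reduces to set operations over annotated and predicted exon coordinates. A minimal sketch with toy intervals, assuming the common convention that a predicted exon counts as correct only when both boundaries match exactly:

```python
def exon_metrics(true_exons, pred_exons):
    """Exon-level sensitivity and Missing Exons (ME) fraction.

    An exon counts as found only if both boundaries match exactly;
    an exon counts as missing if no prediction overlaps it at all.
    """
    found = set(true_exons) & set(pred_exons)
    sensitivity = len(found) / len(true_exons)
    missing = [e for e in true_exons
               if not any(s < e[1] and e[0] < t for s, t in pred_exons)]
    return sensitivity, len(missing) / len(true_exons)

if __name__ == "__main__":
    true_exons = [(10, 80), (150, 240), (300, 390)]
    pred_exons = [(10, 80), (155, 240)]          # toy predictions
    sn, me = exon_metrics(true_exons, pred_exons)
    print(f"exon sensitivity = {sn:.2f}, missing exons = {me:.2f}")
```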
8.
Baldi P Brunak S Chauvin Y Andersen CA Nielsen H 《Bioinformatics (Oxford, England)》2000,16(5):412-424
We provide a unified overview of methods that are currently widely used to assess the accuracy of prediction algorithms, ranging from raw percentages, quadratic error measures and other distances, and correlation coefficients, to information-theoretic measures such as relative entropy and mutual information. We briefly discuss the advantages and disadvantages of each approach. For classification tasks, we derive new learning algorithms for the design of prediction systems by directly optimising the correlation coefficient. We observe and prove several results relating sensitivity and specificity of optimal systems. While the principles are general, we illustrate the applicability on specific problems such as protein secondary structure and signal peptide prediction.
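Among the surveyed measures, the correlation coefficient for binary classification (Matthews) and the sensitivity/specificity pair are simple to state concretely. A minimal sketch with invented confusion-matrix counts:

```python
import math

def matthews_cc(tp, fp, tn, fn):
    """Matthews correlation coefficient from a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def sensitivity(tp, fn):
    return tp / (tp + fn)

def specificity(tn, fp):
    return tn / (tn + fp)

if __name__ == "__main__":
    tp, fp, tn, fn = 70, 10, 90, 30   # invented counts
    print(f"MCC = {matthews_cc(tp, fp, tn, fn):.3f}")
    print(f"sensitivity = {sensitivity(tp, fn):.3f}, "
          f"specificity = {specificity(tn, fp):.3f}")
```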
9.
Stephen H. Bryant 《Proteins》1996,26(2):172-185
Threading experiments with proteins from the globin family provide an indication of the nature of the structural similarity required for successful fold recognition and accurate sequence-structure alignment. Threading scores are found to rise above the noise of false positives whenever roughly 60% of residues from a sequence can be aligned with analogous sites in the structure of a remote homolog. Fold recognition specificity thus appears to be limited by the extent of structural similarity, regardless of the degree of sequence similarity. Threading alignment accuracy is found to depend more critically on the degree of structural similarity. Alignments are accurate, placing the majority of residues exactly as in structural alignment, only when superposition residuals are less than 2.5 Å. These criteria for successful recognition and sequence-structure alignment appear to be consistent with the successes and failures of threading methods in blind structure prediction. They also suggest a direct assay for improved threading methods: Potentials and alignment models should be tested for their ability to detect less extensive structural similarities, and to produce accurate alignments when superposition residuals for this conserved “core” fall in the range characteristic of remote homologs. © 1996 Wiley-Liss, Inc. This article is a US Government work and, as such, is in the public domain in the United States of America.
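The two empirical thresholds reported, roughly 60% of residues alignable for reliable recognition and superposition residuals under 2.5 Å for accurate alignment, can be applied directly as decision rules. The sketch below does just that; the hard cutoffs are a simplification of the reported trends:

```python
def threading_outcome(frac_alignable, rmsd_angstrom):
    """Classify the expected threading outcome using the thresholds
    reported in this study: ~60% alignable residues for recognition,
    superposition residual under 2.5 A for accurate alignment."""
    recognized = frac_alignable >= 0.60
    accurate = recognized and rmsd_angstrom < 2.5
    if accurate:
        return "fold recognized, alignment expected to be accurate"
    if recognized:
        return "fold recognized, alignment likely inaccurate"
    return "score unlikely to rise above false positives"

if __name__ == "__main__":
    print(threading_outcome(0.72, 2.1))
    print(threading_outcome(0.65, 3.4))
    print(threading_outcome(0.45, 2.0))
```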
10.
Ehrig RM Heller MO Kratzenstein S Duda GN Trepczynski A Taylor WR 《Journal of biomechanics》2011,44(7):1400-1404
The determination of an accurate centre of rotation (CoR) from skin markers is essential for the assessment of abnormal gait patterns in clinical gait analysis. Despite the many functional approaches to estimate CoRs, no non-invasive analytical determination of the error in the reconstructed joint location is currently available. The purpose of this study was therefore to verify the residual of the symmetrical centre of rotation estimation (SCoRE) as a reliable indirect measure of the error of the computed joint centre. To evaluate the SCoRE residual, numerical simulations were performed to evaluate CoR estimations at different ranges of joint motion. A statistical model was developed and used to determine the theoretical relationships among the SCoRE residual, the magnitude of the skin marker artefact, the corrections to the marker positions, and the error of the CoR estimations relative to the known centre of rotation. We found that the equation err = 0.5·r_s provides a reliable relationship between the CoR error, err, and the scaled SCoRE residual, r_s, provided that any skin marker artefact is first minimised using the optimal common shape technique (OCST). Measurements on six healthy volunteers showed a reduction of the SCoRE residual from 11 mm to below 6 mm, demonstrating the consistency of the theoretical considerations and numerical simulations with the in vivo data. This study also demonstrates the significant benefit of the OCST for reducing skin marker artefact and thus for predicting the accuracy of determining joint centre positions in functional gait analysis. For the first time, this understanding of the SCoRE residual allows a measure of error in the non-invasive assessment of joint centres. This measure now enables a rapid assessment of the accuracy of the CoR as well as an estimation of the reproducibility and repeatability of skeletal motion patterns.
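The reported relation is simple enough to use directly. A small sketch applying err = 0.5·r_s, with the caveat from the paper that the relation holds only after OCST minimisation of the skin marker artefact:

```python
def cor_error_estimate(score_residual_mm, artefact_minimised=True):
    """Estimate centre-of-rotation error from the scaled SCoRE residual
    via the reported relation err = 0.5 * r_s. The relation presumes
    the skin marker artefact was first reduced with the OCST."""
    if not artefact_minimised:
        raise ValueError("relation assumes OCST-minimised marker artefact")
    return 0.5 * score_residual_mm

if __name__ == "__main__":
    for r_s in (11.0, 6.0):   # residuals before/after OCST, as in the study
        print(f"r_s = {r_s:4.1f} mm -> estimated CoR error ~ "
              f"{cor_error_estimate(r_s):.1f} mm")
```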
11.
The most important kinetic models developed for acetic fermentation were evaluated to study their ability to explain the behavior of the industrial acetification process. Each model was introduced into a simulation environment capable of replicating the conditions of the industrial plant. By comparing the simulation results with an average sequence calculated from the industrial data, we show that these models are not suitable for predicting the evolution of the industrial fermentation. Therefore, a new kinetic model for industrial acetic fermentation was developed. The kinetic parameters of the model were optimized by a specifically designed genetic algorithm; only the representative sequence of industrial acetic acid concentrations was required. The main novelty of the algorithm is a desirability function composed of four criteria, which serves as the response to maximize. The new model is capable of explaining the behavior of the industrial process, and its predictive ability has been compared with that of the other models studied.
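The abstract names a four-part desirability function as the GA response but does not define the criteria, so the sketch below shows only the generic construction under that reading: rescale each criterion to [0, 1] and combine by geometric mean, which the genetic algorithm would maximise. All criterion names and ranges are placeholders.

```python
import math

def desirability(value, low, high):
    """Map a raw criterion onto [0, 1]; 0 at `low` or worse, 1 at `high` or better."""
    return min(1.0, max(0.0, (value - low) / (high - low)))

def composite_desirability(criteria):
    """Geometric mean of individual desirabilities: one poor criterion
    (d = 0) zeroes the response, so the GA cannot trade it away."""
    ds = [desirability(v, lo, hi) for v, lo, hi in criteria]
    if min(ds) == 0.0:
        return 0.0
    return math.exp(sum(math.log(d) for d in ds) / len(ds))

if __name__ == "__main__":
    # Placeholder criteria for a candidate parameter set: (value, low, high).
    fit_to_acid_curve = (0.92, 0.5, 1.0)     # e.g. 1 - normalised residual
    biomass_plausibility = (0.80, 0.0, 1.0)
    ethanol_balance = (0.75, 0.0, 1.0)
    smoothness = (0.60, 0.0, 1.0)
    score = composite_desirability(
        [fit_to_acid_curve, biomass_plausibility, ethanol_balance, smoothness])
    print(f"GA fitness (composite desirability) = {score:.3f}")
```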
12.
Johanna Velásquez Juliana López-Angarita Juan A. Sánchez 《Biodiversity and Conservation》2011,20(14):3591-3603
Localized and global impacts are responsible for driving the current decline in coral reef ecosystems. The worldwide debate over the efficacy of Marine Protected Areas (MPAs) as a conservation measure for coral reefs highlights the importance of acquiring accurate indicators of reef resilience and recovery under stressful conditions. Marine benthic foraminifera, considered outstanding indicators of environmental change, are unicellular eukaryotes that inhabit sandy sediments in coral reefs. There are three kinds of benthic foraminifera: symbiotic, opportunistic, and other small heterotrophic taxa. Symbiosis with microalgae is not favorable under high-nutrient conditions. This study compared the FORAM Index (FI) across sites inside and outside MPAs; the index was also compared with coral and algae cover. High FI values are characteristic of healthy oligotrophic reefs, whereas low values represent eutrophicated ecosystems. A community structure analysis was performed to determine the compositional change of functional groups and thus to test the index's efficiency and performance spatially. In general, MPA sites presented lower index values than non-MPA sites, probably due to the higher impact of tourism and agriculture in these areas. On the other hand, the index was not correlated with either coral or algae cover, even though positive and negative trends were found. Assemblage analyses corroborated that the high susceptibility of symbiotic foraminifera is the main source of variation in the index, independent of the substrate type from which samples were taken. Our results show both the efficiency of this index and the importance of its application for evaluating conservation strategies.
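The FORAM Index referenced here is, in its standard formulation (Hallock et al. 2003, an assumption on our part since the abstract does not restate it), a weighted sum of the proportions of the three functional groups. A minimal sketch with invented counts:

```python
def foram_index(n_symbiotic, n_opportunistic, n_heterotrophic):
    """FORAM Index: FI = 10*Ps + Po + 2*Ph, where Ps, Po and Ph are the
    proportions of the three functional groups in the total assemblage."""
    total = n_symbiotic + n_opportunistic + n_heterotrophic
    ps, po, ph = (n / total for n in (n_symbiotic, n_opportunistic, n_heterotrophic))
    return 10 * ps + po + 2 * ph

if __name__ == "__main__":
    # Invented counts from two hypothetical sediment samples:
    # high FI suggests a healthy oligotrophic reef, low FI a stressed one.
    print(f"reef A: FI = {foram_index(120, 30, 50):.2f}")   # symbiont-rich
    print(f"reef B: FI = {foram_index(15, 110, 75):.2f}")   # opportunist-rich
```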
13.
14.
Conservation planning requires knowledge of the distribution of all species in the area of interest. Surrogates for biodiversity are considered as a possible solution; the two major types are biological and environmental surrogates. Here, we evaluate four different methods of hierarchical clustering, as well as one non-hierarchical method, in the context of producing surrogates for biodiversity. Each clustering method was used to produce maps of both surrogate types. We evaluated the representativeness of each clustering method by finding the average number of species represented in a set of sites, one site from each domain, using a Monte Carlo permutation procedure. We also propose an additional measure of surrogate performance: the degree of evenness of the different domains, calculated, for example, with Simpson's diversity index. Surrogates with low evenness leave little flexibility in site selection, since some of the domains may be represented by a single site or very few sites; surrogate maps with a high Simpson's index value may therefore be more relevant for actual decision making. We found that there is a trade-off between species representativeness and evenness. Centroid clustering represented the most species but had very low evenness values. Ward's method of minimum variance represented more species than a random choice and had high evenness values. Using the typical evaluation measures, the Centroid clustering method was most efficient for surrogate production. However, when Simpson's index is also considered, Ward's method of minimum variance is more appropriate for managers.
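The proposed evenness measure is directly computable: Simpson's diversity index over the share of sites (or area) falling in each clustering domain. A minimal sketch comparing an even and an uneven domain map (counts invented):

```python
def simpsons_index(counts):
    """Simpson's diversity index D = 1 - sum(p_i^2) over domain proportions;
    higher values mean the domains are more evenly represented."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

if __name__ == "__main__":
    even_domains = [25, 25, 25, 25]       # sites per domain, invented
    uneven_domains = [85, 10, 3, 2]
    print(f"even map:   D = {simpsons_index(even_domains):.3f}")
    print(f"uneven map: D = {simpsons_index(uneven_domains):.3f}")
```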
15.
16.
D Farina R Colombo R Merletti H B Olsen 《Journal of electromyography and kinesiology》2001,11(3):175-187
We propose and test a tool to evaluate and compare EMG signal decomposition algorithms. A previously described model for the generation of synthetic intra-muscular EMG signals has been used to obtain reference decomposition results. In order to evaluate the performance of decomposition algorithms, it is necessary to define indexes that give a compact but complete indication of the quality of the decomposition. The indexes given by traditional detection theory are adapted in this paper to the multi-class EMG problem, and indexes related to model parameters are also introduced. In this way it is possible to compare the sensitivity of an algorithm to different signal features. An example application of the technique is presented by comparing the results obtained from a set of synthetic signals decomposed, using two different algorithms, by expert operators who had no information about the signal features. The technique appears appropriate for evaluating decomposition performance and constitutes a useful tool for EMG signal researchers to identify the algorithm most appropriate for their needs.
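Adapting detection-theory indexes to the multi-class EMG problem amounts to scoring each motor unit's detected firing train against the reference train. The sketch below computes per-unit sensitivity and precision with a matching tolerance; firing times and tolerance are invented, and this is only one plausible reading of the indexes described:

```python
def match_firings(reference, detected, tol=0.5):
    """Per-unit sensitivity and precision, matching each detected firing
    to at most one reference firing within `tol` (ms)."""
    detected = sorted(detected)
    used = [False] * len(detected)
    tp = 0
    for r in sorted(reference):
        for i, d in enumerate(detected):
            if not used[i] and abs(d - r) <= tol:
                used[i] = True
                tp += 1
                break
    sens = tp / len(reference) if reference else 0.0
    prec = tp / len(detected) if detected else 0.0
    return sens, prec

if __name__ == "__main__":
    # Invented firing times (ms) for two motor units.
    ref = {"MU1": [10, 110, 205, 300], "MU2": [55, 160, 270]}
    det = {"MU1": [10.2, 110.1, 298.9], "MU2": [55.4, 160.2, 230.0, 270.3]}
    for unit in ref:
        s, p = match_firings(ref[unit], det[unit])
        print(f"{unit}: sensitivity = {s:.2f}, precision = {p:.2f}")
```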
17.
Narasimha R Aganj I Bennett AE Borgnia MJ Zabransky D Sapiro G McLaughlin SW Milne JL Subramaniam S 《Journal of structural biology》2008,164(1):7-17
Tomograms of biological specimens derived using transmission electron microscopy can be intrinsically noisy due to the use of low electron doses, the presence of a "missing wedge" in most data collection schemes, and inaccuracies arising during 3D volume reconstruction. Before tomograms can be interpreted reliably, for example, by 3D segmentation, it is essential that the data be suitably denoised using procedures that can be individually optimized for specific data sets. Here, we implement a systematic procedure to compare various nonlinear denoising techniques on tomograms recorded at room temperature and at cryogenic temperatures, and establish quantitative criteria to select a denoising approach that is most relevant for a given tomogram. We demonstrate that using an appropriate denoising algorithm facilitates robust segmentation of tomograms of HIV-infected macrophages and Bdellovibrio bacteria obtained from specimens at room and cryogenic temperatures, respectively. We validate this strategy of automated segmentation of optimally denoised tomograms by comparing its performance with manual extraction of key features from the same tomograms.
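A quantitative selection criterion of the kind described can be illustrated on synthetic data with known ground truth: apply candidate filters and rank them by error against the clean volume. The filters and metric below are generic stand-ins, not the specific nonlinear denoising techniques compared in the paper:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, median_filter

def rmse(a, b):
    """Root-mean-square error between two volumes."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic phantom: a bright cube inside a 32^3 volume, plus noise.
    clean = np.zeros((32, 32, 32))
    clean[10:22, 10:22, 10:22] = 1.0
    noisy = clean + 0.4 * rng.standard_normal(clean.shape)

    candidates = {
        "none": noisy,
        "gaussian(sigma=1)": gaussian_filter(noisy, sigma=1.0),
        "median(size=3)": median_filter(noisy, size=3),
    }
    # Rank candidate denoisers by error against the known ground truth.
    for name, vol in sorted(candidates.items(), key=lambda kv: rmse(kv[1], clean)):
        print(f"{name:>18}: RMSE = {rmse(vol, clean):.3f}")
```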
18.
19.
Latman NS Hans P Nicholson L DeLee Zint S Lewis K Shirey A 《Biomedical instrumentation & technology / Association for the Advancement of Medical Instrumentation》2001,35(4):259-265
The purpose of this study was to examine the accuracy and reliability of a wide range of clinical thermometry instruments and technologies. In a historical sense, the purpose of this study was to determine if the improvements in speed, ease of use, and safety realized in the last 100 years have been offset by a loss of accuracy and/or reliability. In view of current events, the purpose was to determine if the new generation of electronic, digital clinical thermometers could be used to replace the traditional glass/mercury thermometers. Nine clinical thermometers representing electronic, digital oral, and predictive oral; electronic, digital infrared tympanic; and liquid crystal urinary technologies were evaluated. Accuracy was determined by comparing the temperatures obtained from these test instruments with those of the reference, glass/mercury oral thermometer. Reliability was determined by test-retest evaluation. All of the thermometers evaluated were significantly less accurate when compared with the reference thermometer in this study. All of the test instruments significantly underestimated higher temperatures and overestimated lower temperatures. This study indicated that the improvements in safety, speed, and ease of use of the newer clinical thermometers have been offset by a loss in accuracy and reliability. It also indicated that the current generation of electronic, digital clinical thermometers, in general, may not be sufficiently accurate or reliable to replace the traditional glass/mercury thermometers.
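Accuracy against a reference instrument and test-retest reliability reduce to simple paired statistics. A sketch with invented readings (°C), where the negative mean bias at high temperatures mirrors the underestimation the study reports (statistics.correlation requires Python 3.10+):

```python
import statistics

def accuracy_stats(reference, test):
    """Mean bias (test - reference) and its SD over paired readings."""
    diffs = [t - r for r, t in zip(reference, test)]
    return statistics.mean(diffs), statistics.stdev(diffs)

def test_retest_r(first, second):
    """Test-retest reliability as the correlation between repeated readings."""
    return statistics.correlation(first, second)

if __name__ == "__main__":
    # Invented readings: reference glass/mercury vs a digital instrument.
    ref =   [36.8, 37.2, 38.5, 39.1, 36.5, 40.0]
    test1 = [36.9, 37.1, 38.2, 38.7, 36.8, 39.4]
    test2 = [37.0, 37.1, 38.3, 38.6, 36.7, 39.5]
    bias, sd = accuracy_stats(ref, test1)
    print(f"mean bias = {bias:+.2f} C (SD {sd:.2f})")
    print(f"test-retest r = {test_retest_r(test1, test2):.3f}")
```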
20.
We compared two algorithms used to assess the number of forward saccades in a reading task from records of eye movements. In one algorithm, saccades are detected by analysing the velocity of eye movements; the second algorithm uses the third derivative of eye position with respect to time (jerk) to detect saccades. Both algorithms were applied to the same set of data, recorded from 24 subjects reading a German text presented on two different displays. Our subjects read the text at a mean reading speed of 258.5 words/min. Both algorithms were found to produce a similar rate of artefacts in the number of detected saccades (2.5%), provided the detection threshold (velocity or jerk) is set at an appropriate level and the same threshold level is applied to all data. In both algorithms, the rate of artefacts increases with increasing distance of the threshold from its optimum, and inter-individual variation in the rate of artefacts grows more markedly in the jerk-based algorithm. Eye blinks were identified as a major source of artefacts. A remedy is proposed by means of which the rate of artefacts can be reduced.
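Both detection schemes share a common core: differentiate the eye-position signal (once for velocity, three times for jerk) and count threshold crossings, merging crossings that belong to one saccade. The sketch below implements that core on a synthetic reading-like record; the sampling rate, thresholds, and merge window are invented, not the study's values:

```python
import numpy as np

def count_saccades(position, fs, threshold, order=1, min_gap=0.05):
    """Count saccades by thresholding a derivative of eye position:
    order=1 uses velocity, order=3 uses jerk. Supra-threshold samples
    closer together than `min_gap` seconds are merged into one event,
    a simple guard against counting a single saccade more than once."""
    signal = np.asarray(position, dtype=float)
    for _ in range(order):
        signal = np.gradient(signal, 1.0 / fs)
    above = np.flatnonzero(np.abs(signal) > threshold)
    if above.size == 0:
        return 0
    return 1 + int((np.diff(above) > min_gap * fs).sum())

if __name__ == "__main__":
    fs = 500.0                              # sampling rate in Hz (invented)
    t = np.arange(0.0, 2.0, 1.0 / fs)
    pos = 0.5 * t                           # slow drift, degrees
    for onset in (0.4, 1.0, 1.6):           # three step-like saccades
        pos = pos + 2.0 / (1.0 + np.exp(-(t - onset) * 100.0))
    n_vel = count_saccades(pos, fs, threshold=30.0, order=1)    # deg/s
    n_jerk = count_saccades(pos, fs, threshold=3.0e4, order=3)  # deg/s^3
    print(f"velocity-based: {n_vel} saccades, jerk-based: {n_jerk} saccades")
```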