Similar articles
1.
Gene recognition by combination of several gene-finding programs   Total citations: 8, self-citations: 1, citations by others: 7
MOTIVATION: A number of programs have been developed to predict the eukaryotic gene structures in DNA sequences. However, gene finding is still a challenging problem. RESULTS: We have explored the effectiveness of reanalyzing and combining the results of several gene-finding programs. We studied several methods with four programs (FEXH, GeneParser3, GENSCAN and GRAIL2). With the HIGHEST-policy combination method or the BOUNDARY method, approximate correlation (AC) improved by 3-5% in comparison with the best single gene-finding program. From another viewpoint, OR-based combination of the four programs is the most reliable way to determine whether a candidate exon overlaps with the real exon or not, although it is less sensitive than GENSCAN for exon-intron boundaries. Our methods can easily be extended to combine other programs. AVAILABILITY: We have developed a server program (Shirokane System) and a client program (GeneScope) to use the methods. GeneScope is available through a WWW site (http://gf.genome.ad.jp/). CONTACT: katsu,takagi@ims.u-tokyo.ac.jp
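
The OR-based combination mentioned in this abstract can be illustrated with a minimal sketch. It assumes that "OR" means a nucleotide counts as exonic whenever at least one program predicts it, and that each program reports exons as inclusive `(start, end)` coordinates; the abstract does not spell out either detail, and the function name `or_combine` is hypothetical.

```python
def or_combine(predictions):
    """Union of exon intervals predicted by any program (nucleotide-level OR).

    `predictions` is a list with one entry per program; each entry is a list
    of inclusive (start, end) exon coordinates. This data layout is an
    assumption made for illustration, not the format used by the paper.
    """
    # Pool the intervals from every program and sort them by start position.
    intervals = sorted(iv for program in predictions for iv in program)
    merged = []
    for start, end in intervals:
        # Overlapping or directly adjacent intervals are fused into one.
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    program_a = [(10, 20)]
    program_b = [(15, 30)]
    program_c = [(40, 50)]
    print(or_combine([program_a, program_b, program_c]))
```

A union of this kind tends to cover real exons more often than any single program while blurring exact boundaries, which is consistent with the trade-off the abstract reports.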

2.
MOTIVATION: There are a large number of computational programs freely available to bioinformaticians via a client/server, web-based environment. However, the client interface to these tools (typically an HTML form page) cannot be customized from the client side as it is created by the service provider. The form page is usually generic enough to cater for a wide range of users. However, this implies that a user cannot set as 'default' advanced program parameters on the form or even customize the interface to his/her specific requirements or preferences. Currently, there is a lack of end-user interface environments that can be modified by the user when accessing computer programs available on a remote server running on an intranet or over the Internet. RESULTS: We have implemented a client/server system called ORBIT (Online Researcher's Bioinformatics Interface Tools) where individual clients can have interfaces created and customized to command-line-driven, server-side programs. Thus, Internet-based interfaces can be tailored to a user's specific bioinformatic needs. As interfaces are created on the client machine independent of the server, there can be different interfaces to the same server-side program to cater for different parameter settings. The interface customization is relatively quick (between 10 and 60 min) and all client interfaces are integrated into a single modular environment which will run on any computer platform supporting Java. The system has been developed to allow for a number of future enhancements and features. ORBIT represents an important advance in the way researchers gain access to bioinformatics tools on the Internet.

3.
The design of Jemboss: a graphical user interface to EMBOSS   Total citations: 2, self-citations: 0, citations by others: 2
DESIGN: Jemboss is a graphical user interface (GUI) for the European Molecular Biology Open Software Suite (EMBOSS). It is being developed at the MRC UK HGMP-RC as part of the EMBOSS project. This paper explains the technical aspects of the Jemboss client-server design. The client-server model optionally allows a Jemboss user to have an account on the remote server. The Jemboss client is written in Java and is downloaded automatically to a user's workstation via Java Web Start using the HTTP protocol. The client then communicates with the remote server using SOAP (Simple Object Access Protocol). A Tomcat server listens on the remote machine and communicates the SOAP requests to a Jemboss server, again written in Java. This Java server interprets the client requests and executes them through Java Native Interface (JNI) code written in the C language. Another C program having setuid privilege, jembossctl, is called by the JNI code to perform the client requests under the user's account on the server. The commands include execution of EMBOSS applications, file management and project management tasks. Jemboss allows the use of JSSE for encryption of communication between the client and server. The GUI parses the EMBOSS Ajax Command Definition language for form generation and maximum input flexibility. Jemboss interacts directly with the EMBOSS libraries to allow dynamic generation of application default settings. RESULTS: This interface is part of the EMBOSS distribution and has attracted much interest. It has been set up at many other sites globally as well as being used at the HGMP-RC for registered users. AVAILABILITY: The software, EMBOSS and Jemboss, is freely available to academics and commercial users under the GPL licence. It can be downloaded from the EMBOSS ftp server: http://www.uk.embnet.org/Software/EMBOSS/, ftp://ftp.uk.embnet.org/pub/EMBOSS/. Registered HGMP-RC users can access an installed server from: http://www.uk.embnet.org/Software/EMBOSS/Jemboss/

4.
A number of biological data resources (i.e. databases and data analytical tools) are searchable and usable on-line thanks to the internet and the World Wide Web (WWW) servers. The output from the web server is easy for us to browse. However, it is laborious and sometimes impossible for us to write a computer program that finds a useful data resource, sends a proper query and processes the output. It is a serious obstacle to the integration of distributed heterogeneous data resources. To solve the issue, we have implemented a SOAP (Simple Object Access Protocol) server and web services that provide a program-friendly interface. The web services are accessible at http://www.xml.nig.ac.jp/.

5.
In the genomic era, researchers often want to learn more about a biological sequence by retrieving its related articles. However, no tool is yet available to achieve this goal conveniently. Here we developed a new literature-mining tool, MedBlast, which uses natural language processing techniques to retrieve the articles related to a given sequence. An online server of this program is also provided. AVAILABILITY: Both the online server and the program are freely available at http://medblast.sibsnet.org

6.
The recent accumulation of large amounts of 3D structural data warrants a sensitive and automatic method to compare and classify these structures. We developed a web server for comparing protein 3D structures using the program Matras (http://biunit.aist-nara.ac.jp/matras). An advantage of Matras is its structure similarity score, which is defined as the log-odds of the probabilities, similar to Dayhoff's substitution model of amino acids. This score is designed to detect evolutionarily related (homologous) structural similarities. Our web server has three main services. The first one is a pairwise 3D alignment, which simply aligns two structures. A user can assign structures either by inputting PDB codes or by uploading PDB format files from the local machine. The second service is a multiple 3D alignment, which compares several protein structures. This program employs the progressive alignment algorithm, in which pairwise 3D alignments are assembled in the proper order. The third service is a 3D library search, which compares one query structure against a large number of library structures. We hope this server provides useful tools for insights into protein 3D structures.
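
The log-odds idea behind a Dayhoff-style score can be sketched as follows. This is only an illustration of the general formula, log(observed pair frequency / frequency expected by chance); the structural state labels, the counts, and the function name `log_odds_scores` are hypothetical and are not taken from Matras.

```python
import math


def log_odds_scores(pair_counts, background):
    """Return log-odds scores for ordered pairs of structural states.

    pair_counts: {(state_a, state_b): count observed in homologous alignments}
    background:  {state: overall frequency of that state}
    Both inputs are illustrative assumptions, not Matras's actual features.
    """
    total = sum(pair_counts.values())
    scores = {}
    for (a, b), count in pair_counts.items():
        observed = count / total                  # P(a, b | homologous)
        expected = background[a] * background[b]  # P(a) * P(b) under chance
        scores[(a, b)] = math.log(observed / expected)
    return scores


if __name__ == "__main__":
    # Hypothetical counts for helix (H) and strand (E) states.
    counts = {("H", "H"): 6, ("H", "E"): 1, ("E", "E"): 3}
    freqs = {"H": 0.5, "E": 0.5}
    for pair, score in log_odds_scores(counts, freqs).items():
        print(pair, round(score, 3))
```

Pairs seen more often in homologous alignments than chance predicts receive positive scores, and pairs seen less often receive negative scores.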

7.
SUMMARY: PreDs is a WWW server that predicts the dsDNA-binding sites on protein molecular surfaces generated from the atomic coordinates in a PDB format. The prediction was done by evaluating the electrostatic potential, the local curvature and the global curvature on the surfaces. Results of the prediction can be interactively checked with our original surface viewer. AVAILABILITY: PreDs is available free of charge from http://pre-s.protein.osaka-u.ac.jp/~preds/ CONTACT: kino@ims.u-tokyo.ac.jp.

8.
SUMMARY: P-cats is a web server that predicts the catalytic residues in proteins from the atomic coordinates. P-cats receives a coordinate file of the tertiary structure and sends out analytical results via e-mail. The reply contains a summary and two URLs to allow the user to examine the conserved residues: one for interactive images of the prediction results and the other for a graphical view of the multiple sequence alignment. AVAILABILITY: P-cats is freely available at http://p-cats.hgc.jp/p-cats CONTACT: kino@ims.u-tokyo.ac.jp

9.
DNA Data Bank of Japan (DDBJ) for genome scale research in life science   Total citations: 5, self-citations: 0, citations by others: 5
The DNA Data Bank of Japan (DDBJ, http://www.ddbj.nig.ac.jp) has made an effort to collect as much data as possible, mainly from Japanese researchers. The increase rates of the data we collected, annotated and released to the public in the past year are 43% for the number of entries and 52% for the number of bases. The increase rates have accelerated even after the human genome was sequenced, because sequencing technology has been remarkably advanced and simplified, and research in life science has shifted from the gene scale to the genome scale. In addition, we have developed the Genome Information Broker (GIB, http://gib.genes.nig.ac.jp), which now includes data for more than 50 complete microbial genomes and the Arabidopsis genome. We have also developed a database of the human genome, the Human Genomics Studio (HGS, http://studio.nig.ac.jp). HGS provides a set of sequences that is as continuous as possible for any one of the 24 chromosomes. Both GIB and HGS have been updated to incorporate newly available data and retrieval tools.

10.
VISTA: visualizing global DNA sequence alignments of arbitrary length   Total citations: 31, self-citations: 0, citations by others: 31
Summary: VISTA is a program for visualizing global DNA sequence alignments of arbitrary length. It has a clean output, allowing for easy identification of similarity, and is easily configurable, enabling the visualization of alignments of various lengths at different levels of resolution. It is currently available on the web, thus allowing for easy access by all researchers. Availability: VISTA server is available on the web at http://www-gsd.lbl.gov/vista. The source code is available upon request. Contact: vista@lbl.gov

11.
MOTIVATION: The human genome project and the development of new high-throughput technologies have created unparalleled opportunities to study the mechanism of diseases, monitor the disease progression and evaluate effective therapies. Gene expression profiling is a critical tool to accomplish these goals. The use of nucleic acid microarrays to assess the gene expression of thousands of genes simultaneously has seen phenomenal growth over the past five years. Although commercial sources of microarrays exist, investigators wanting more flexibility in the genes represented on the array will turn to in-house production. The creation and use of cDNA microarrays is a complicated process that generates an enormous amount of information. Effective data management of this information is essential to efficiently access, analyze, troubleshoot and evaluate the microarray experiments. RESULTS: We have developed a distributable software package designed to track and store the various pieces of data generated by a cDNA microarray facility. This includes the clone collection storage data, annotation data, workflow queues, microarray data, data repositories, sample submission information, and project/investigator information. This application was designed using a 3-tier client server model. The data access layer (1st tier) contains the relational database system tuned to support a large number of transactions. The data services layer (2nd tier) is a distributed COM server with full database transaction support. The application layer (3rd tier) is an internet based user interface that contains both client and server side code for dynamic interactions with the user. AVAILABILITY: This software is freely available to academic institutions and non-profit organizations at http://www.genomics.mcg.edu/niddkbtc.

12.
MOTIVATION: Since their initial development, integration and construction of databases for molecular-level data have progressed. Though biological molecules are related to each other and form a complex system, the information is stored in the vast archives of the literature or in diverse databases. There is no unified naming convention for biological objects, and biological terms may be ambiguous or polysemic. This makes the integration and interaction of databases difficult. In order to eliminate these problems, machine-readable natural language resources appear to be quite promising. We have developed a workbench for protein name abbreviation dictionary (PNAD) building. RESULTS: We have developed the PNAD Construction Support System (PNAD-CSS), which offers various convenient facilities to decrease the construction costs of a protein name abbreviation dictionary whose entries are collected from abstracts in biomedical papers. The system allows the users to concentrate on higher level interpretation by removing some troublesome tasks, e.g. management of abstracts, extracting protein names and their abbreviations, and so on. To extract a pair of protein names and abbreviations, we have developed a hybrid system composed of the PROPER System and the PNAD System. The PNAD System can extract the pairs from parenthetical-paraphrases involved in protein names; the PROPER System identified these pairs with 98.95% precision, 95.56% recall and 97.58% complete precision. AVAILABILITY: PROPER System is freely available from http://www.hgc.inc.u-tokyo.ac.jp/service/tooldoc/KeX/intro.html. The other software is also available on request. Contact the authors. CONTACT: mikio@ims.u-tokyo.ac.jp

13.
ABSTRACT: BACKGROUND: Laboratories engaged in computational biology or bioinformatics frequently need to run lengthy, multistep, and user-driven computational jobs. Each job can tie up a computer for a few minutes to several days, and many laboratories lack the expertise or resources to build and maintain a dedicated computer cluster. RESULTS: JobCenter is a client-server application and framework for job management and distributed job execution. The client and server components are both written in Java and are cross-platform and relatively easy to install. All communication with the server is client-driven, which allows worker nodes to run anywhere (even behind external firewalls) and provides inherent load balancing. Adding a worker node to the worker pool is as simple as dropping the JobCenter client files onto any computer and performing basic configuration, providing tremendous ease-of-use, flexibility, and limitless horizontal scalability. Each worker installation may be independently configured, including the types of jobs it is able to run. Executed jobs may be written in any language and may include multiple execution steps. CONCLUSIONS: JobCenter is a versatile and scalable distributed job management system that allows laboratories to very efficiently distribute all computational work among all available resources. JobCenter is freely available at http://code.google.com/p/jobcenter/.
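
The client-driven (pull) pattern described in this abstract can be sketched in a few lines. This is an in-process illustration only: the class `JobServer`, the function `run_worker`, and the job types are hypothetical, and none of it reflects JobCenter's actual Java API or wire protocol.

```python
from collections import deque


class JobServer:
    """Holds pending jobs; it never contacts workers on its own."""

    def __init__(self):
        self._pending = deque()
        self.results = {}

    def submit(self, job_id, job_type, payload):
        self._pending.append((job_id, job_type, payload))

    def request_job(self, supported_types):
        # Hand out the oldest job whose type the asking worker can run.
        for index, job in enumerate(self._pending):
            if job[1] in supported_types:
                del self._pending[index]
                return job
        return None

    def report(self, job_id, result):
        self.results[job_id] = result


def run_worker(server, handlers):
    """Pull jobs until none match; `handlers` maps job type to a callable."""
    completed = 0
    while (job := server.request_job(set(handlers))) is not None:
        job_id, job_type, payload = job
        server.report(job_id, handlers[job_type](payload))
        completed += 1
    return completed


if __name__ == "__main__":
    server = JobServer()
    server.submit(1, "reverse", "ACGT")
    server.submit(2, "length", "ACGT")
    run_worker(server, {"reverse": lambda s: s[::-1]})
    run_worker(server, {"length": len})
    print(server.results)
```

Because each worker asks for work only when it is idle and only for job types it supports, load balancing and per-worker configuration fall out of the pull model without any server-side scheduling logic.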

14.
ToolShop: prerelease inspections for protein structure prediction servers.   Total citations: 2, self-citations: 0, citations by others: 2
The ToolShop server offers a possibility to compare a protein tertiary structure prediction server with other popular servers before releasing it to the public. The comparison is conducted on a set of 203 proteins and the collected models are compared with over 20 other programs using various assessment procedures. The evaluation lasts circa one week. AVAILABILITY: The ToolShop server is available at http://BioInfo.PL/ToolShop/. The administrator should be contacted to couple the tested server to the evaluation suite. CONTACT: leszek@bioinfo.pl SUPPLEMENTARY INFORMATION: The evaluation procedures are similar to those implemented in the continuous online server evaluation program, LiveBench. Additional information is available from its homepage (http://BioInfo.PL/LiveBench/).

15.
16.
The analysis of genetic data often requires a combination of several approaches using different and sometimes incompatible programs. In order to facilitate data exchange and file conversions between population genetics programs, we introduce PGDSpider, a Java program that can read 27 different file formats and export data into 29, partially overlapping, other file formats. The PGDSpider package includes both an intuitive graphical user interface and a command-line version allowing its integration in complex data analysis pipelines. AVAILABILITY: PGDSpider is freely available under the BSD 3-Clause license on http://cmpg.unibe.ch/software/PGDSpider/.

17.
Glaucoma is the second leading cause of blindness after cataract and is heterogeneous in nature. A genetic approach to detecting the disease has the advantage that the responsible gene can be identified by a genetic test. The availability of predictive tests based on the published literature would provide a mechanism for early detection and treatment. The genotype and phenotype information could be a valuable source for predicting the risk of the disease. To this end, a web server has been developed, based on the genotypes and phenotypes of myocilin mutations, which were identified by familial linkage analysis and case studies. The proposed web server provides clinical data and a severity index for a given mutation. The server has several useful options to help clinicians and researchers identify individuals at risk of developing the disease. The Glaucoma Pred server is available at http://bioserver1.physics.iisc.ac.in/myocilin.

18.
SNAPper is a network service for predicting gene function based on the conservation of gene order. AVAILABILITY: The SNAPper server is available at http://pedant.gsf.de/snapper. SNAPper-based functional predictions will soon be offered as part of the PEDANT genome analysis server http://pedant.gsf.de.

19.
ABSTRACT: BACKGROUND: Understanding protein subcellular localization is a necessary component toward understanding the overall function of a protein. Numerous computational methods have been published over the past decade, with varying degrees of success. Despite the large number of published methods in this area, only a small fraction of them are available for researchers to use in their own studies. Of those that are available, many are limited by predicting only a small number of major organelles in the cell. Additionally, the majority of methods predict only a single location, even though it is known that a large fraction of the proteins in eukaryotic species shuttle between locations to carry out their function. FINDINGS: We present a software package and a web server for predicting subcellular localization of protein sequences based on the ngLOC method. ngLOC is an n-gram-based Bayesian classifier that predicts subcellular localization of proteins both in prokaryotes and eukaryotes. The overall prediction accuracy varies from 89.8% to 91.4% across species. This program can predict 11 distinct locations each in plant and animal species. ngLOC also predicts 4 and 5 distinct locations on gram-positive and gram-negative bacterial datasets, respectively. CONCLUSIONS: ngLOC is a generic method that can be trained by data from a variety of species or classes for predicting protein subcellular localization. The standalone software is freely available for academic use under GNU GPL, and the ngLOC web server is also accessible at http://ngloc.unmc.edu.
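
To make the phrase "n-gram-based Bayesian classifier" concrete, here is a minimal naive Bayes sketch over sequence n-grams with add-one smoothing. It is a generic textbook construction, not the ngLOC model itself; the class name `NGramBayes`, the toy sequences, and the location labels are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def ngrams(seq, n):
    """All overlapping substrings of length n."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


class NGramBayes:
    def __init__(self, n=2):
        self.n = n
        self.counts = defaultdict(Counter)  # label -> n-gram counts
        self.class_totals = Counter()       # label -> number of training sequences
        self.vocab = set()                  # every n-gram seen in training

    def train(self, seq, label):
        grams = ngrams(seq, self.n)
        self.counts[label].update(grams)
        self.class_totals[label] += 1
        self.vocab.update(grams)

    def predict(self, seq):
        total = sum(self.class_totals.values())
        best_label, best_score = None, -math.inf
        for label, n_seqs in self.class_totals.items():
            # Log prior plus summed log likelihoods (add-one smoothing).
            score = math.log(n_seqs / total)
            denom = sum(self.counts[label].values()) + len(self.vocab)
            for gram in ngrams(seq, self.n):
                score += math.log((self.counts[label][gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    model = NGramBayes(n=2)
    model.train("KKKRKKRK", "nucleus")    # toy lysine/arginine-rich sequence
    model.train("LLLALLAL", "membrane")   # toy hydrophobic sequence
    print(model.predict("KRKK"))
```

Working in log space avoids numerical underflow when many n-gram probabilities are multiplied, which matters for full-length protein sequences.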

20.
CONSEL: for assessing the confidence of phylogenetic tree selection.   Total citations: 10, self-citations: 0, citations by others: 10
CONSEL is a program to assess the confidence of the tree selection by giving the p-values for the trees. The main thrust of the program is to calculate the p-value of the Approximately Unbiased (AU) test using the multi-scale bootstrap technique. This p-value is less biased than the other conventional p-values such as the Bootstrap Probability (BP), the Kishino-Hasegawa (KH) test, the Shimodaira-Hasegawa (SH) test, and the Weighted Shimodaira-Hasegawa (WSH) test. CONSEL calculates all these p-values from the output of the phylogeny program packages such as Molphy, PAML, and PAUP*. Furthermore, CONSEL is applicable to a wide class of problems where the BPs are available. AVAILABILITY: The programs are written in C language. The source code for Unix and the executable binary for DOS are found at http://www.ism.ac.jp/~shimo/ CONTACT: shimo@ism.ac.jp
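
Of the p-values listed in this abstract, the Bootstrap Probability (BP) is the simplest to illustrate. The sketch below resamples per-site log-likelihoods with replacement and counts how often each tree has the highest total. It assumes per-site log-likelihoods are already available for every candidate tree; the function name and input layout are hypothetical, it is not CONSEL's code, and it does not implement the AU test or the multi-scale bootstrap.

```python
import random


def bootstrap_probabilities(site_lnl, replicates=1000, seed=0):
    """Estimate the BP of each tree from per-site log-likelihoods.

    site_lnl: {tree_name: [log-likelihood of site 0, site 1, ...]}
    All trees must cover the same sites in the same order (an assumption
    of this sketch).
    """
    rng = random.Random(seed)
    names = list(site_lnl)
    n_sites = len(site_lnl[names[0]])
    wins = dict.fromkeys(names, 0)
    for _ in range(replicates):
        # Resample site indices with replacement, keeping the sample size.
        sample = [rng.randrange(n_sites) for _ in range(n_sites)]
        totals = {name: sum(site_lnl[name][i] for i in sample) for name in names}
        wins[max(names, key=totals.get)] += 1
    return {name: wins[name] / replicates for name in names}


if __name__ == "__main__":
    lnl = {
        "tree_A": [-1.2, -0.8, -1.5, -0.9, -1.1],
        "tree_B": [-1.0, -1.1, -1.4, -1.3, -1.0],
    }
    print(bootstrap_probabilities(lnl, replicates=200))
```

The BP of a tree is simply the fraction of replicates in which it wins, so the values over all trees sum to one.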
