Similar articles
 20 similar articles found
1.
Several protein phylogeny reconstruction methods were compared on a set of natural protein sequences. The programs of the PHYLIP package and the FastME, PhyML and TreeTop programs were tested. In contrast to several studies that used simulated sequences, our results demonstrate the superiority of distance methods over the maximum likelihood method.

2.
Evaluation of Gene Structure Prediction Programs   (Cited by: 2 total; 0 self-citations, 2 by others)
We evaluate a number of computer programs designed to predict the structure of protein coding genes in genomic DNA sequences. Computational gene identification is set to play an increasingly important role in the development of the genome projects, as emphasis turns from mapping to large-scale sequencing. The evaluation presented here serves both to assess the current status of the problem and to identify the most promising approaches to ensure further progress. The programs analyzed were uniformly tested on a large set of vertebrate sequences with simple gene structure, and several measures of predictive accuracy were computed at the nucleotide, exon, and protein product levels. The results indicated that the predictive accuracy of the programs analyzed was lower than originally found. The accuracy was even lower when considering only those sequences that had recently been entered and that did not show any similarity to previously entered sequences. This indicates that the programs are overly dependent on the particularities of the examples they learn from. For most of the programs, accuracy in this test set ranged from 0.60 to 0.70 as measured by the Correlation Coefficient (where 1.0 corresponds to a perfect prediction and 0.0 is the value expected for a random prediction), and the average percentage of exons exactly identified was less than 50%. Only those programs including protein sequence database searches showed substantially greater accuracy. The accuracy of the programs was severely affected by relatively high rates of sequence errors. Since the set on which the programs were tested included only relatively short sequences with simple gene structure, the accuracy of the programs is likely to be even lower when used for large uncharacterized genomic sequences with complex structure. 
While in such cases the programs currently available may still be of great use in pinpointing the regions likely to contain exons, they are far from being powerful enough to elucidate the genomic structure completely.
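
The abstract above reports accuracy as a Correlation Coefficient for which 1.0 is a perfect prediction and 0.0 is the value expected for a random prediction. A minimal sketch of such a nucleotide-level measure follows, assuming the coefficient has the standard Matthews form (the abstract does not give the formula); the function name and the boolean per-nucleotide encoding are illustrative assumptions, not taken from the paper.

```python
import math


def correlation_coefficient(actual, predicted):
    """Nucleotide-level correlation coefficient (assumed Matthews form).

    `actual` and `predicted` are equal-length sequences of truthy/falsy
    values: True where a nucleotide is (predicted to be) coding.
    """
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a and p:
            tp += 1      # coding, predicted coding
        elif a and not p:
            fn += 1      # coding, missed
        elif not a and p:
            fp += 1      # non-coding, wrongly predicted coding
        else:
            tn += 1      # non-coding, predicted non-coding
    denom = math.sqrt((tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))
    if denom == 0:
        # Undefined when one class is absent; 0.0 is returned by convention here.
        return 0.0
    return (tp * tn - fn * fp) / denom
```

With this definition a prediction that marks every nucleotide correctly scores 1.0, and one that is right exactly half the time on a balanced sequence scores 0.0.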

3.
MOTIVATION: The programs currently available for the analysis of nucleic acid and protein sequences suffer from a variety of problems: Web-based programs often require inconvenient reformatting of sequences when proceeding from one analysis to the next, and commercial-console-based programs are cost prohibitive. Here, we report the development of DNASSIST, an inexpensive, multiple-document interface program for the fully integrated editing and analysis of nucleic acid and protein sequences in the familiar environment of Microsoft Windows.

4.
Two programs written in BASIC are used for the teaching of protein synthesis. Students may individually and at their own pace test their knowledge of base pairing using one of the programs. Again individually, students then investigate the process of protein synthesis using randomly generated DNA sequences produced by the second program. Students are thereby reinforcing their understanding of base pairing, complementary sequences, triplet codons, and relating nucleotide sequences to amino acid sequences. The final part of the second exercise introduces a known genetic defect of man.

5.
The design of synthetic genes   (Cited by: 1 total; 1 self-citation, 0 by others)
Computer programs are described that aid in the design of synthetic genes coding for proteins that are targets of a research program in site directed mutagenesis. These programs "reverse-translate" protein sequences into general nucleic acid sequences (those where codons have not yet been selected), map restriction sites into general DNA sequences, identify points in the synthetic gene where unique restriction sites can be introduced, and assist in the design of genes coding for hybrids and evolutionary intermediates between homologous proteins. Application of these programs therefore facilitates the use of modular mutagenesis to create variants of proteins, and the implementation of evolutionary guidance as a strategy for selecting mutants.
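
The "reverse-translation" into a general nucleic acid sequence described above can be illustrated with a short sketch. It assumes the standard genetic code and IUPAC ambiguity codes, and the table and function names are hypothetical; the programs in the paper may represent general sequences differently.

```python
# One degenerate codon per amino acid, written with IUPAC ambiguity codes
# (R = A/G, Y = C/T, H = A/C/T, M = A/C, W = A/T, S = C/G, N = any base).
# Assumption: standard genetic code. The single-codon forms for L, R and S
# are over-approximations: they also match a few codons of other residues.
DEGENERATE_CODONS = {
    "A": "GCN", "C": "TGY", "D": "GAY", "E": "GAR", "F": "TTY",
    "G": "GGN", "H": "CAY", "I": "ATH", "K": "AAR", "L": "YTN",
    "M": "ATG", "N": "AAY", "P": "CCN", "Q": "CAR", "R": "MGN",
    "S": "WSN", "T": "ACN", "V": "GTN", "W": "TGG", "Y": "TAY",
}


def reverse_translate(protein):
    """Return a general (codons not yet selected) DNA sequence for `protein`."""
    try:
        return "".join(DEGENERATE_CODONS[aa] for aa in protein.upper())
    except KeyError as err:
        raise ValueError(f"unsupported residue: {err.args[0]}") from None
```

For example, `reverse_translate("MKF")` gives `ATGAARTTY`: methionine has a single codon, while lysine and phenylalanine each leave the third base open between two choices.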

6.
Guo JT, Jaromczyk JW, Xu Y. Proteins 2007, 67(3): 548-558
Chameleon sequences have been implicated in amyloid related diseases. Here we report an analysis of two types of chameleon sequences, chameleon-HS (Helix vs. Strand) and chameleon-HE (Helix vs. Sheet), based on known structures in the Protein Data Bank. Our survey shows that the longest chameleon-HS is eight residues while the longest chameleon-HE is seven residues. We have done a detailed analysis on the local and global environment that might contribute to the unique conformation of a chameleon sequence. We found that the existence of chameleon sequences does not present a problem for secondary structure prediction programs, including the first generation prediction programs, such as the Chou-Fasman algorithm, and the third generation prediction programs that utilize evolutionary information. We have also investigated the possible implication of chameleon sequences in structural conservation and functional diversity of alternatively spliced protein isoforms.

7.
Computer programs that can be used for the design of synthetic genes and that are run on an Apple Macintosh computer are described. These programs determine nucleic acid sequences encoding amino acid sequences. They select DNA sequences based on codon usage as specified by the user, and determine the placement of base changes that can be used to create restriction enzyme sites without altering the amino acid sequence. A new algorithm for finding restriction sites by translating the restriction endonuclease target sequence in all three reading frames and then searching the given peptide or protein amino acid sequence with these short restriction enzyme peptide sequences is described. Examples are given for the creation of synthetic DNA sequences for the bovine prethrombin-2 and ribonuclease A genes. Received on October 18, 1988; accepted on December 9, 1988.
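
The idea of translating a restriction target sequence in all three reading frames and searching the protein with the resulting short peptides can be sketched as follows. This is an illustration under stated assumptions, not the published algorithm: it uses the standard genetic code, represents each frame as a list of allowed amino acid sets, and all names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def site_peptide_patterns(site):
    """For frame offsets 0, 1 and 2, list the amino acids each codon may encode.

    The site is padded with N (any base) so that it fills whole codons; a
    codon is compatible when it matches every fixed base of its template.
    """
    patterns = []
    for offset in range(3):
        padded = "N" * offset + site
        padded += "N" * (-len(padded) % 3)
        pattern = []
        for i in range(0, len(padded), 3):
            template = padded[i:i + 3]
            allowed = {
                aa for codon, aa in CODON_TABLE.items()
                if aa != "*" and all(t in ("N", b) for t, b in zip(template, codon))
            }
            pattern.append(allowed)
        patterns.append(pattern)
    return patterns


def find_silent_sites(protein, site):
    """Return (residue_index, frame_offset) pairs where `site` could be
    introduced without changing the amino acid sequence."""
    hits = []
    for offset, pattern in enumerate(site_peptide_patterns(site)):
        for start in range(len(protein) - len(pattern) + 1):
            window = protein[start:start + len(pattern)]
            if all(aa in allowed for aa, allowed in zip(window, pattern)):
                hits.append((start, offset))
    return hits
```

For the EcoRI target GAATTC, frame 0 translates to the dipeptide Glu-Phe, so any "EF" in the protein is a candidate position; the other two frames give three-residue patterns in which the flanking residues have several allowed choices.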

8.
Shaw G. BioTechniques 2000, 28(6): 1198-1201
Biologists today make extensive use of word processing programs for the production of research reports, literature reviews and grant proposals. Frequently, such programs become the default platform for viewing and the later publication of protein and nucleic acid sequence data. Thus, researchers often switch between their word processor and more specialized programs designed to analyze protein and nucleic acid sequences. It would be more convenient to perform these simple sequence analyses using the word processor without switching to another program. The focus here is on the use of the Visual Basic programming language, which is built into all recent versions of Microsoft Word, to generate surprisingly complex and useful macros that can conveniently analyze several important features of protein and nucleic acid sequences. The standard Word interface can also be easily modified to display and run these macros from a pull-down menu. Several examples of this approach are provided.

9.
Apple Macintosh programs for nucleic and protein sequence analyses   (Cited by: 4 total; 1 self-citation, 3 by others)
This paper describes a package of programs for handling and analyzing nucleic acid and protein sequences using the Apple Macintosh microcomputer. There are three important features of these programs: first, because of the now classical Macintosh interface the programs can be easily used by persons with little or no computer experience. Second, it is possible to save all the data, written in an editable scrolling text window or drawn in a graphic window, as files that can be directly used either as word processing documents or as picture documents. Third, sequences can be easily exchanged with any other computer. The package is composed of thirteen programs, written in the Pascal programming language.

10.
11.
We developed novel programs for displaying and analyzing the transmembrane alpha-helical segments (TMSs) in the aligned sequences of homologous integral membrane proteins. TMS_ALIGN predicts the positions of putative TMSs in multiply aligned protein sequences and graphically shows the TMSs in the alignment. TMS_SPLIT (1) predicts the positions of TMSs for each sequence, (2) allows a user to select proteins with a specified number of TMSs, and (3) splits the sequences into groups of TMSs of equal numbers. TMS_CUT works like TMS_SPLIT, but it can cut sequences with any combination of TMSs. The BASS program similarly allows comparison of protein repeat elements, equivalent to TMS_SPLIT plus IC, but it provides the comparison data expressed in BLAST E values. These programs, together with the IntraCompare program, facilitate the identification of repeat sequences in integral membrane proteins. They also facilitate the estimation of protein topology and the determination of evolutionary pathways.

12.
The large number of protein consensus sequences that may be recognized without computer analysis are reviewed. These include the extensive range of known phosphorylation site motifs for protein kinases; metal binding sites for calcium, zinc, copper, and iron; enzyme active site motifs; nucleotide binding and covalent attachment sites for prosthetic groups, carbohydrate, and lipids. Of particular note is the increasing realization of the importance for cellular regulation of protein-protein interaction motifs and sequences that target proteins to particular subcellular locations. This article includes an introduction to accessing the many suites of programs for analysis of protein structure, signatures of protein families, and consensus sequences that may be carried out on the internet.

13.
MOTIVATION: In 2001 and 2002, we published two papers (Bioinformatics, 17, 282-283; Bioinformatics, 18, 77-82) describing an ultrafast protein sequence clustering program called cd-hit. This program can efficiently cluster a huge protein database with millions of sequences. However, the applications of the underlying algorithm are not limited to protein sequence clustering. Here we present several new programs using the same algorithm, including cd-hit-2d, cd-hit-est and cd-hit-est-2d. Cd-hit-2d compares two protein datasets and reports similar matches between them; cd-hit-est clusters a DNA/RNA sequence database and cd-hit-est-2d compares two nucleotide datasets. All these programs can handle huge datasets with millions of sequences and can be hundreds of times faster than methods based on the popular sequence comparison and database search tools, such as BLAST.

14.

Background  

There have been many algorithms and software programs implemented for the inference of multiple sequence alignments of protein and DNA sequences. The "true" alignment is usually unknown due to the incomplete knowledge of the evolutionary history of the sequences, making it difficult to gauge the relative accuracy of the programs.

15.
In this paper, we introduce a new Graphical User Interface that estimates evolutionary rates on protein sequences by assessing changes in biochemical constraints. We describe IMPACT, a platform-independent (tested in Linux, Windows, and MacOS), easy to install software written in Java. IMPACT integrates the use of a built-in multiple sequence alignment editor, with programs that perform phylogenetic and protein structure analyses (ConTest, PhyML, ATV, and Jmol) allowing the user to quickly and efficiently perform evolutionary analyses on protein sequences, including the detection of selection (negative and positive) signatures at the amino acid scale, which can provide fundamental insight about species evolution and ecological fitness. IMPACT provides the user with a working platform that combines a number of bioinformatics tools and utilities in one place, transferring information directly among the various programs and therefore increasing the overall performance of evolutionary analyses on proteins.

16.
Each amino acid in a protein is considered to be an individual, mutable characteristic of the species from which the protein is extracted. For a branching tree representing the evolutionary history of the known sequences in different species, our computer programs use majority logic and parsimony of mutations to determine the most likely ancestral amino acid for each position of the protein at each node of the tree. The number of mutations necessary between the ancestral and present species is summed for each branch and the entire tree. The programs then move branches to make many different configurations, from which we select the one with the minimum number of mutations as the most likely evolutionary history. We used this method to elucidate primate phylogeny from sequences of fibrinopeptides, carbonic anhydrase, and the hemoglobin beta, delta and alpha chains. All available sequences indicate that the early Pongidae had diverged into two lines before the divergence of an ancestor for the human line alone. We have constructed some probable ancestral sequences at major points during primate evolution and have developed tentative trees showing the order of divergences and evolutionary distances among primate groups. Further questions on primate evolution could be answered in the future by the determination of the appropriate sequences.
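
Counting the minimum number of mutations for one alignment position on a fixed tree, as described above, can be illustrated with a small sketch. It uses Fitch's set-based parsimony rule as a stand-in; the abstract does not specify the authors' exact "majority logic" procedure, so this is a swapped-in technique, and the tree encoding (nested 2-tuples of leaf names) and the species data in the example are hypothetical.

```python
def fitch(tree, leaf_states):
    """Return (candidate ancestral states, minimum mutation count) for `tree`.

    `tree` is either a leaf name (str) or a 2-tuple of subtrees;
    `leaf_states` maps each leaf name to its amino acid at one position.
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)
    common = left_set & right_set
    if common:
        # The children can share a state: no extra mutation at this node.
        return common, left_cost + right_cost
    # No shared state: one mutation is needed on a branch below this node.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical single-position example (states are made up for illustration).
example_tree = (("human", "chimpanzee"), ("gorilla", "orangutan"))
example_states = {"human": "A", "chimpanzee": "A", "gorilla": "A", "orangutan": "S"}
root_states, mutations = fitch(example_tree, example_states)
```

Summing this count over all positions gives a score for one tree configuration; comparing the scores of rearranged trees and keeping the lowest mirrors the selection step in the abstract.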

17.
Simple flexible programs (TREEMOMENT and PILEUPMOMENT) are described for depicting the average amphipathicity (hydrophobic moment) along multiply aligned sequences of a family of evolutionarily related proteins. The programs are applicable to any number of aligned sequences and can be set for any desired angle corresponding to a residue repeat unit in a protein secondary structural element such as 100 degrees per residue for an alpha-helix or 180 degrees per residue for a beta-strand. These programs can be used to identify amphipathic regions common to the members of a protein family. The use of these programs is exemplified by showing that some families of integral membrane transport proteins (i.e. permeases of the bacterial phosphotransferase system (PTS) and the anion exchangers of animals) exhibit strikingly amphipathic alpha-helical structures immediately preceding the first hydrophobic transmembrane segment of their membrane-embedded domain(s). Other families, such as the major facilitator superfamily of uniporters, symporters and antiporters, do not exhibit this structural feature. The amphipathic structures in PTS permeases have been implicated in membrane insertion during biogenesis.

18.
Simple flexible programs (TREEMOMENT and PILEUPMOMENT) are described for depicting the average amphipathicity (hydrophobic moment) along multiply aligned sequences of a family of evolutionarily related proteins. The programs are applicable to any number of aligned sequences and can be set for any desired angle corresponding to a residue repeat unit in a protein secondary structural element such as 100 degrees per residue for an alpha-helix or 180 degrees per residue for a beta-strand. These programs can be used to identify amphipathic regions common to the members of a protein family. The use of these programs is exemplified by showing that some families of integral membrane transport proteins (i.e. permeases of the bacterial phosphotransferase system (PTS) and the anion exchangers of animals) exhibit strikingly amphipathic alpha-helical structures immediately preceding the first hydrophobic transmembrane segment of their membrane-embedded domain(s). Other families, such as the major facilitator superfamily of uniporters, symporters and antiporters, do not exhibit this structural feature. The amphipathic structures in PTS permeases have been implicated in membrane insertion during biogenesis.
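
The hydrophobic moment at a chosen angle per residue can be sketched as below. This assumes the usual vector-sum definition of the moment and takes per-residue hydrophobicity values as input rather than fixing a particular scale; the function names, the window length, and the column-averaging step for aligned sequences are illustrative assumptions and not a description of TREEMOMENT or PILEUPMOMENT.

```python
import math


def hydrophobic_moment(hydrophobicities, angle_deg=100.0):
    """Magnitude of the hydrophobic moment of one window of residues.

    Residue n is placed at angle n * angle_deg around the structural axis
    (100 degrees for an alpha-helix, 180 degrees for a beta-strand).
    """
    angle = math.radians(angle_deg)
    sin_sum = sum(h * math.sin(n * angle) for n, h in enumerate(hydrophobicities))
    cos_sum = sum(h * math.cos(n * angle) for n, h in enumerate(hydrophobicities))
    return math.hypot(sin_sum, cos_sum)


def average_moment_profile(aligned_values, window=11, angle_deg=100.0):
    """Moment along an alignment, using column-averaged hydrophobicity.

    `aligned_values` is a list of equal-length lists of hydrophobicity
    values, one list per aligned sequence (gap handling is left out).
    """
    n_cols = len(aligned_values[0])
    column_means = [
        sum(seq[i] for seq in aligned_values) / len(aligned_values)
        for i in range(n_cols)
    ]
    return [
        hydrophobic_moment(column_means[i:i + window], angle_deg)
        for i in range(n_cols - window + 1)
    ]
```

A stretch of uniformly hydrophobic residues gives a moment near zero at 100 degrees per residue, because the contributions cancel around the helix; a large value indicates that hydrophobic residues cluster on one face, which is the amphipathic signal the programs are meant to reveal.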

19.
Several primer prediction and analysis programs have been developed for diverse applications. However, none of these existing programs can be directly used for the design of primers in protein interaction experiments, since proteins may have transmembrane domains (TMDs) and/or a signal peptide that must be excluded from experiments. Furthermore, it is frequently the case that short restriction sequences must be added to each primer in order to clone PCR products into a given destination vector for expression. DePIE, a web-based primer design tool, was developed to address these deficiencies. The program takes as input NCBI protein accession numbers and returns primer information including nucleotide sequences, thermodynamic melting temperature of the nucleotide sequences and the target positions. DePIE is implemented in JAVA, PERL and PHP and has proven to be very efficient in designing primers for our interaction experiments. DePIE services can be accessed at the web site: http://biocore.unl.edu/primer/primerPI.html.

20.
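This paper describes a computer program system for nucleic acid sequence analysis implemented on a microcomputer (IBM PC). The system consists of three levels and 18 functional modules; menus and interactive dialogue allow users to learn and use it quickly. The implementation uses data-structure techniques such as tree structures, first-in-last-out stacks and sparse matrices, and applies statistical methods such as the Bayes method. A series of graph-theoretic methods, including Kruskal's algorithm and Floyd's algorithm, are also used. The release of this software system should be of benefit to molecular biology research.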
