Similar literature
20 similar articles found (search time: 15 ms)
1.
We have written a computer program, BIGPROBE, which facilitates the design of long nucleic acid probes from the partial or complete amino acid sequence of a protein. BIGPROBE relies upon information on codon usage, intercodon dinucleotide frequency, and potential probe self-complementarity. We have examined the accuracy with which the program predicts coding sequences using sample human and rat genes and probe lengths of 30-60 nucleotides. Rat probe sequences selected by BIGPROBE using either codon usage or dinucleotide frequency data alone averaged 86-92% homology with the known exons of the corresponding gene sequences. Predictive accuracy with rat gene probes could be improved to 89-94%, depending upon probe length, by applying codon usage and dinucleotide frequency data in combination. Similar accuracy was achieved for human genes.

2.
A computer program (PINCERS) is described for use in the design of synthetic genes and mixed-probe DNA sequences. A protein sequence is reverse translated with generation of synonymous codons at each position, producing a degenerate sequence. In order to locate potential restriction enzyme sites, the degenerate sequence is searched with a library of restriction enzymes for sites that utilize any combination of synonymous codons. These sites are indicated in a map so that they may be incorporated into the synthetic gene sequence. The program allows the user to select the appropriate codon usage table for the organism of interest and then to set a threshold usage frequency below which codons are not generated. PINCERS may also be used to assist in planning the synthesis of mixed-probe DNA sequences for cross-hybridization experiments. It can identify regions of specified length within the protein sequence that have the least overall degeneracy, thereby minimizing the number of probes to be synthesized and, therefore, maximizing the concentration of a given probe sequence.
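The least-degeneracy search described in this abstract can be illustrated with a minimal Python sketch. It is not the PINCERS implementation: the function name `least_degenerate_window` is hypothetical, the standard genetic code is assumed, and no codon usage threshold is applied. Degeneracy is taken as the product of the number of synonymous codons per residue, which equals the number of distinct probes a fully mixed pool would contain.

```python
# Number of synonymous codons per amino acid in the standard genetic code
# (assumption: no organism-specific usage threshold is applied).
DEGENERACY = {
    "M": 1, "W": 1,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "I": 3,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "L": 6, "R": 6, "S": 6,
}


def least_degenerate_window(protein, length):
    """Return (start, peptide, n_probes) for the window needing the fewest probes.

    Hypothetical helper: slides a window of `length` residues along `protein`
    and keeps the first window with the smallest product of degeneracies.
    """
    if not 0 < length <= len(protein):
        raise ValueError("window length must be between 1 and len(protein)")
    best = None
    for start in range(len(protein) - length + 1):
        peptide = protein[start:start + length]
        n_probes = 1
        for aa in peptide:
            n_probes *= DEGENERACY[aa]
        if best is None or n_probes < best[2]:
            best = (start, peptide, n_probes)
    return best
```

For example, `least_degenerate_window("LSRMWKDA", 3)` selects the window `MWK`, which needs only two probes, whereas the first window `LSR` would need 216.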

3.
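This paper describes a computer program system for nucleic acid sequence analysis implemented on a microcomputer (IBM PC). The system is organized in three levels and comprises 18 functional modules; menus and interactive dialogs allow users to learn and use it quickly. The implementation uses data-structure techniques such as tree structures, last-in-first-out stacks, and sparse matrices; statistical methods such as the Bayes method; and graph-theoretic methods such as Kruskal's and Floyd's algorithms. The release of this software system should be of some benefit to molecular biology research.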

4.
The synthesis of complete genes is becoming an increasingly popular approach in heterologous gene expression. Reasons for this are the decreasing prices and the numerous advantages in comparison to classic molecular cloning methods. Two of these advantages are the possibility to adapt the codon usage to the host organism and the option to introduce restriction enzyme target sites of choice. C.U.R.R.F. (Codon Usage regarding Restriction Finder) is a free Java-based software program which is able to detect possible restriction sites in both coding and non-coding DNA sequences by introducing multiple silent or non-silent mutations, respectively. The deviation of an alternative sequence containing a desired restriction motif from the sequence with the optimal codon usage is considered during the search for potential restriction sites in coding DNA and mRNA sequences as well as protein sequences. C.U.R.R.F. is available at http://www.zvm.tu-dresden.de/die_tu_dresden/fakultaeten/fakultaet_mathematik_und_naturwissenschaften/fachrichtung_biologie/mikrobiologie/allgemeine_mikrobiologie/currf .

5.
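Gene expression levels are affected by the AT content of the region downstream of the start codon, and screening a huge sequence set for synonymous sequences with specific AT content and codon usage characteristics is tedious work. This paper presents "BestAT", an AT-content optimization tool that offers an initial solution to two key problems: automatically obtaining massive numbers of synonymous sequences, and fully displaying the codon usage properties of those sequences. The tool integrates seamlessly with the Codon Usage Database (CUD) and presents sequence properties intuitively, through in-place annotation of codon parameters and AT-content curves, providing strong support for this kind of experimental design.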
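As a rough illustration of selecting synonymous sequences by AT content, the following Python sketch enumerates every synonymous coding sequence of a short peptide and keeps those whose AT fraction falls in a requested range. It is not the BestAT implementation: the function names `at_content` and `synonymous_by_at` are hypothetical, the standard genetic code is assumed, and no codon usage data from CUD is consulted. Exhaustive enumeration grows exponentially with peptide length, so this is only practical for a short region such as the first few codons after the start codon.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid to the list of codons that encode it.
SYNONYMS = {}
for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO):
    SYNONYMS.setdefault(aa, []).append(a + b + c)


def at_content(dna):
    """Fraction of A and T bases in a DNA string."""
    return sum(base in "AT" for base in dna) / len(dna)


def synonymous_by_at(peptide, lo, hi):
    """List (sequence, AT fraction) for synonymous sequences with lo <= AT <= hi.

    Hypothetical helper; results are sorted by descending AT fraction,
    then alphabetically by sequence.
    """
    results = []
    for codons in product(*(SYNONYMS[aa] for aa in peptide)):
        dna = "".join(codons)
        frac = at_content(dna)
        if lo <= frac <= hi:
            results.append((dna, frac))
    return sorted(results, key=lambda r: (-r[1], r[0]))
```

For the dipeptide `MK`, only `ATGAAA` (AT fraction 5/6) passes a lower bound of 0.8; the alternative `ATGAAG` has an AT fraction of 4/6.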

6.
The design of synthetic genes (total citations: 1; self-citations: 1; citations by others: 0)
Computer programs are described that aid in the design of synthetic genes coding for proteins that are targets of a research program in site-directed mutagenesis. These programs "reverse-translate" protein sequences into general nucleic acid sequences (those where codons have not yet been selected), map restriction sites into general DNA sequences, identify points in the synthetic gene where unique restriction sites can be introduced, and assist in the design of genes coding for hybrids and evolutionary intermediates between homologous proteins. Application of these programs therefore facilitates the use of modular mutagenesis to create variants of proteins, and the implementation of evolutionary guidance as a strategy for selecting mutants.

7.
We describe a set of IBM-compatible computer programs designed to selectively identify the potential sites for silent mutagenesis within a target DNA sequence. The programs are based on a novel strategy of identifying amino acid motifs compatible with each restriction site (BioTechniques 12:382-384, 1991). They can be used to assess the suitability of a target for the introduction of any 6-base nucleic acid sequence, such as a restriction enzyme site in cassette mutagenesis strategies. The Table program generates a table of multiple amino acid motifs for each restriction enzyme, obtained by translating each unique recognition sequence in all three reading frames. The Silmut program, which utilizes the features of Table, then identifies any match between an amino acid motif of each restriction enzyme and the input target sequence. Minor manipulations of the database files will enable the individual researcher to identify the potential for introduction of any 6-base sequence by silent mutagenesis.

8.
A flexible new computer program for handling DNA sequence data. (total citations: 9; self-citations: 2; citations by others: 7)
A compact new computer program for handling nucleic acid sequence data is presented. It consists of a number of different subsets, which may be used according to a given code system. The program is designed for the determination of restriction enzyme and other recognition sites in correlation with translation patterns, and allows tabulation of codon frequencies and protein molecular weights within specified gene boundaries. The program is especially designed for detection of overlapping genes. The language is FORTRAN, and thus the program may be used on small computers; it may also be used without any prior computer experience. Copies are available on request.
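Tabulating codon frequencies within specified gene boundaries, as mentioned in this abstract, can be sketched in a few lines of Python. This is an illustration only and not a port of the FORTRAN program: the function name `codon_counts` and the 0-based, end-exclusive boundary convention are assumptions.

```python
from collections import Counter


def codon_counts(dna, start, end):
    """Count codons in dna[start:end], reading in frame from `start`.

    Hypothetical helper: `start` and `end` are 0-based, end-exclusive
    gene boundaries and must span a whole number of codons.
    """
    region = dna[start:end].upper()
    if len(region) % 3:
        raise ValueError("gene boundaries must span a whole number of codons")
    return Counter(region[i:i + 3] for i in range(0, len(region), 3))
```

Calling `codon_counts("ccATGAAAAAATAAgg", 2, 14)` reads the four codons ATG, AAA, AAA, and TAA and ignores the flanking bases.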

9.
Computer programs that can be used for the design of synthetic genes and that are run on an Apple Macintosh computer are described. These programs determine nucleic acid sequences encoding amino acid sequences. They select DNA sequences based on codon usage as specified by the user, and determine the placement of base changes that can be used to create restriction enzyme sites without altering the amino acid sequence. A new algorithm for finding restriction sites by translating the restriction endonuclease target sequence in all three reading frames and then searching the given peptide or protein amino acid sequence with these short restriction enzyme peptide sequences is described. Examples are given for the creation of synthetic DNA sequences for the bovine prethrombin-2 and ribonuclease A genes. Received on October 18, 1988; accepted on December 9, 1988.
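The three-frame approach described in this abstract can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the published Macintosh program: the function names `site_peptides` and `silent_site_positions` are hypothetical, the standard genetic code is assumed, and partial codons at the ends of the recognition sequence are completed with every possible base, discarding completions that produce a stop codon.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: aa
               for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)}


def site_peptides(site, frame):
    """Peptides whose coding DNA can contain `site` at codon offset `frame`.

    `frame` (0, 1 or 2) is the position within the first codon at which
    the recognition sequence starts.
    """
    pad_left = frame
    pad_right = (-(frame + len(site))) % 3
    peptides = set()
    for left in product(BASES, repeat=pad_left):
        for right in product(BASES, repeat=pad_right):
            dna = "".join(left) + site + "".join(right)
            pep = "".join(CODON_TABLE[dna[i:i + 3]]
                          for i in range(0, len(dna), 3))
            if "*" not in pep:  # a stop codon cannot sit inside a protein
                peptides.add(pep)
    return peptides


def silent_site_positions(protein, site):
    """Sorted (residue_index, frame) pairs where `site` fits silently."""
    hits = set()
    for frame in range(3):
        for pep in site_peptides(site, frame):
            start = protein.find(pep)
            while start != -1:
                hits.add((start, frame))
                start = protein.find(pep, start + 1)
    return sorted(hits)
```

For the EcoRI site GAATTC, frame 0 translates to the single dipeptide `EF`, so any Glu-Phe pair in the protein is a candidate position. In the toy protein `MEFGIL`, the search also reports the tripeptide `GIL` in frame 1, since `GGA ATT CTN` contains the site.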

10.
A program is described to perform general DNA sequence analysis on the Hewlett-Packard Model 86/87 microcomputer operating on 128 K of RAM. The following analytical procedures can be performed: 1. display of the sequence, in whole or part, or its complement; 2. search for specified sequences e.g. restriction sites, and in the case of the latter give fragment sizes; 3. perform a comprehensive search for all known restriction enzyme sites; 4. map sites graphically; 5. perform editing functions; 6. base frequency analysis; 7. search for repeated sequences; 8. search for open reading frames or translate into the amino acid sequence and analyse for basic and acidic amino acids, hydrophobicity, and codon usage. Two sequences, or parts thereof, can be merged in various orientations to mimic recombination strategies, or can be compared for homologies. The program is written in HP BASIC and is designed principally as a tool for the laboratory investigator manipulating a defined set of vectors and recombinant DNA constructs.

11.
New bioactive proteins need to be screened from various microorganisms to meet the increasing need for industrial and pharmaceutical peptides, proteins, or enzymes. A novel polymerase chain reaction (PCR) method, restriction site-dependent PCR (RSD-PCR), was designed for rapid cloning of new genes from genomic DNA. The RSD-PCR strategy is based on these principles: (i) restriction sites dispersed throughout genomes are candidates for universal pairing; (ii) a universal primer is a combination of a 3'-end consisting of a selected restriction site and a 5'-end consisting of a degenerate sequence. A two-round PCR protocol was designed and optimized for RSD-PCR: first amplify the single-strand target template from genomic DNA with a specific primer, then amplify the target gene using the specific primer and one of the universal RSD-primers. The optimized RSD-PCR was successfully applied in chromosome walking using specific internal primers, and in cloning of new genes using degenerate primers derived from the NH2-terminal amino acid sequence of a protein.

12.
13.
The nucleic acid sequences coding for 23 H3 histone genes from a variety of species have been analyzed using a computer assisted alignment and analysis program. Although these histones are highly conserved within and between highly divergent species, they represent various classes of histones whose patterns of expression are distinctively regulated. Surprisingly, in dendrograms derived from these comparisons, H3 sequences cluster according to their modes of regulation rather than phylogenetically. These clusters are generated from highly distinctive patterns of codon usage within the functional gene classes. We suggest that one factor involved in specifying the differing codon usage patterns between functional classes is a difference in requirements for rapid translation of mRNA. In addition, the data presented here, together with structural and sequence information, suggest a heterodox evolutionary model in which genes related to the intron-bearing, basally expressed H3.3 vertebrate genes are the ancestors of the intronless H3.1 class of genes of higher eukaryotes. The H3.1 class must have arisen, therefore, following duplication of a primitive H3.3 gene, but prior to the plant-animal divergence. Implications of the data presented are discussed with regard to functional and evolutionary relationships.

14.
A novel bias in codon third-letter usage was found in Escherichia coli genes with low fractions of "optimal codons", by comparing intact sequences with control random sequences. Third-letter usage has been found to be biased according to preference in codon usage and to doublet preference from the following first letter. The present study examines third-letter usage in the context of the nucleotide sequence when these preferences are considered. In order to exclude any influence by these factors, the random sequences were generated such that the amino acid sequence, codon usage, and the doublet frequency in each gene were all preserved. Comparison of intact sequences with these randomly generated sequences reveals that third letters of codons show a strong preference for the purine/pyrimidine pattern of the next codons: purine (R) is preferred to pyrimidine (Y) at the third site when followed by an R-Y-R codon, and pyrimidine is preferred when followed by an R-R-Y, an R-Y-Y or a Y-R-Y codon. This bias is probably related to interactions of tRNA molecules in the ribosome.
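A simplified version of the control-sequence idea in this abstract can be sketched in Python. The sketch below preserves only two of the three constraints named above, the amino acid sequence and the per-gene codon usage, by shuffling codons among positions that encode the same amino acid; it does not preserve doublet frequency, which the study also held fixed. The function name `shuffle_synonymous` is hypothetical and the standard genetic code is assumed.

```python
import random
from collections import defaultdict
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: aa
               for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)}


def shuffle_synonymous(cds, rng=None):
    """Shuffle codons among positions that encode the same amino acid.

    Keeps the protein sequence and the codon counts of `cds` unchanged.
    NOTE: doublet (codon-boundary dinucleotide) frequency is NOT preserved.
    """
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    if rng is None:
        rng = random.Random()
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    positions = defaultdict(list)
    for idx, codon in enumerate(codons):
        positions[CODON_TABLE[codon]].append(idx)
    shuffled = list(codons)
    for idxs in positions.values():
        pool = [codons[i] for i in idxs]
        rng.shuffle(pool)  # permute this amino acid's codons among its sites
        for i, codon in zip(idxs, pool):
            shuffled[i] = codon
    return "".join(shuffled)
```

Passing a seeded generator, for example `shuffle_synonymous(cds, random.Random(0))`, makes the control sequence reproducible.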

15.
Codon usage in bacteria: correlation with gene expressivity (total citations: 153; self-citations: 53; citations by others: 100)
The nucleic acid sequence bank now contains over 600 protein coding genes of which 107 are from prokaryotic organisms. Codon frequencies in each new prokaryotic gene are given. Analysis of genetic code usage in the 83 sequenced genes of the Escherichia coli genome (chromosome, transposons and plasmids) is presented, taking into account new data on gene expressivity and regulation as well as iso-tRNA specificity and cellular concentration. The codon composition of each gene is summarized using two indexes: one is based on the differential usage of iso-tRNA species during gene translation, the other on the choice between cytosine and uracil for the third base. A strong relationship between codon composition and mRNA expressivity is confirmed, even for genes transcribed in the same operon. The influence of codon use on peptide elongation rate and protein yield is discussed. Finally, the evolutionary aspect of codon selection in mRNA sequences is studied.

16.
DIGICALC is a program designed to aid in the acquisition, storage, and analysis of nucleic acid restriction fragment data. The chief considerations during program design were (i) ease of use for people with varying degrees of computer experience, (ii) minimal hardware requirements (e.g. an IBM PC), (iii) portability and ease of modification, and (iv) improved functionality in sizing and comparing restriction fragments over manual methods. The program accepts manual or digitizer input of nucleic acid fragment mobility, calculates the fragments' sizes, and provides the means to search the fragment database and to produce charts of fragment sizes.

17.
In this study, we introduce a novel bioinformatics program, Spore-associated Symbiotic Microbes Position-specific Function (SeSaMe PS Function), for position-specific functional analysis of short sequences derived from metagenome sequencing data of the arbuscular mycorrhizal fungi. The unique advantage of the program lies in databases created based on genus-specific sequence properties derived from protein secondary structure, namely amino acid usages, codon usages, and codon contexts of 3-codon DNA 9-mers. SeSaMe PS Function searches a query sequence against a reference sequence database, identifies 3-codon DNA 9-mers with structural roles, and creates a comparative dataset containing the codon usage biases of the 3-codon DNA 9-mers from 54 bacterial and fungal genera. The program applies correlation principal component analysis in conjunction with the K-means clustering method to the comparative dataset. 3-codon DNA 9-mers clustered as a sole member or with only a few members are often structurally and functionally distinctive sites that provide useful insights into important molecular interactions. The program provides a versatile means for studying functions of short sequences from metagenome sequencing and has a wide spectrum of applications. SeSaMe PS Function is freely accessible at www.fungalsesame.org.

18.
The DNA sequence organization of the protein encoding region of the gene for silk fibroin has been analyzed. The accompanying paper (Manning, R. F., and Gage, L. P. (1980) J. Biol. Chem. 255, 9451-9457) shows that the total length of the gene and its protein, as well as the pattern of restriction sites in the gene, are highly polymorphic among inbred stocks of Bombyx mori. In this paper, those features of fibroin gene structure which are invariant among these alleles are presented. Fibroin is composed primarily of relatively short "crystalline" and "amorphous" peptides of known sequence whose arrangement in the protein is unknown. Knowledge of the codons most commonly used in fibroin mRNA allowed utilization of particular restriction enzymes as a means for determining the nature and organization of crystalline and amorphous coding sequences in the fibroin gene. Three restriction endonucleases were identified that cleave sequences coding for amorphous region peptides. Their cleavage pattern revealed that the repetitive coding sequence of the gene core (approximately 15 kilobases) is divided into at least 10 large crystalline coding domains interrupted by smaller amorphous coding domains. Many restriction endonucleases do not cleave the fibroin core at all, three of them with four-base recognition sequences. Specific deductions as to codon usage and repetitive sequence homogeneity in the gene follow from these results. One novel finding is the rigorous exclusion of the glycine codon GGA prior to serine codons even though this glycine codon is used frequently prior to alanine codons. The sequence homogeneity and the regularly alternating arrangement of crystalline and amorphous coding sequences of the gene are discussed in terms of the function of fibroin protein and the evolution of highly repetitive DNA.

19.
The nucleotide sequence of the ppc gene, the structural gene for phosphoenolpyruvate carboxylase [EC 4.1.1.31], of Escherichia coli K-12 was determined. The gene codes for a polypeptide comprising 883 amino acid residues with a calculated molecular weight of 99,061. The amino acid sequence deduced from the nucleotide sequence was entirely consistent with the protein chemical data obtained with the purified enzyme, including the NH2- and COOH-terminal sequences and amino acid composition. The coding region is preceded by two putative ribosome binding sites, and is followed closely by a good representative of a rho-independent terminator. The codon usage in the ppc gene suggests a moderate expression of the gene. The secondary structure of the enzyme was predicted from the deduced amino acid sequence.

20.
