Similar literature
 Found 20 similar articles
1.
The codon usage in individual protein genes has been calculated using nucleotide sequences obtained from the GenBank Genetic Sequence Database. The sum of the codon use of each organism has also been calculated. The data files can be obtained from the anonymous ftp sites of DDBJ, DISC and EBI. The list of codon usage of genes in each organism has been made searchable by organism name through a web site. The compilation has been synchronized with a major release of GenBank.
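The tabulation described above (codon counts per gene, then summed per organism) can be sketched as follows. This is a minimal illustration, not the CUTG pipeline: the function names are hypothetical, each sequence is assumed to be a complete coding sequence read in frame from its first base, and reporting frequencies per thousand codons is an assumed display convention.

```python
from collections import Counter

def count_codons(cds: str) -> Counter:
    """Count codons in one coding sequence, read in frame from position 0."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))

def sum_codon_usage(genes) -> Counter:
    """Sum the per-gene codon counts of one organism."""
    total = Counter()
    for cds in genes:
        total.update(count_codons(cds))
    return total

def per_thousand(counts: Counter) -> dict:
    """Convert raw counts to frequencies per 1000 codons."""
    n = sum(counts.values())
    return {codon: 1000 * c / n for codon, c in counts.items()}

# Toy sequences invented for illustration only.
genes = ["ATGGCTTAA", "ATGGCCTAA"]
organism_total = sum_codon_usage(genes)
print(organism_total)
print(per_thousand(organism_total))
```

Keeping the per-gene and per-organism steps separate mirrors the two levels of data the abstract mentions.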

2.
Codon usage in 87 602 genes has been calculated using the nucleotide sequence data obtained from the GenBank Genetic Sequence Data Bank (Release 90.0; September 1995). The database is called the CUTG Database; the complete form of the database can be obtained by anonymous ftp from DDBJ, and a part of the database, which lists the frequency of codon use in each organism, is made searchable through our World Wide Web server.

3.
CUTG (codon usage tabulated from GenBank) is a comprehensive database for codon usage. The codon usage for each full-length protein gene has been calculated using the nucleotide sequence obtained from the GenBank sequence database. The sum of the codon use of each organism has also been calculated. The data files can be obtained from the anonymous ftp sites of DDBJ, DISC and EBI. The list of codon usage of genes in each organism has been made searchable by organism name through a web site, http://www.dna.affrc.go.jp/~nakamura/CUTG.html. The compilation is synchronized with each major release of GenBank.

4.
Frequencies for each of the 206 526 complete protein-coding genes (CDSs) have been compiled from taxonomical divisions of the GenBank DNA sequence database. The sum of the codon use of 7434 organisms has also been calculated. These data files can be obtained from the anonymous ftp sites of DDBJ, DISC and EBI. The list of the codon usage of genes in an organism, as well as the sum of the codon usage of the organism, has been made searchable by organism name through a web site, http://www.dna.affrc.go.jp//CUTG.html

5.
6.
Summary. The nature and extent of DNA sequence divergence between homologous protein-coding genes from Escherichia coli and Salmonella typhimurium have been examined. The degree of divergence varies greatly among genes at both synonymous (silent) and nonsynonymous sites. Much of the variation in silent substitution rates can be explained by natural selection on synonymous codon usage, varying in intensity with gene expression level. Silent substitution rates also vary significantly with chromosomal location, with genes near oriC having lower divergence. Certain genes have been examined in more detail. In particular, the duplicate genes encoding elongation factor Tu, tufA and tufB, from S. typhimurium have been compared to their E. coli homologues. As expected, these very highly expressed genes have high codon usage bias and have diverged very little between the two species. Interestingly, these genes, which are widely spaced on the bacterial chromosome, also appear to be undergoing concerted evolution, i.e., there has been exchange between the loci subsequent to the divergence of the two species. Presented at the NATO Advanced Research Workshop on Genome Organization and Evolution, held in Spetses, Greece, September 1990

7.
We analyze the frequencies of synonymous codons in animal mitochondrial genomes, focusing particularly on mammals and fish. The frequencies of bases at 4-fold degenerate sites are found to be strongly influenced by context-dependent mutation, which causes correlations between pairs of neighboring bases. There is a pattern of excess of certain dinucleotides and deficit of others that is consistent across large numbers of species, despite the wide variation of single-nucleotide frequencies among species. In many bacteria, translational selection is an important influence on codon usage. In order to test whether translational selection also plays a role in mitochondria, we need to control for context-dependent mutation. Selection for translational accuracy can be detected by comparison of codon usage in conserved and variable sites in the same genes. We give a test of this type that works in the presence of context-dependent mutation. There is very little evidence for translational accuracy selection in the mitochondrial genes considered here. Selection for translational efficiency might lead to preference for codons that match the limited repertoire of anticodons on the mitochondrial tRNAs. This is difficult to detect because the effect would usually be in the same direction in comparable codon families and so would not cause an observable difference in codon usage between families. Several lines of evidence suggest that this type of selection is weak in most cases. However, we found several cases where unusual bases occur at the wobble position of the tRNA, and in these cases, some evidence for selection on codon usage was found. We discuss the way that these unusual cases are associated with codon reassignments in the mitochondrial genetic code.

8.
ACNUC is a database structure and retrieval software for use with either the GenBank or EMBL nucleic acid sequence data collections. The nucleotide and textual data furnished by both collections are each restructured into a database that allows sequence retrieval on a multi-criterion basis. The main selection criteria are: species (or higher order taxon), keyword, reference, journal, author, and organelle; all logical combinations of these criteria can be used. Direct access to sequence regions that code for a specific product (protein, tRNA or rRNA) is provided. A versatile extraction procedure copies selected sequences, or fragments of them, from the database to user files suitable to be analysed by user-supplied application programs. A detailed help mechanism is provided to aid the user at any time during the retrieval session. All software has been written in FORTRAN 77, which guarantees a high degree of transportability to minicomputers or mainframes. Received on May 1, 1985; accepted on June 13, 1985

9.
CyanoBase provides an online resource for access to data on genomic information about the cyanobacterium Synechocystis sp. strain PCC6803. The database contains annotations for each protein-coding gene deduced from the entire nucleotide sequence of the genome, gene classification lists, and keyword and similarity search engines. Core portions of CyanoBase consist of annotations for each of the 3168 protein genes deduced from the entire nucleotide sequence of this genome. The contents of each gene were improved by updating with the results of similarity searches and by introducing references for analysis in bioinformatics. The database now contains repository facilities that store and provide experimental information, in addition to providing proposals for the function of each gene. This information should help to avoid unnecessary, overlapping experiments and should assist communication between scientists who wish to elucidate the function of putative genes on the cyanobacteria genome. The current URL of CyanoBase is http://www.kazusa.or.jp:8080/cyano/

10.
11.
Sequence analysis of the ribosomal RNA operon, particularly the internal transcribed spacer (ITS) region, provides a powerful tool for identification of mycorrhizal fungi. The sequence data deposited in the International Nucleotide Sequence Databases (INSD) are, however, unfiltered for quality and are often poorly annotated with metadata. To detect chimeric and low-quality sequences and assign the ectomycorrhizal fungi to phylogenetic lineages, fungal ITS sequences were downloaded from INSD, aligned within family-level groups, and examined through phylogenetic analyses and BLAST searches. By combining the fungal sequence database UNITE and the annotation and search tool PlutoF, we also added metadata from the literature to these accessions. Altogether 35,632 sequences belonged to mycorrhizal fungi or originated from ericoid and orchid mycorrhizal roots. Of these sequences, 677 were considered chimeric and 2,174 of low read quality. Information detailing country of collection, geographical coordinates, interacting taxon and isolation source was supplemented to cover 78.0%, 33.0%, 41.7% and 96.4% of the sequences, respectively. These annotated sequences are publicly available via UNITE (http://unite.ut.ee/) for downstream biogeographic, ecological and taxonomic analyses. In the European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/), the annotated sequences have a special link-out to UNITE. We intend to expand the data annotation to additional genes and all taxonomic groups and functional guilds of fungi.

12.
This paper provides an overview of the advances in the estimation of genetic risks of exposure of human populations to ionizing radiation, with particular emphasis on the advances during the last decade. Among the latter are: (a) an upward revision of the estimates of the baseline frequencies of Mendelian diseases (from 1.25 to 2.4%); (b) the conceptual change to the use of a doubling dose based on human data on spontaneous mutation rates and mouse data on induced mutation rates (from the one based entirely on mouse data on spontaneous and induced mutation rates, which was the case thus far); (c) the fuller development of the concept of mutation component (MC) and its application to predict the responsiveness of Mendelian and chronic multifactorial diseases to induced mutations; (d) the concept that the major adverse effects of radiation exposure of human germ cells are likely to be manifest as multi-system developmental abnormalities; and (e) the concept of potential recoverability correction factor (PRCF) to bridge the gap between induced mutations studied in mice and the risk of genetic disease in humans. For a population exposed to low LET, chronic/low dose-rate irradiation, the current estimates of risk for the first generation progeny are the following (all estimates per million live born progeny per Gy of parental irradiation): autosomal dominant and X-linked diseases, approximately 750 to 1,500 cases; autosomal recessive, nearly zero; chronic multifactorial diseases, approximately 250 to 1,200 cases; and congenital abnormalities, approximately 2,000 cases. The total risk per Gy is of the order of 3,000 to 4,700 cases, which represents approximately 0.4 to 0.6% of the baseline frequency of these diseases. The main message is that at low doses of radiation of interest in risk estimation, the risk of adverse hereditary effects is small.
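The quoted total of 3,000 to 4,700 cases can be checked by adding the per-category ranges. The short sketch below does only that arithmetic; treating "nearly zero" as exactly 0 is an assumption made for the sum, and the dictionary and function names are illustrative.

```python
# First-generation risk estimates quoted in the abstract, as (low, high)
# cases per million live births per Gy of parental irradiation.
# Assumption: "nearly zero" for autosomal recessive disease is entered as 0.
RISKS = {
    "autosomal dominant and X-linked": (750, 1500),
    "autosomal recessive": (0, 0),
    "chronic multifactorial": (250, 1200),
    "congenital abnormalities": (2000, 2000),
}

def total_risk(risks):
    """Sum the lower bounds and the upper bounds separately."""
    low = sum(lo for lo, _ in risks.values())
    high = sum(hi for _, hi in risks.values())
    return low, high

print(total_risk(RISKS))  # (3000, 4700)
```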

13.
14.
The expression efficiency was improved for the recombinant single-chain variable fragment (scFv) against clenbuterol (CBL) obtained from mouse and expressed in the methylotrophic yeast Pichia pastoris GS115, by redesigning and synthesizing the DNA sequence encoding CBL-scFv based on the codon bias of P. pastoris. The codons encoding 124 amino acids were optimized, in which a total of 156 nucleotides were changed, and the G+C ratio was simultaneously decreased from 53 to 47.2 %. Under the optimized expression conditions, the yield of the recombinant CBL-scFv (41 kDa) antibodies was 0.223 g L–1 in shake culture. Compared to the non-optimized control, the expression level of the optimized recombinant CBL-scFv based on preferred codons in P. pastoris demonstrated a 2.35-fold higher yield. Furthermore, the recombinant CBL-scFv was purified by Ni-NTA column chromatography, and the purity was 95 %. The purified CBL-scFv showed good CBL recognition by a competitive indirect enzyme-linked immunoassay. The average concentration required for 50 % inhibition of binding and the limit of detection for the assay were 5.82 and 0.77 ng mL–1, respectively.
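The summary statistics reported for the redesigned gene (codons altered, nucleotides changed, G+C ratio before and after) are simple to compute from a pair of aligned sequences. The sketch below is an illustration under stated assumptions: both sequences have the same length, are read in the same frame, and the toy sequences are invented rather than taken from the CBL-scFv gene. Function names are hypothetical.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100 * sum(base in "GC" for base in seq) / len(seq)

def count_changes(original: str, optimized: str):
    """Return (changed codons, changed nucleotides) between two in-frame
    sequences of equal length."""
    if len(original) != len(optimized):
        raise ValueError("sequences must have the same length")
    original, optimized = original.upper(), optimized.upper()
    changed_nt = sum(a != b for a, b in zip(original, optimized))
    changed_codons = sum(
        original[i:i + 3] != optimized[i:i + 3]
        for i in range(0, len(original), 3)
    )
    return changed_codons, changed_nt

# Toy example: both sequences encode Ala-Leu-Lys under the standard code.
before = "GCGCTGAAA"
after = "GCTTTGAAA"
print(count_changes(before, after))
print(round(gc_percent(before), 1), round(gc_percent(after), 1))
```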

15.
Nucleotide sequence databases: a gold mine for biologists.   Total citations: 5; self-citations: 0; citations by others: 5

16.
17.
In the wake of the numerous now-fruitful genome projects, we have witnessed a 'tsunami' of sequence data and with it the birth of the field of bioinformatics. Bioinformatics involves the application of information technology to the management and analysis of biological data. For many of us, this means that databases and their search tools have become an essential part of the research environment. However, the rate of sequence generation and the haphazard proliferation of databases have made it difficult to keep pace with developments, even for the cognoscenti. Moreover, increasing amounts of sequence information do not necessarily equate with an increase in knowledge, and in the panic to automate the route from raw data to biological insight, we may be generating and propagating innumerable errors in our precious databases. In the genome era upon us, researchers want rapid, easy-to-use, reliable tools for functional characterisation of newly determined sequences. For the pharmaceutical industry in particular, the Pandora's box of bioinformatics harbours an information-rich nugget, ripe with potential drug targets and possible new avenues for the development of therapeutic agents. This review outlines the current status of the major pattern databases now used routinely in the analysis of protein sequences. The review is divided into three main sections. In the first, commonly used terms are defined and the methods behind the databases are briefly described; in the second, the structure and content of the principal pattern databases are discussed; and in the final part, several alignment databases, which are frequently confused with pattern databases, are mentioned. For the newcomer, the array of resources, the range of methods behind them and the different tools required to search them can be confusing.
The review therefore also briefly mentions a current international endeavour to integrate the diverse databases, an effort that should facilitate sequence analysis in the future. This is particularly important for target-discovery programmes, where the challenge is to rationalise the enormous numbers of potential targets generated by sequence database searches. This problem may be addressed, at least in part, by reducing search outputs to the more focused and manageable subsets suggested by searches of integrated groups of family-specific pattern databases.

18.
H Jörnvall. FEBS Letters, 1999, 456(1): 85-88
Motifer is a software tool able to find, directly in nucleotide databases, very distant homologues to an amino acid query sequence. It focuses searches on a specific amino acid pattern, scoring the matching and intervening residues as specified by the user. The program has been developed for searching databases of expressed sequence tags (ESTs), but it is also well suited to searching genomic sequences. The query sequence can be a variable pattern with alternative amino acids or gaps, and the sequences searched can contain introns or sequencing errors with accompanying frame shifts. Other features include options to generate a searchable output, set the maximal sequencing error frequency, limit searches to given species, or exclude already known matches. Motifer can find sequence homologues that other search algorithms would deem unrelated or would not find because of sequencing errors or too large a number of other homologues. The ability of Motifer to find relatives to a given sequence is exemplified by searches for members of the transforming growth factor-beta family and for proteins containing a WW-domain. The functions aimed at enhancing EST searches are illustrated by the 'in silico' cloning of a novel cytochrome P450 enzyme.
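The general idea of searching nucleotide data with an amino acid pattern can be illustrated by translating a sequence in all six reading frames and matching a pattern against each translation. The sketch below is not Motifer's algorithm: it uses a plain regular expression in place of Motifer's user-defined scoring, assumes the standard genetic code, and does not handle introns or frame shifts. All names and the toy sequence are hypothetical.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def translate(dna: str, frame: int) -> str:
    """Translate from offset `frame` (0, 1 or 2); unknown codons become X."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)

def find_pattern(dna: str, pattern: str):
    """Return (strand, frame, residue index, matched text) for every match
    of an amino acid regex in the six-frame translation of `dna`."""
    dna = dna.upper()
    strands = {"+": dna, "-": dna.translate(COMPLEMENT)[::-1]}
    hits = []
    for strand, seq in strands.items():
        for frame in range(3):
            protein = translate(seq, frame)
            for m in re.finditer(pattern, protein):
                hits.append((strand, frame, m.start(), m.group()))
    return hits

# Toy sequence: the third forward frame reads ATG GCT TGG TAA (M A W stop).
print(find_pattern("CCATGGCTTGGTAA", r"M[AG]W"))
```

A character class such as `[AG]` stands in for the "alternative amino acids" the abstract mentions, and `.{m,n}` could stand in for gaps; tolerance of sequencing errors would need an alignment-style approach rather than a regex.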

19.
The increasing popularity of DNA chip technology for the study of gene expression is producing, for each experiment, a sizable quantity of numerical data to analyse and an accompanying large number of gene identifiers that should be associated with the relevant biological annotation. We describe here a website at IFOM (FIRC Institute of Molecular Oncology) where we release regularly updated annotation tables for the most used Affymetrix oligonucleotide DNA chips and for the whole Research Genetics 46K clone collection for cDNA arrays. These tables are synchronised with every new release of the mouse and human UniGene databases (NCBI; National Center for Biotechnology Information), allowing fast and easy preliminary annotation of DNA array experiments. We also report some comparative evidence about the importance of biological database synchronisation and cross-references in the process of generating annotation tables for DNA chips.

20.