Similar articles
Found 20 similar articles (search time: 15 ms)
1.
A mutation spectra database for bacterial and mammalian genes.   Total citations: 1 (self-citations: 0, citations by others: 1)
Each mutation spectrum in this database is a dataset of changes in DNA base sequence in mutations induced in a gene by a particular mutagen (including spontaneous processes) under defined conditions. There are 240 datasets with 24 500 mutants in nine bacterial genes, two phage genes, five mammalian genes and one yeast gene. The database is available on the Web at http://info.med.yale.edu/mutbase/. The data tables can be viewed on the Web and downloaded in text form for local use. The data are also available in dBASE III, a format which can be utilized by essentially any desktop computer database program or spreadsheet, and makes feasible analyses of a large number of mutants. Researchers are invited to submit additional data. A data entry program, MUTSIN, diagrams each mutation on the computer screen as the data are entered and alerts the user to any discrepancies between the entry and the gene sequence.

2.
3.
The Yale database contains sequence changes in mutations induced in a number of bacterial, mammalian and yeast genes. It contains data in electronic form on more than 17,000 mutations (July, 1994), is periodically updated, and is available without cost on the Internet and on diskettes. Researchers are invited to contribute additional results; a data entry program, MUTSIN, is provided to facilitate adding new data and to minimize errors.

4.
IRIS: a database surveying known human immune system genes   Total citations: 4 (self-citations: 0, citations by others: 4)
Kelley J, de Bono B, Trowsdale J. Genomics, 2005, 85(4): 503-511
We have compiled an online database of known human defense genes: the Immunogenetic Related Information Source (IRIS). As of October 1, 2004, there are 1562 immune genes recorded in IRIS, representing 7% of the human genome. This resource contains searchable information including chromosomal location, sequence data, and a curated functional annotation for each entry. We used IRIS as a basis for analyzing the composition and characteristics of the immune genome, such as gene clustering, polymorphism, and relationship to disease. High protein sequence similarity correlated inversely with distance between immune genes, consistent with clustering of duplicated loci. We also found that, even though some immune genes exhibit high levels of polymorphism, such as MHC class I, the range of levels of polymorphism in immune genes is similar to that of nonimmune genes. Approximately 20% of immune genes have a known disease association. IRIS is available online at .

5.
Ligand-Gated Ion Channels (LGIC) are polymeric transmembrane proteins involved in the fast response to numerous neurotransmitters. All these receptors are formed by homologous subunits and the last two decades revealed an unexpected wealth of genes coding for these subunits. The Ligand-Gated Ion Channel database (LGICdb) has been developed to handle this increasing amount of data. The database aims to provide only one entry for each gene, containing annotated nucleic acid and protein sequences. The repository is carefully structured and the entries can be retrieved by various criteria. In addition to the sequences, the LGICdb provides multiple sequence alignments, phylogenetic analyses and atomic coordinates when available. The database is accessible via the World Wide Web (http://www.pasteur.fr/recherche/banques/LGIC/LGIC.html), where it is continuously updated. Version 16 (September 2000), available for download, contained 333 entries covering 34 species.

6.
A relational database model for describing DNA mutations is presented. The model was developed in conjunction with the human hprt database and was successful in representing over 1800 hprt mutations. Mutants showing aberrant mRNA splicing can be adequately described using the model, as well as mutants showing more than one mutation. The basic aspects of the relational model should be applicable to mutations in a variety of genes. A data entry program developed using Microsoft Access 2.0 is also described that implements the relational model. The data entry program ensures that relational integrity is maintained between the tables and automatically generates key fields as needed. The program also has the ability to convert between the various numbering schemes that are used to describe base pair location in the hprt gene. The program and source code are placed in the public domain so that other experimenters can adapt the program for use with other genes.

7.
This database consists of over 24 000 mutations in 18 viral, bacterial, yeast or mammalian genes. The data are grouped as sets of DNA base sequence changes or spectra caused by a particular mutagen under defined conditions. The spectra are available on the World Wide Web at http://info.med.yale.edu/mutbase/ in two formats: in text format that can be browsed on-line or downloaded for use with a text editor, and in dBASE III format for use, after downloading, by relational database programs or by spreadsheets. Researchers are encouraged to submit DNA sequence changes to a suitable mutation database such as ours. A data entry program, MUTSIN, can be retrieved from this site. MUTSIN diagrams each mutation on the computer screen and alerts the user to any discrepancies.

8.
9.
10.
We have constructed a non-homologous database, termed the Integrated Sequence-Structure Database (ISSD) which comprises the coding sequences of genes, amino acid sequences of the corresponding proteins, their secondary structure and straight phi,psi angles assignments, and polypeptide backbone coordinates. Each protein entry in the database holds the alignment of nucleotide sequence, amino acid sequence and the PDB three-dimensional structure data. The nucleotide and amino acid sequences for each entry are selected on the basis of exact matches of the source organism and cell environment. The current version 1.0 of ISSD is available on the WWW at http://www.protein.bio.msu.su/issd/ and includes 107 non-homologous mammalian proteins, of which 80 are human proteins. The database has been used by us for the analysis of synonymous codon usage patterns in mRNA sequences showing their correlation with the three-dimensional structure features in the encoded proteins. Possible ISSD applications include optimisation of protein expression, improvement of the protein structure prediction accuracy, and analysis of evolutionary aspects of the nucleotide sequence-protein structure relationship.

11.
Dondeti VR, Sipe CW, Saha MS. BioTechniques, 2004, 37(5): 768-70, 772, 774-6
Microarray technology has become an important tool for studying large-scale gene expression for a diversity of biological applications. However, there are a number of experimental settings for which commercial arrays are either unsuitable or unavailable despite the existence of sequence information. With the increasing availability of custom array manufacturing services, it is now feasible to design high-density arrays for any organism having sequence data. However, there have been relatively few reports discussing gene selection, an important first step in array design. Here we propose an in silico strategy for custom microarray gene selection that is applicable to a wide range of organisms, based on utilizing public domain microarray information to interrogate existing sequence data and to identify a set of homologous genes in any organism of interest. We demonstrate the utility of this approach by applying it to the selection of candidate genes for a custom Xenopus laevis microarray. A significant finding of this study is that 3%-4% of Xenopus expressed sequence tags (ESTs) are in an orientation contrary to that indicated in the public database entry (http://mssaha.people.wm.edu/suppMSS.html).

12.
We continued our effort to make a comprehensive database (LISTA) for the yeast Saccharomyces cerevisiae. In this database each sequence has been attributed a single genetic name. In the case of duplicated sequences a simple method has been applied to distinguish between sequences of one and the same gene from non-allelic sequences of duplicated genes. If necessary, synonyms are given in the case of allelic duplicated sequences. Thus sequences can be found either by the name or by synonyms given in LISTA. Each entry contains the genetic name, the mnemonic from the EMBL data bank, the codon bias, reference of the publication of the sequence, chromosomal location as far as known, Swissprot and EMBL accession numbers. To obtain more information on the included sequences, each entry has been screened against non-redundant nucleotide and protein data bank collections resulting in LISTA-HON and LISTA-HOP. The LISTA data base can be linked to the associated data sets or to nucleotide and protein banks by the Sequence Retrieval System (SRS).

13.
We continued our effort to make a comprehensive database (LISTA) for the yeast Saccharomyces cerevisiae. As in previous editions the genetic names are consistently associated to each sequence with a known and confirmed ORF. If necessary, synonyms are given in the case of allelic duplicated sequences. Although the first publication of a sequence gives, according to our rules, the genetic name of a gene, in some instances more commonly used names are given to avoid nomenclature problems and the use of ancient designations which are no longer used. In these cases the old designation is given as synonym. Thus sequences can be found either by the name or by synonyms given in LISTA. Each entry contains the genetic name, the mnemonic from the EMBL data bank, the codon bias, reference of the publication of the sequence, chromosomal location as far as known, SWISSPROT and EMBL accession numbers. New entries will also contain the name from the systematic sequencing efforts. Since the release of LISTA4.1 we update the database continuously. To obtain more information on the included sequences, each entry has been screened against non-redundant nucleotide and protein data bank collections resulting in LISTA-HON and LISTA-HOP. This release includes reports from full Smith and Waterman peptide-level searches against a non-redundant protein sequence database. The LISTA data base can be linked to the associated data sets or to nucleotide and protein banks by the Sequence Retrieval System (SRS). The database is available by FTP and on World Wide Web.

14.
To deduce the entire sequence of the top arm of the Arabidopsis thaliana chromosome 3, the sequence determination was performed on a total of 90 P1, TAC and BAC clones chosen according to our sequencing strategy. Sequence features of the resulting 4,251,695 bp regions were analyzed with various computer programs for similarity search and gene modeling. As a result, a total of 941 potential protein-coding genes were identified. The average density of the genes identified was 1 gene per 4210 bp. Introns were observed in 73% of the genes, and the average number per gene and the average length of the introns were 3.6 and 159 bp, respectively. These sequence features are essentially identical to those of chromosomes 3 and 5 in our previous reports. The regions also contained 14 tRNA genes when searched by similarity to reported tRNA genes and the tRNAscan-SE program. The sequence data and information on the potential genes are available through the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/kaos/.

15.
The amount of nucleotide sequence data is increasing exponentially. We therefore made an effort to make a comprehensive database (LISTA) for the yeast Saccharomyces cerevisiae. Each sequence has been attributed a single genetic name and in the case of allelic duplicated sequences, synonyms are given, if necessary. For the nomenclature we have introduced a standard principle for naming gene sequences based on priority rules. We have also applied a simple method to distinguish duplicated sequences of one and the same gene from non-allelic sequences of duplicated genes. By using these principles we have sorted out a lot of confusion in the literature and databanks. Along with the genetic name, the mnemonic from the EMBL databank, the codon bias, reference of the publication of the sequence and the EMBL accession numbers are included in each entry.

16.
Chlamydomonas reinhardtii is a unicellular green alga that is a key model organism in the study of photosynthesis and oxidative stress. Here we describe the large‐scale generation of a population of insertional mutants that have been screened for phenotypes related to photosynthesis and the isolation of 459 flanking sequence tags from 439 mutants. Recent phylogenomic analysis has identified a core set of genes, named GreenCut2, that are conserved in green algae and plants. Many of these genes are likely to be central to the process of photosynthesis, and they are over‐represented by sixfold among the screened insertional mutants, with insertion events isolated in or adjacent to 68 of 597 GreenCut2 genes. This enrichment thus provides experimental support for functional assignments based on previous bioinformatic analysis. To illustrate one of the uses of the population, a candidate gene approach based on genome position of the flanking sequence of the insertional mutant CAL027_01_20 was used to identify the molecular basis of the classical C. reinhardtii mutation ac17. These mutations were shown to affect the gene PDH2, which encodes a subunit of the plastid pyruvate dehydrogenase complex. The mutants and associated flanking sequence data described here are publicly available to the research community, and they represent one of the largest phenotyped collections of algal insertional mutants to date.

17.
Conserved sequence amplification (CSA) has been used to obtain sequence data for two glycosidase genes from the primitive eukaryote Tritrichomonas foetus. Few genes have been cloned from this organism, and there is little information concerning protein sequence. CSA is reliant on the use of database searches to identify short sequences of 3–9 amino acids conserved within a protein across a wide range of species. PCR primers are then constructed based on this sequence data and the DNA is amplified and sequenced. In the case of the β-galactosidase gene, N-terminal amino acid sequence data were used to construct a primer that replaced the upstream primer to ensure the amplified product was related to β-d-galactosidase. CSA was also applied to the gene encoding the enzyme β-N-acetyl-d-glucosaminidase from T. foetus, but in this case a segment of DNA was amplified, which, if correct, should contain a third conserved motif. The products of the CSA were sequenced, and the data obtained were compared to data in the SwissProt database. The results obtained suggest that this approach is useful for the cloning of genes to obtain novel sequence data from organisms where little genetic information is available.

18.
FlyBase (http://flybase.bio.indiana.edu/) is a comprehensive database of genetic and molecular data concerning Drosophila. FlyBase is maintained as a relational database (in Sybase) and is made available as html documents and flat files. The scope of FlyBase includes: genes, alleles (with phenotypes), aberrations, transposons, pointers to sequence data, gene products, maps, clones, stock lists, Drosophila workers and bibliographic references.

19.
Complete structure of the chloroplast genome of a legume, Lotus japonicus.   Total citations: 4 (self-citations: 0, citations by others: 4)
The nucleotide sequence of the entire chloroplast genome (150,519 bp) of a legume, Lotus japonicus, has been determined. The circular double-stranded DNA contains a pair of inverted repeats of 25,156 bp which are separated by a small and a large single copy region of 18,271 bp and 81,936 bp, respectively. A total of 84 predicted protein-coding genes including 7 genes duplicated in the inverted repeat regions, 4 ribosomal RNA genes and 37 tRNA genes (30 gene species) representing 20 amino acids species were assigned on the genome based on similarity to genes previously identified in other chloroplasts. All the predicted genes were conserved among dicot plants except that rpl22, a gene encoding chloroplast ribosomal protein CL22, was missing in L. japonicus. Inversion of a 51-kb segment spanning rbcL to rps16 (positions 5161-56,176) in the large single copy region was observed in the chloroplast genome of L. japonicus. The sequence data and gene information are available on our World Wide Web database at http://www.kazusa.or.jp/en/plant/database.html.

20.
Finding the position of a gene is now easily done when the genome sequence is available: the gene position is generally found by a simple query of genomic databases such as those available at the Ensembl browser or the NCBI. We were interested in determining the position of 125 cancer-related rat genes and we found that the position of most of these genes (110) could indeed be identified in this manner. However, in 15 cases, the gene position was not available in these databases, or the results were ambiguous. We then explored a more specialized database, namely the Rat Genome Database, and experimentally mapped these genes using standard and radiation cell hybrids. The 15 genes in question could be localized unambiguously. In four cases, the radiation cell hybrids were indispensable: the sequence of these four genes could not be found in the rat genome sequence. On the basis of the sample we examined, it thus appears that a classical gene mapping method is still required to localize about 3% of the rat genes, as if 3% of the rat gene sequences were lacking in the current rat genome sequence.
