Similar Literature
 20 similar articles found; search took 953 ms
1.
MitBASE is an integrated and comprehensive database of mitochondrial DNA data which collects, under a single interface, databases for Plant, Vertebrate, Invertebrate, Human, Protist and Fungal mtDNA and a Pilot database on nuclear genes involved in mitochondrial biogenesis in Saccharomyces cerevisiae. MitBASE reports all available information from different organisms and from intraspecies variants and mutants. Data have been drawn from the primary databases and from the literature; value-added information has been structured, e.g., editing information on protist mtDNA genomes, pathological information for human mtDNA variants, etc. The different databases, some of which are structured using commercial packages (Microsoft Access, File Maker Pro) while others use a flat-file format, have been integrated under ORACLE. Ad hoc retrieval systems have been devised for some of the above listed databases, taking into account their peculiarities. The database is resident at the EBI and is available at the following site: http://www3.ebi.ac.uk/Research/Mitbase/mitbase.pl. The impact of this project is intended for both basic and applied research. The study of mitochondrial genetic diseases and mitochondrial DNA intraspecies diversity are key topics in several biotechnological fields. The database has been funded within the EU Biotechnology programme.

2.
MitBASE is a comprehensive and integrated mitochondrial genome database funded within the EU BIOTECH PROGRAM. It is a project for the development and implementation of an integrated and comprehensive database of mitochondrial data which will collect all available information from different organisms and from intraspecies variants and mutants. The present paper describes the structure of the Human dataset in mitBASE where human molecular data are distinguished from clinical and pathological data. MitBASE home page address is: http://www.ebi.ac.uk/htbin/Mitbase/mitbase.pl

3.
4.
DNA barcoding is a new tool for taxon recognition and classification of biological organisms based on the sequence of a fragment of a mitochondrial gene, cytochrome c oxidase I (COI). In view of the growing importance of fish DNA barcoding for species identification, molecular taxonomy and fish diversity conservation, we developed a Fish Barcode Information System (FBIS) for Indian fishes, which will serve as a regional DNA barcode archival and analysis system. The database presently contains 2334 sequence records of the COI gene for 472 aquatic species belonging to 39 orders and 136 families, collected from available published data sources. Additionally, it contains information on phenotype, distribution and IUCN Red List status of fishes. The web version of FBIS was designed using MySQL, Perl and PHP under the Linux operating platform to (a) store and manage the acquisition, (b) analyze and explore DNA barcode records and (c) identify species and estimate genetic divergence. FBIS has also been integrated with appropriate tools for retrieving and viewing information about the database statistics and taxonomy. It is expected that FBIS would be useful as a potent information system in fish molecular taxonomy, phylogeny and genomics. AVAILABILITY: The database is available for free at http://mail.nbfgr.res.in/fbis/
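
The abstract mentions identifying species and estimating genetic divergence from COI barcodes but does not say which distance measure FBIS uses. As a rough illustration only, the sketch below assumes pre-aligned sequences of equal length and uses the uncorrected p-distance (proportion of differing sites); the function names and the toy reference set are hypothetical and are not taken from FBIS.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def closest_reference(query, references):
    """Return (name, distance) of the reference barcode nearest to the query.

    `references` maps a species name to an aligned barcode sequence
    (hypothetical data layout, chosen for this example).
    """
    name = min(references, key=lambda n: p_distance(query, references[n]))
    return name, p_distance(query, references[name])


# Toy data, not real COI sequences.
refs = {"species_A": "ACGTACGTAC", "species_B": "ACGTTTGTAC"}
best_name, best_dist = closest_reference("ACGTACGTAA", refs)
```

A nearest-match rule like this is only a first pass; barcoding studies often apply a substitution model and a divergence threshold before assigning a species.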

5.
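Environmental DNA (eDNA) technology is a new approach to monitoring and assessing ecology and biodiversity, and a complete and accurate reference sequence library is the foundation for applying eDNA to surveys of aquatic biodiversity. At present, eDNA reference sequences for aquatic organisms still have many problems: different taxa use different marker genes and the resources are scattered, some reference sequences are taxonomically inaccurate, and few reference sequences exist for aquatic organisms from China's various water bodies. To address these problems, this study built the aquatic organism eDNA database (AeDNA, http://aedna.ihb.ac.cn/). AeDNA integrates two types of reference sequence: DNA barcodes and genomes. The DNA barcodes comprise more than 600,000 sequences of markers such as 18S, 28S, ITS, COI, 12S and rbcL, covering more than 20,000 fish species, more than 10,000 aquatic plant species, more than 10,000 benthic animal species, more than 10,000 zooplankton species and more than 10,000 phytoplankton species. The genomes comprise 6199 organelle genomes, including mitochondrial and chloroplast genomes, together with the species genomes produced by the 10,000 Fish Genomes Project and the 10,000 Protist Genomes Project. The habitats covered include rivers, lakes, seas, glaciers, hot springs and other aquatic environments; in particular, the more than 60,000 reference sequences contributed by the database construction team carry rich habitat information for China's various water bodies. Overall, AeDNA is a comprehensive eDNA reference sequence library with a large data volume, broad taxonomic coverage, high accuracy and a distinctive focus on China's aquatic organisms, and it is an important basic resource for monitoring aquatic ecology and aquatic biodiversity.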

6.
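GEO (Gene Expression Omnibus): a high-throughput gene expression database   Total citations: 2 (self-citations: 0, citations by others: 2)
 The GEO (Gene Expression Omnibus) database covers a broad range of high-throughput experimental data, including single-channel and dual-channel microarray-based measurements of mRNA abundance, as well as experimental data on genomic DNA and protein molecules. Data from non-array-based high-throughput functional genomics and proteomics technologies, such as serial analysis of gene expression (SAGE) and protein identification technologies, are also archived. To date, the GEO database contains data covering 10 000 hybridization experiments and SAGE libraries from 30 different organisms. This paper outlines querying and browsing of GEO, data download and formats, data analysis, and storage and updating, with an emphasis on the use of controlled vocabularies in the GEO data browser; it also describes data mining of GEO and the prospects for its application in molecular biology. GEO can be accessed directly at the public address http://www.ncbi.nlm.nih.gov/projects/geo/.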

7.
GenBank   Total citations: 51 (self-citations: 4, citations by others: 47)
The GenBank® sequence database incorporates publicly available DNA sequences of >55 000 different organisms, primarily through direct submission of sequence data from individual laboratories and large-scale sequencing projects. Most submissions are made using the BankIt (Web) or Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Data exchange with the EMBL Data Library and the DNA Data Bank of Japan helps ensure comprehensive worldwide coverage. GenBank data is accessible through NCBI's integrated retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping and protein structure information, plus the biomedical literature via PubMed. Sequence similarity searching is provided by the BLAST family of programs. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. NCBI also offers a wide range of WWW retrieval and analysis services based on GenBank data. The GenBank database and related resources are freely accessible via the NCBI home page at http://www.ncbi.nlm.nih.gov

8.
GenBank
GenBank® is a comprehensive sequence database that contains publicly available DNA sequences for more than 119 000 different organisms, obtained primarily through the submission of sequence data from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the BankIt (web) or Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the EMBL Data Library in the UK and the DNA Data Bank of Japan helps ensure worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, go to the NCBI home page at: http://www.ncbi.nlm.nih.gov.

9.
Small RNA database.   Total citations: 2 (self-citations: 0, citations by others: 2)
The small RNA database is a compilation of all the small size RNA sequences available to date, including nuclear, nucleolar, cytoplasmic and mitochondrial small RNAs from eukaryotic organisms and small RNAs from prokaryotic cells as well as viruses. Currently, about 600 small RNA sequences are in our database. It also gives the sources of individual RNAs and their GenBank accession numbers. The small RNA database can be accessed through the WWW (World Wide Web). Our WWW URL address is: http://mbcr.bcm.tmc.edu/smallRNA/smallrna.html. The new small RNA sequences published since our last compilation are listed in this paper.

10.
A quality control algorithm for DNA sequencing projects.   Total citations: 2 (self-citations: 0, citations by others: 2)
Heterologous DNA sequences from rearrangements with the genomes of host cells, genomic fragments from hybrid cells, or impure tissue sources can threaten the purity of libraries that are derived from RNA or DNA. Hybridization methods can only detect contaminants from known or suspected heterologous sources, and whole library screening is technically very difficult. Detection of contaminating heterologous clones by sequence alignment is only possible when related sequences are present in a known database. We have developed a statistical test to identify heterologous sequences that is based on the differences in hexamer composition of DNA from different organisms. This test does not require that sequences similar to potential heterologous contaminants are present in the database, and can in principle detect contamination by previously unknown organisms. We have applied this test to the major public expressed sequence tag (EST) data sets to evaluate its utility as a quality control measure and a peer evaluation tool. There is detectable heterogeneity in most human and C. elegans EST data sets but it is not apparently associated with cross-species contamination. However, there is direct evidence for both yeast and bacterial sequence contamination in some public database sequences annotated as human. Results obtained with the hexamer test have been confirmed with similarity searches using sequences from the relevant data sets.
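
The abstract does not give the details of the hexamer test, so the following is only a minimal sketch of the general idea of scoring a sequence by its hexamer composition. It assumes two training sets (one for the expected host, one for some other organism) and uses a smoothed log-odds score; this scoring rule, the pseudocount and all function names are assumptions made for illustration, not the authors' statistic.

```python
from collections import Counter
from itertools import product
from math import log

K = 6  # hexamer length


def hexamer_counts(seq):
    """Count overlapping hexamers that contain only A, C, G or T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - K + 1):
        kmer = seq[i:i + K]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def hexamer_profile(seqs, pseudocount=1.0):
    """Smoothed hexamer probabilities over all 4**6 possible hexamers."""
    counts = Counter()
    for s in seqs:
        counts.update(hexamer_counts(s))
    total = sum(counts.values()) + pseudocount * 4 ** K
    return {
        "".join(p): (counts["".join(p)] + pseudocount) / total
        for p in product("ACGT", repeat=K)
    }


def log_odds(seq, host_profile, other_profile):
    """Sum of log(P_host / P_other) over the hexamers of seq.

    Positive values favour the host profile, negative values the other one.
    """
    return sum(
        n * log(host_profile[k] / other_profile[k])
        for k, n in hexamer_counts(seq).items()
    )


# Toy training data (not real genomic sequence).
host = hexamer_profile(["ATATATATATATATAT" * 4])
other = hexamer_profile(["GCGCGCGCGCGCGC" * 4])
```

Unlike this two-profile sketch, the test described in the abstract is said to work without a model of the contaminant, so a real implementation would need a null distribution for the host alone.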

11.
Vertebrate MitBASE is a specialized database where all the vertebrate mitochondrial DNA entries from primary databases are collected, revised and integrated with new information emerging from the literature. Variant sequences are also analyzed, aligned and linked to reference sequences. Data related to the same species and fragment can be viewed over the WWW. The database has a flexible interface and a retrieval system to help non-expert users and contains information not currently available in the primary databases. Vertebrate MitBASE is now available through the MitBASE home page at URL: http://www.ebi.ac.uk/htbin/Mitbase/mitbase.pl. This work is part of a larger project, MitBASE, which is a network of databases covering the full panorama of knowledge on mitochondrial DNA from protists to human sequences.

12.
Methylation of cytosine in the 5 position of the pyrimidine ring is a major modification of the DNA in most organisms. In eukaryotes, the distribution and number of 5-methylcytosines (5mC) along the DNA is heritable but can also change with the developmental state of the cell and as a response to modifications of the environment. While DNA methylation probably has a number of functions, scientific interest has recently focused on the gene silencing effect methylation can have in eukaryotic cells. In particular, the discovery of changes in the methylation level during cancer development has increased the interest in this field. In the past, a vast amount of data has been generated with different levels of resolution ranging from 5mC content of total DNA to the methylation status of single nucleotides. We present here a database for DNA methylation data that attempts to unify these results in a common resource. The database is accessible via WWW (http://www.methdb.de). It stores information about the origin of the investigated sample and the experimental procedure, and contains the DNA methylation data. Query masks allow for searching for 5mC content, species, tissue, gene, sex, phenotype, sequence ID and DNA type. The output lists all available information including the relative gene expression level. DNA methylation patterns and methylation profiles are shown both as a graphical representation and as G/A/T/C/5mC-sequences or tables with sequence positions and methylation levels, respectively.

13.
GenBank
The GenBank sequence database incorporates publicly available DNA sequences of more than 105 000 different organisms, primarily through direct submission of sequence data from individual laboratories and large-scale sequencing projects. Most submissions are made using the BankIt (web) or Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Data exchange with the EMBL Data Library and the DNA Data Bank of Japan helps ensure comprehensive worldwide coverage. GenBank data is accessible through NCBI’s integrated retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical literature via PubMed. Sequence similarity searching is provided by the BLAST family of programs. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. NCBI also offers a wide range of World Wide Web retrieval and analysis services based on GenBank data. The GenBank database and related resources are freely accessible via the NCBI home page at http://www.ncbi.nlm.nih.gov.

14.
MOTIVATION: Understanding the basis of protein stability in thermophilic organisms raises a general question: what structural properties of proteins are responsible for the higher thermostability of proteins from thermophilic organisms compared to proteins from mesophilic organisms? RESULTS: A unique database of 373 structurally well-aligned protein pairs from thermophilic and mesophilic organisms is constructed. Comparison of proteins from thermophilic and mesophilic organisms has shown that the external, water-accessible residues of the first group are more closely packed than those of the second. Packing of interior parts of proteins (residues inaccessible to water molecules) is the same in both cases. The analysis of amino acid composition of external residues of proteins from thermophilic organisms revealed an increased fraction of such amino acids as Lys, Arg and Glu, and a decreased fraction of Ala, Asp, Asn, Gln, Thr, Ser and His. Our theoretical investigation of folding/unfolding behavior confirms the experimental observations that the interactions that differ in thermophilic and mesophilic proteins form only after the passing of the transition state during folding. Thus, different packing of external residues can explain differences in thermostability of proteins from thermophilic and mesophilic organisms. AVAILABILITY: The database of 373 structurally well-aligned protein pairs is available at http://phys.protres.ru/resources/termo_meso_base.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

15.
Summary: The informational content of genomes of nuclear and mitochondrial origin is examined. By using the parameters of Shannon's information theory the language of mitochondrial DNA is shown to be more similar to the language of bacterial DNA than to that of nuclear DNA in more evolutionarily advanced animals. Moreover, using the parameters of Kolmogorov's theory on randomness, genes of different organisms (Neurospora crassa and Saccharomyces cerevisiae) coding for the same protein (subunit 9 of ATPase) are shown to have, if both of mitochondrial origin, a similar degree of randomness, whereas genes coding for the same protein, both belonging to the same organisms, exhibit a quite different degree of randomness when one is of mitochondrial origin and the other of nuclear origin. These results are in favor of the symbiotic origin of mitochondria.
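
The abstract does not specify which Shannon parameters were computed. As one plausible reading, the sketch below estimates the Shannon entropy, in bits, of the k-mer distribution of a DNA sequence; the function name and the choice of overlapping k-mers are assumptions made for this example.

```python
from collections import Counter
from math import log2


def shannon_entropy(seq, k=1):
    """Shannon entropy (bits per k-mer) of the overlapping k-mers in seq."""
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        raise ValueError("sequence is shorter than k")
    counts = Counter(kmers)
    total = len(kmers)
    # H = -sum(p * log2(p)) over the observed k-mers.
    return -sum((n / total) * log2(n / total) for n in counts.values())


uniform_bits = shannon_entropy("ACGT" * 10)   # four equally frequent bases
constant_bits = shannon_entropy("AAAAAAAA")   # a single repeated base
```

For single bases the value ranges from 0 bits (one base only) to 2 bits (all four bases equally frequent), which gives a simple scale for comparing sequences of different origin.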

16.
The frequencies of each of the 257 468 complete protein coding sequences (CDSs) have been compiled from the taxonomical divisions of the GenBank DNA sequence database. The sum of the codons used by 8792 organisms has also been calculated. The data files can be obtained from the anonymous ftp sites of DDBJ, Kazusa and EBI. A list of the codon usage of genes and the sum of the codons used by each organism can be obtained through the web site http://www.kazusa.or.jp/codon/. The present study also reports recent developments on the WWW site. The new web interface provides data in the CodonFrequency-compatible format as well as in the traditional table format. The use of the database is facilitated by keyword based search analysis and the availability of codon usage tables for selected genes from each species. These new tools will provide users with the ability to further analyze for variations in codon usage among different genomes.
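
To make the notion of a codon usage table concrete, here is a small sketch that tallies codons in a single coding sequence and reports each codon's count and its frequency per thousand codons. The function name and the output layout are assumptions for illustration and do not reproduce the database's own file formats.

```python
from collections import Counter


def codon_usage(cds):
    """Return {codon: (count, frequency per thousand codons)} for one CDS.

    The sequence is read in frame from its first base; a trailing partial
    codon and codons with ambiguous bases are ignored.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if set(c) <= set("ACGT"))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {c: (n, 1000.0 * n / total) for c, n in counts.items()}


# Toy CDS: Met-Ala-Ala-stop.
table = codon_usage("ATGGCCGCCTAA")
```

Summing such per-gene tables over all CDSs of an organism gives the per-organism totals that the abstract describes.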

17.
Although the massive sequencing of mitochondrial DNA from various organisms, together with studies of a different nature, has contributed enormously to the knowledge of the organization and function of this cytoplasmic genome, many issues, mainly the relationships with the nuclear genome, remain unsolved. This review critically evaluates the most recent advances in research on the evolution of the mitochondrial DNA from a qualitative and quantitative point of view, underlining the multiplicity of structures and genetic organization of this genome, which contrasts with its reduced, but rather constant, information content in various organisms. It also highlights the role that mitochondrial DNA is now playing, particularly in metazoans, in different disciplines and application fields. Among these, particular attention is focused on the discovery of the mitochondrial origin of several diseases affecting primarily the neuromuscular system.

18.
Mamit-tRNA (http://mamit-tRNA.u-strasbg.fr), a database for mammalian mitochondrial genomes, has been developed for deciphering structural features of mammalian mitochondrial tRNAs and as a helpful tool in the frame of human diseases linked to point mutations in mitochondrial tRNA genes. To accommodate the rapidly growing availability of fully sequenced mammalian mitochondrial genomes, Mamit-tRNA has implemented a relational database, and all annotated tRNA genes have been curated and aligned manually. System administrative tools have been integrated to improve efficiency and to allow real-time update (from GenBank Database at NCBI) of available mammalian mitochondrial genomes. More than 3000 tRNA gene sequences from 150 organisms are classified into 22 families according to the amino acid specificity as defined by the anticodon triplets and organized according to phylogeny. Each sequence is displayed linearly with color codes indicating secondary structural domains and can be converted into a printable two-dimensional (2D) cloverleaf structure. Consensus and typical 2D structures can be extracted for any combination of primary sequences within a given tRNA specificity on the basis of phylogenetic relationships or on the basis of structural peculiarities. Mamit-tRNA further displays static individual 2D structures of human mitochondrial tRNA genes with location of polymorphisms and pathology-related point mutations. The site also offers a table allowing for an easy conversion of human mitochondrial genome nucleotide numbering into conventional tRNA numbering. The database is expected to facilitate exploration of structure/function relationships of mitochondrial tRNAs and to assist clinicians in the frame of pathology-related mutation assignments.

19.
A variety of forensic, population, and disease studies are based on haploid DNA (e.g. mitochondrial DNA or Y-chromosome data). For any set of genetic markers, databases of conventional size will normally contain only a fraction of all haplotypes. For several applications, reliable estimates of haplotype frequencies, the total number of haplotypes and coverage of the database (the probability that the next random haplotype is contained in the database) will be useful. We propose different approaches to the problem based on classical methods as well as new applications of Principal Component Analysis (PCA). We also discuss previous proposals based on saturation curves. Several conclusions can be inferred from simulated and real data. First, classical estimates of the fraction of unseen haplotypes can be seriously biased. Second, there is no obvious way to decide on required sample size based on traditional approaches. Methods based on testing of hypotheses or length of confidence intervals may appear artificial since no single test or parameter stands out as particularly relevant. Rather, the coverage may be more relevant since it indicates the percentage of different haplotypes that are contained in a database; if the coverage is low, there is a considerable chance that the next haplotype to be observed does not appear in the database and this indicates that the database needs to be expanded. Finally, freeware and example data sets accompany the methods discussed in this paper: http://folk.uio.no/thoree/nhap/.
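
The abstract does not name the classical estimators it evaluates. One widely known classical estimate of coverage, in the sense defined above, is the Good-Turing estimate: one minus the proportion of observations that are singletons. The sketch below shows only that estimator, as an assumption about what "classical" might include; it is not the authors' software, and the abstract itself warns that classical estimates can be seriously biased.

```python
from collections import Counter


def good_turing_coverage(haplotypes):
    """Good-Turing coverage estimate: 1 - (number of singletons / sample size).

    `haplotypes` is a list with one label per sampled individual.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("empty sample")
    counts = Counter(haplotypes)
    singletons = sum(1 for c in counts.values() if c == 1)
    return 1.0 - singletons / n


# Toy sample: H2 is the only haplotype seen exactly once.
sample = ["H1", "H1", "H2", "H3", "H3", "H3"]
coverage = good_turing_coverage(sample)
```

A sample in which every haplotype is seen only once gives an estimated coverage of zero, which matches the intuition that such a database should be expanded.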

20.
The origins of the Etruscans, a non-Indo-European population of preclassical Italy, are unclear. There is broad agreement that their culture developed locally, but the Etruscans' evolutionary and migrational relationships are largely unknown. In this study, we determined mitochondrial DNA sequences in multiple clones derived from bone samples of 80 Etruscans who lived between the 7th and the 3rd centuries B.C. In the first phase of the study, we eliminated all specimens for which any of nine tests for validation of ancient DNA data raised the suspicion that either degradation or contamination by modern DNA might have occurred. On the basis of data from the remaining 30 individuals, the Etruscans appeared as genetically variable as modern populations. No significant heterogeneity emerged among archaeological sites or time periods, suggesting that different Etruscan communities shared not only a culture but also a mitochondrial gene pool. Genetic distances and sequence comparisons show closer evolutionary relationships with the eastern Mediterranean shores for the Etruscans than for modern Italian populations. All mitochondrial lineages observed among the Etruscans appear typically European or West Asian, but only a few haplotypes were found to have an exact match in a modern mitochondrial database, raising new questions about the Etruscans' fate after their assimilation into the Roman state.
