Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
5S rRNA Data Bank.   Cited by: 6 (self-citations: 3, citations by others: 3)
In this paper we present the updated version of the compilation of 5S rRNA and 5S rDNA nucleotide sequences. It contains 1622 primary structures of 5S rRNAs and 5S rRNA genes from 888 species. These include 58 archaeal, 427 eubacterial, 34 plastid, nine mitochondrial and 1094 eukaryotic DNA or RNA nucleotide sequences. The sequence entries are divided according to the taxonomic position of the organisms. All individual sequences deposited in the 5S rRNA Database can be retrieved using the WWW-based, taxonomic browser at http://rose.man.poznan.pl/5SData/5SRNA.html or http://www.chemie.fu-berlin.de/fb_chemie/agerdmann/5S_rRNA.html. The files with complete sets of data as well as sequence alignments are available via anonymous ftp.

2.
We present an interactive web application for visualizing genomic data of prokaryotic chromosomes. The tool (GeneWiz browser) allows users to carry out various analyses such as mapping alignments of homologous genes to other genomes, mapping of short sequencing reads to a reference chromosome, and calculating DNA properties such as curvature or stacking energy along the chromosome. The GeneWiz browser produces an interactive graphic that enables zooming from a global scale down to single nucleotides, without changing the size of the plot. Its ability to disproportionally zoom provides optimal readability and increased functionality compared to other browsers. The tool allows the user to select the display of various genomic features, color setting and data ranges. Custom numerical data can be added to the plot allowing, for example, visualization of gene expression and regulation data. Further, standard atlases are pre-generated for all prokaryotic genomes available in GenBank, providing a fast overview of all available genomes, including recently deposited genome sequences. The tool is available online from http://www.cbs.dtu.dk/services/gwBrowser. Supplemental material including interactive atlases is available online at http://www.cbs.dtu.dk/services/gwBrowser/suppl/.

3.
4.
5.
Identifying clusters of functionally related genes in genomes   Cited by: 4 (self-citations: 0, citations by others: 4)
MOTIVATION: An increasing body of literature shows that genomes of eukaryotes can contain clusters of functionally related genes. Most approaches to identify gene clusters utilize microarray data or metabolic pathway databases to find groups of genes on chromosomes that are linked by common attributes. A generalized method that can find gene clusters regardless of the mechanism of origin would provide researchers with an unbiased method for finding clusters and studying the evolutionary forces that give rise to them. RESULTS: We present an algorithm to identify gene clusters in eukaryotic genomes that utilizes functional categories defined in graph-based vocabularies such as the Gene Ontology (GO). Clusters identified in this manner need only have a common function and are not constrained by gene expression or other properties. We tested the algorithm by analyzing genomes of a representative set of species. We identified species-specific variation in percentage of clustered genes as well as in properties of gene clusters including size distribution and functional annotation. These properties may be diagnostic of the evolutionary forces that lead to the formation of gene clusters. AVAILABILITY: A software implementation of the algorithm and example output files are available at http://fcg.tamu.edu/C_Hunter/.

6.
Gene duplication and divergence is a major evolutionary force. Despite the growing number of fully sequenced genomes, methods for investigating these events on a genome-wide scale are still in their infancy. Here, we present SYNERGY, a novel and scalable algorithm that uses sequence similarity and a given species phylogeny to reconstruct the underlying evolutionary history of all genes in a large group of species. In doing so, SYNERGY resolves homology relations and accurately distinguishes orthologs from paralogs. We applied our approach to a set of nine fully sequenced fungal genomes spanning 150 million years, generating a genome-wide catalog of orthologous groups and corresponding gene trees. Our results are highly accurate when compared to a manually curated gold standard, and are robust to the quality of input according to a novel jackknife confidence scoring. The reconstructed gene trees provide a comprehensive view of gene evolution on a genomic scale. Our approach can be applied to any set of sequenced eukaryotic species with a known phylogeny, and opens the way to systematic studies of the evolution of individual genes, molecular systems and whole genomes. Supplementary information: Supplementary data are available at Bioinformatics online.

7.
8.
CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes   Cited by: 1 (self-citations: 0, citations by others: 1)
MOTIVATION: The numbers of finished and ongoing genome projects are increasing at a rapid rate, and providing the catalog of genes for these new genomes is a key challenge. Obtaining a set of well-characterized genes is a basic requirement in the initial steps of any genome annotation process. An accurate set of genes is needed in order to learn about species-specific properties, to train gene-finding programs, and to validate automatic predictions. Unfortunately, many new genome projects lack comprehensive experimental data to derive a reliable initial set of genes. RESULTS: In this study, we report a computational method, CEGMA (Core Eukaryotic Genes Mapping Approach), for building a highly reliable set of gene annotations in the absence of experimental data. We define a set of conserved protein families that occur in a wide range of eukaryotes, and present a mapping procedure that accurately identifies their exon-intron structures in a novel genomic sequence. CEGMA includes the use of profile-hidden Markov models to ensure the reliability of the gene structures. Our procedure allows one to build an initial set of reliable gene annotations in potentially any eukaryotic genome, even those in draft stages. AVAILABILITY: Software and data sets are available online at http://korflab.ucdavis.edu/Datasets.

9.
The study of conserved gene clusters is important for understanding the forces behind genome organization and evolution, as well as the function of individual genes or gene groups. In this paper, we present a new model and algorithm for identifying conserved gene clusters from pairwise genome comparison. This generalizes a recent model called "gene teams." A gene team is a set of genes that appear homologously in two or more species, possibly in a different order yet with the distance of adjacent genes in the team for each chromosome always no more than a certain threshold. We remove the constraint in the original model that each gene must have a unique occurrence in each chromosome and thus allow the analysis on complex prokaryotic or eukaryotic genomes with extensive paralogs. Our algorithm analyzes a pair of chromosomes in O(mn) time and uses O(m+n) space, where m and n are the number of genes in the respective chromosomes. We demonstrate the utility of our methods by studying two bacterial genomes, E. coli K-12 and B. subtilis. Many of the teams identified by our algorithm correlate with documented E. coli operons, while several others match predicted operons, previously suggested by computational techniques. Our implementation and data are publicly available at euler.slu.edu/~goldwasser/homologyteams/.
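The gene team definition in the abstract above can be illustrated with a small sketch. It covers only the original, restricted model (each gene occurs exactly once per chromosome), not the generalized paralog-aware algorithm the abstract describes, and it makes no attempt at the stated O(mn) bound. The function names, the position dictionaries, and the threshold `delta` are hypothetical, chosen for illustration.

```python
def split_by_gaps(genes, pos, delta):
    """Split genes into runs whose consecutive positions differ by at most delta."""
    ordered = sorted(genes, key=lambda g: pos[g])
    runs, current = [], [ordered[0]]
    for prev, gene in zip(ordered, ordered[1:]):
        if pos[gene] - pos[prev] > delta:
            runs.append(current)
            current = []
        current.append(gene)
    runs.append(current)
    return runs


def gene_teams(pos_a, pos_b, delta):
    """Return maximal gene teams for two chromosomes (unique-occurrence model).

    pos_a, pos_b: hypothetical dicts mapping gene name -> position.
    A group is a team when no gap larger than delta separates adjacent
    members on either chromosome; otherwise it is split and re-examined.
    """
    common = set(pos_a) & set(pos_b)
    if not common:
        return []
    teams, pending = [], [common]
    while pending:
        group = pending.pop()
        runs = split_by_gaps(group, pos_a, delta)
        if len(runs) == 1:
            runs = split_by_gaps(group, pos_b, delta)
        if len(runs) == 1:
            teams.append(sorted(group))  # unbroken on both chromosomes
        else:
            pending.extend(set(run) for run in runs)
    return sorted(teams)


# Toy positions, invented for the example.
pos_a = {"a": 1, "b": 2, "c": 3, "d": 10, "e": 11}
pos_b = {"a": 5, "b": 1, "c": 6, "d": 2, "e": 20}
teams = gene_teams(pos_a, pos_b, delta=2)
```

In this toy input only genes a and c stay within the threshold on both chromosomes, so they form the single multi-gene team; the remaining genes end up as singletons, which a real analysis would likely filter out.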

10.
Advances in the Exon-Intron Database (EID)   Cited by: 3 (self-citations: 0, citations by others: 3)

11.
MOTIVATION: The total order of the genes or markers on a chromosome inherent in its representation as a signed permutation must often be weakened to a partial order in the case of real data. This is due to lack of resolution (where several genes are mapped to the same chromosomal position), to missing data from some of the datasets used to compile a gene order, and to conflicts between these datasets. The available genome rearrangement algorithms, however, require total orders as input. A more general approach is needed to handle rearrangements of gene partial orders. RESULTS: We formalize the uncertainty in gene order data by representing a chromosome from each genome as a partial order, summarized by a directed acyclic graph (DAG). The rearrangement problem is then to infer a minimal sequence of reversals for transforming any topological sort of one DAG to any one of the other DAG. Each topological sort represents a possible linearization compatible with all the datasets on the chromosome. The set of all possible topological sorts is embedded in each DAG by appropriately augmenting the edge set, so that it becomes a general directed graph (DG). The DGs representing chromosomes of two genomes are combined to produce a bicoloured graph from which we extract a maximal decomposition into alternating coloured cycles, and from which, in turn, an optimal sequence of reversals can usually be identified. We test this approach on simulated incomplete comparative maps and on cereal chromosomal maps drawn from the Gramene browser.
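To make the notion of "topological sort as a possible linearization" in the abstract above concrete, the following minimal sketch enumerates every gene order compatible with a partial order given as a DAG. It ignores gene orientation (signs) and does not implement the reversal-distance method itself; the function name and the toy gene labels are hypothetical.

```python
def topological_sorts(nodes, edges):
    """Yield every linear order of nodes consistent with the (before, after) edges."""
    successors = {n: [] for n in nodes}
    indegree = {n: 0 for n in nodes}
    for before, after in edges:
        successors[before].append(after)
        indegree[after] += 1

    order = []

    def extend():
        if len(order) == len(nodes):
            yield tuple(order)
            return
        for n in nodes:
            # A gene can be placed next once all its predecessors are placed.
            if indegree[n] == 0 and n not in order:
                order.append(n)
                for s in successors[n]:
                    indegree[s] -= 1
                yield from extend()
                for s in successors[n]:  # undo the choice (backtrack)
                    indegree[s] += 1
                order.pop()

    yield from extend()


# Toy partial order: g2 and g3 are unresolved relative to each other.
genes = ["g1", "g2", "g3", "g4"]
before_after = [("g1", "g2"), ("g1", "g3"), ("g2", "g4"), ("g3", "g4")]
linearizations = list(topological_sorts(genes, before_after))
```

Here the two resulting orders differ only in the relative placement of g2 and g3, which is exactly the ambiguity a map with limited resolution would leave open. Enumeration grows factorially with the number of unresolved genes, which is presumably why the abstract embeds all sorts in an augmented graph rather than listing them.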

12.
13.
Childs LH, Lisec J, Walther D. Plant Physiology. 2012;158(4):1534-1541
High-throughput sequencing and genotyping methods are dramatically increasing the number of observable genetic intraspecies differences that can be exploited as genetic markers. In addition, automated phenotyping platforms and "omics" profiling technologies further enlarge the set of quantifiable macroscopic and molecular traits at an ever-increasing pace. Combined, both lines of technological advances create unparalleled opportunities to identify candidate gene regions and, ideally, even single genes responsible for observed variations in a particular trait via association studies. However, as of yet, this new potential is not sufficiently matched by enabling software solutions to easily exploit this wealth of genotype/phenotype information. We have developed Matapax, a Web-based platform to address this need. Initially, we built the infrastructure to support association studies in Arabidopsis (Arabidopsis thaliana) based on several genotyping efforts covering up to 1,375 Arabidopsis accessions. Based on the user-supplied trait information, associated single-nucleotide polymorphism markers and single-nucleotide polymorphism-harboring or -neighboring genes are identified using both the GAPIT and EMMA libraries developed for R. Additional interrogation is facilitated by displaying candidate regions and genes in a genome browser and by providing relevant annotation information. In the future, we plan to broaden the scope of organisms to other plant species as more genotype/phenotype information becomes available. Matapax is freely available at http://matapax.mpimp-golm.mpg.de and can be accessed using any internet browser.

14.
Recent genomic data analyses have revealed important underlying logics in eukaryotic gene regulation, such as CpG islands (CGIs)-dependent dual-mode gene regulation. In mammals, genes lacking CGIs at their promoters are generally regulated by interconversion between euchromatin and heterochromatin, while genes associated with CGIs constitutively remain as euchromatin. Whether a similar mode of gene regulation exists in non-mammalian species has been unknown. Here, through comparative epigenomic analyses, we demonstrate that the dual-mode gene regulation program is common in various eukaryotes, even in the species lacking CGIs. In cases of vertebrates or plants, we find that genes associated with high methylation level promoters are inactivated by forming heterochromatin and expressed in a context-dependent manner. In contrast, the genes with low methylation level promoters are broadly expressed and remain as euchromatin even when repressed by Polycomb proteins. Furthermore, we show that invertebrate animals lacking DNA methylation, such as fruit flies and nematodes, also have divergence in gene types: some genes are regulated by Polycomb proteins, while others are regulated by heterochromatin formation. Altogether, our study establishes gene type divergence and the resulting dual-mode gene regulation as fundamental features shared in a broad range of higher eukaryotic species.

15.
Summary: Major insights into the phylogenetic distribution, biochemistry, and evolutionary significance of organelles involved in ATP synthesis (energy metabolism) in eukaryotes that thrive in anaerobic environments for all or part of their life cycles have accrued in recent years. All known eukaryotic groups possess an organelle of mitochondrial origin, mapping the origin of mitochondria to the eukaryotic common ancestor, and genome sequence data are rapidly accumulating for eukaryotes that possess anaerobic mitochondria, hydrogenosomes, or mitosomes. Here we review the available biochemical data on the enzymes and pathways that eukaryotes use in anaerobic energy metabolism and summarize the metabolic end products that they generate in their anaerobic habitats, focusing on the biochemical roles that their mitochondria play in anaerobic ATP synthesis. We present metabolic maps of compartmentalized energy metabolism for 16 well-studied species. There are currently no enzymes of core anaerobic energy metabolism that are specific to any of the six eukaryotic supergroup lineages; genes present in one supergroup are also found in at least one other supergroup. The gene distribution across lineages thus reflects the presence of anaerobic energy metabolism in the eukaryote common ancestor and differential loss during the specialization of some lineages to oxic niches, just as oxphos capabilities have been differentially lost in specialization to anoxic niches and the parasitic life-style. Some facultative anaerobes have retained both aerobic and anaerobic pathways. Diversified eukaryotic lineages have retained the same enzymes of anaerobic ATP synthesis, in line with geochemical data indicating low environmental oxygen levels while eukaryotes arose and diversified.

16.
The Distributed Annotation System (DAS) is a protocol for easy sharing and integration of biological annotations. In order to visualize feature annotations in a genomic context a client is required. Here we present myKaryoView, a simple light-weight DAS tool for visualization of genomic annotation. myKaryoView has been specifically configured to help analyse data derived from personal genomics, although it can also be used as a generic genome browser visualization. Several well-known data sources are provided to facilitate comparison of known genes and normal variation regions. The navigation experience is enhanced by simultaneous rendering of different levels of detail across chromosomes. A simple interface is provided to allow searches for any SNP, gene or chromosomal region. User-defined DAS data sources may also be added when querying the system. We demonstrate myKaryoView capabilities for adding user-defined sources with a set of genetic profiles of family-related individuals downloaded directly from 23andMe. myKaryoView is a web tool for visualization of genomic data specifically designed for direct-to-consumer genomic data that uses publicly available data distributed throughout the Internet. It does not require data to be held locally and it is capable of rendering any feature as long as it conforms to DAS specifications. Configuration and addition of sources to myKaryoView can be done through the interface. Here we show a proof of principle of myKaryoView's ability to display personal genomics data with 23andMe genome data sources. The tool is available at: http://mykaryoview.com.

17.
Yuge K, Ikeo K, Gojobori T. Gene. 2007;406(1-2):108-112
With the aim of elucidating the evolutionary process of sexual dimorphism in the brain at the molecular level, we conducted genomic comparisons of a set of genes expressed in a sexually different manner in the mouse brain with all genes from other species of eukaryotes. First, seventeen protein-coding genes whose levels of mRNA expression in the brain differed between male and female mice have been known according to the currently available microarray data, and we designated these genes operationally as "sex-related genes in the mouse brain". Next, we estimated the time when these sex-related genes in the mouse brain emerged in the evolutionary process of eukaryotes by examining the presence or absence of the orthologues in the 26 eukaryotic species whose genome sequences are available. As a result, we found that the ten sex-related genes in the mouse brain emerged after the divergence of urochordates and mammals whereas the other seven sex-related genes in the mouse brain emerged before the divergence of urochordates and mammals. In particular, five sex-related genes out of the ten genes in the mouse brain emerged just before the appearance of bony fish which have phenotypic sexual dimorphism in the brain. Interestingly, three of these five sex-related genes that emerged during this period were classified into the "protein binding" function category. Moreover, all of these three genes were expected to have the functions that are related to cell-cell communications in the brain according to the gene expression patterns and/or functional information of these genes. These findings suggest that the orthologues of the sex-related genes in the mouse brain that emerged just before the divergence of bony fish might have essential roles in the evolution of the sexual dimorphism in the brain forming protein-protein interactions.

18.
19.
20.
Phylogenomics of eukaryotes: impact of missing data on large alignments   Cited by: 17 (self-citations: 0, citations by others: 17)
Resolving the relationships between Metazoa and other eukaryotic groups as well as between metazoan phyla is central to the understanding of the origin and evolution of animals. The current view is based on limited data sets, either a single gene with many species (e.g., ribosomal RNA) or many genes but with only a few species. Because a reliable phylogenetic inference simultaneously requires numerous genes and numerous species, we assembled a very large data set containing 129 orthologous proteins (~30,000 aligned amino acid positions) for 36 eukaryotic species. Included in the alignments are data from the choanoflagellate Monosiga ovata, obtained through the sequencing of about 1,000 cDNAs. We provide conclusive support for choanoflagellates as the closest relative of animals and for fungi as the second closest. The monophyly of Plantae and chromalveolates was recovered but without strong statistical support. Within animals, in contrast to the monophyly of Coelomata observed in several recent large-scale analyses, we recovered a paraphyletic Coelomata, with nematodes and platyhelminths nested within. To include a diverse sample of organisms, data from EST projects were used for several species, resulting in a large amount of missing data in our alignment (about 25%). By using different approaches, we verify that the inferred phylogeny is not sensitive to these missing data. Therefore, this large data set provides a reliable phylogenetic framework for studying eukaryotic and animal evolution and will be easily extendable when large amounts of sequence information become available from a broader taxonomic range.
