Similar literature
Found 20 similar articles (search time: 12 ms)
1.
Gramene, a tool for grass genomics   Total citations: 11; self-citations: 0; citations by others: 11
Gramene (http://www.gramene.org) is a comparative genome mapping database for grasses and a community resource for rice (Oryza sativa). It combines a semi-automatically generated database of cereal genomic and expressed sequence tag sequences, genetic maps, map relations, and publications, with a curated database of rice mutants (genes and alleles), molecular markers, and proteins. Gramene curators read and extract detailed information from published sources, summarize that information in a structured format, and establish links to related objects both inside and outside the database, providing seamless connections between independent sources of information. Genetic, physical, and sequence-based maps of rice serve as the fundamental organizing units and provide a common denominator for moving across species and genera within the grass family. Comparative maps of rice, maize (Zea mays), sorghum (Sorghum bicolor), barley (Hordeum vulgare), wheat (Triticum aestivum), and oat (Avena sativa) are anchored by a set of curated correspondences. In addition to sequence-based mappings found in comparative maps and rice genome displays, Gramene makes extensive use of controlled vocabularies to describe specific biological attributes in ways that permit users to query those domains and make comparisons across taxonomic groups. Proteins are annotated for functional significance using gene ontology terms that have been adopted by numerous model species databases. Genetic variants including phenotypes are annotated using plant ontology terms common to all plants and trait ontology terms that are specific to rice. In this paper, we present a brief overview of the search tools available to the plant research community in Gramene.

2.
Giving access to sequence and annotation data for genome assemblies is important because, while facilitating research, it places both assembly and annotation quality under scrutiny, resulting in improvements to both. Therefore we announce Avianbase, a resource for bird genomics, which provides access to data released by the Avian Phylogenomics Consortium. Access to complete genome sequences provides the first step towards the understanding of the biology of organisms. It is the template that underpins the phenotypic characteristics of individuals and ultimately separates species due to the accumulation and fixation of mutations over evolutionary timescales. In terms of the available genomic datasets for species, birds, as our more distant relatives, have been historically underrepresented. The high cost of sequencing and annotation in the past led to a bias towards accumulating data for species that are either established model organisms or economically significant (that is, chicken, turkey and duck, representing two sister orders within the Galloanseriformes clade from the large and diverse phylogeny of birds). The recent release of genome assemblies and initial predictions of protein-coding genes [1-4] for 44 bird species, including representatives from all major branches of the bird phylogeny, is, therefore, highly significant. One of the major challenges with the release of this number of newly sequenced genomes and the many more to come [5] is how to make these available to the various research communities in a way that supports basic research. Providing access to the sequences and initial annotations in the format of text files will limit the potential usage of the data as they require significant resources, including bioinformatics personnel and computer infrastructure in place to access and mine - for example, searching for genes belonging to certain protein families or searching for orthologous genes.
These overheads pose a serious bottleneck that can hinder research and require concerted action by the relevant research communities. Once genomes are submitted to public databases, genome-wide annotations are frequently generated and released either via the Ensembl project [6] or by the National Center for Biotechnology Information [7] and sequence and annotation are then made visually available online in integrated views via the Ensembl or the University of California Santa Cruz (UCSC) genome browsers [8]. These systems provide search facilities, sequence alignment tools like BLAT/BLAST and various analysis tools to facilitate subsetting and computational retrieval of the data, including UCSC’s Table Browser or Ensembl’s Perl and REST APIs and BioMart system. While these systems have become almost indispensable for research, not all sequenced genomes are annotated and displayed in genome browsers. Full genome annotation remains time consuming and resource intensive: a full evidence-based Ensembl genebuild takes approximately 4 months. Thus, the list of species represented is currently limited and depends on various factors, including the completeness of the assembled genome sequence and the overall demand in the scientific community for the resources, including whether the species is a model organism (for example, human or mouse), economically important (for example, farmed animals) or of specific phylogenetic interest. Many of the recently sequenced bird genomes do not obviously fall within these categories.

3.
WormBase (http://www.wormbase.org/) is a web-accessible central data repository for information about Caenorhabditis elegans and related nematodes. The past two years have seen a significant expansion in the biological scope of WormBase, including the integration of large-scale, genome-wide data sets, the inclusion of genome sequence and gene predictions from related species and active literature curation. This expansion of data has also driven the development and refinement of user interfaces and operability, including a new Genome Browser, new searches and facilities for data access and the inclusion of extensive documentation. These advances have expanded WormBase beyond the obvious target audience of C. elegans researchers, to include researchers wishing to explore problems in functional and comparative genomics within the context of a powerful genetic system.

4.

Background  

Statistical bioinformatics is the study of biological data sets obtained by new micro-technologies by means of proper statistical methods. For a better understanding of environmental adaptations of proteins, orthologous sequences from different habitats may be explored and compared. The main goal of the DeltaProt Toolbox is to provide users with important functionality that is needed for comparative screening and studies of extremophile proteins and protein classes. Visualization of the data sets is also the focus of this article, since visualizations can play a key role in making the various relationships transparent. This application paper is intended to inform the reader of the existence, functionality, and applicability of the toolbox.

5.
Sputnik: a database platform for comparative plant genomics   Total citations: 10; self-citations: 0; citations by others: 10

6.
A high-resolution genetic, physical, and cytological map of the sorghum genome is being assembled using AFLP DNA marker technology, six-dimensional pooling of BAC libraries, cDNA mapping technology, and cytogenetic analysis. Recent advances in sorghum comparative genomics and gene-transfer technology are accelerating the discovery and utilization of valuable sorghum genes and alleles.

7.

Background

The lycophytes are an ancient lineage of vascular plants that diverged from the seed plant lineage about 400 Myr ago. Although the lycophytes occupy an important phylogenetic position for understanding the evolution of plants and their genomes, no genomic resources exist for this group of plants.

Results

Here we describe the construction of a large-insert bacterial artificial chromosome (BAC) library from the lycophyte Selaginella moellendorffii. Based on cell flow cytometry, this species has the smallest genome size among the different lycophytes tested, including Huperzia lucidula, Diphasiastrum digitatum, Isoetes engelmannii and S. kraussiana. The arrayed BAC library consists of 9126 clones; the average insert size is estimated to be 122 kb. Inserts of chloroplast origin account for 2.3% of the clones. The BAC library contains an estimated ten genome-equivalents based on DNA hybridizations using five single-copy and two duplicated S. moellendorffii genes as probes.

Conclusion

The S. moellendorffii BAC library, the first to be constructed from a lycophyte, will be useful to the scientific community as a resource for comparative plant genomics and evolution.

8.
Liriodendron tulipifera L., a member of Magnoliaceae in the order Magnoliales, has been used extensively as a reference species in studies on plant evolution. However, genomic resources for this tree species are limited. We constructed cDNA libraries from ten different types of tissues: premeiotic flower buds, postmeiotic flower buds, open flowers, developing fruit, terminal buds, leaves, cambium, xylem, roots, and seedlings. EST sequences were generated either by 454 GS FLX or Sanger methods. Assembly of almost 2.4 million sequencing reads from all libraries resulted in 137,923 unigenes (132,905 contigs and 4,599 singletons). About 50% of the unigenes had significant matches to publicly available plant protein sequences, representing a wide variety of putative functions. Approximately 30,000 simple sequence repeats were identified. More than 97% of the cell wall formation genes in the Cell Wall Navigator and the MAIZEWALL databases are represented. The cinnamyl alcohol dehydrogenase (CAD) homologs identified in the L. tulipifera EST dataset showed different expression levels in the ten tissue types included in this study. In particular, LtuCAD1 was found to partially recover the stiffness of the floral stems in Arabidopsis thaliana CAD4 and CAD5 double mutant plants, suggesting a role for LtuCAD1 in lignin biosynthesis. L. tulipifera genes have greater sequence similarity to homologs from other woody angiosperm species than to non-woody model plants. This large-scale genomic resource will be instrumental for gene discovery, cDNA microarray production, and marker-assisted breeding in L. tulipifera, and strengthen this species' role in comparative studies.

9.
Restauro-G: A Rapid Genome Re-Annotation System for Comparative Genomics   Total citations: 1; self-citations: 0; citations by others: 1
Annotations of complete genome sequences submitted directly from sequencing projects are diverse in terms of annotation strategies and update frequencies. These inconsistencies make comparative studies difficult. To allow rapid data preparation of a large number of complete genomes, automation and speed are important for genome re-annotation. Here we introduce an open-source rapid genome re-annotation software system, Restauro-G, specialized for bacterial genomes. Restauro-G re-annotates a genome by similarity searches utilizing the BLAST-Like Alignment Tool (BLAT), referring to protein databases such as UniProt KB, NCBI nr, NCBI COGs, Pfam, and PSORTb. Re-annotation by Restauro-G achieved over 98% accuracy for most bacterial chromosomes in comparison with the original manually curated annotation of EMBL releases. Restauro-G was developed in the generic bioinformatics workbench G-language Genome Analysis Environment and is distributed at http://restauro-g.iab.keio.ac.jp/ under the GNU General Public License.

10.
The human genome sequence provides a reference point from which we can compare ourselves with other organisms. Interspecies comparison is a powerful tool for inferring function from genomic sequence and could ultimately lead to the discovery of what makes humans unique. To date, most comparative sequencing has focused on pair-wise comparisons between human and a limited number of other vertebrates, such as mouse. Targeted approaches now exist for mapping and sequencing vertebrate bacterial artificial chromosomes (BACs) from numerous species, allowing rapid and detailed molecular and phylogenetic investigation of multi-megabase loci. Such targeted sequencing is complementary to current whole-genome sequencing projects, and would benefit greatly from the creation of BAC libraries from a diverse range of vertebrates.

11.

Background  

We describe a novel application of microarray technology for comparative genomics of bacteria in which libraries of entire genomes rather than the sequence of a single genome or sets of genes are arrayed on the slide and then probed for the presence or absence of specific genes and/or gene alleles.

12.
Many decisions about genome sequencing projects are directed by perceived gaps in the tree of life, or towards model organisms. With the goal of a better understanding of biology through the lens of evolution, however, there are additional genomes that are worth sequencing. One such rationale for whole-genome sequencing is discussed here, along with other important strategies for understanding the phenotypic divergence of species.

13.
Many decisions about genome sequencing projects are directed by perceived gaps in the tree of life, or towards model organisms. With the goal of a better understanding of biology through the lens of evolution, however, there are additional genomes that are worth sequencing. One such rationale for whole-genome sequencing is discussed here, along with other important strategies for understanding the phenotypic divergence of species.

14.
In order to understand and interpret phylogenetic and functional relationships between multiple prokaryotic species, qualitative and quantitative data must be correlated and displayed. GECO allows linear visualization of multiple genomes using a client/server based approach by dynamically creating .png- or .pdf-formatted images. It is able to display ortholog relations calculated using BLASTCLUST by color coding ortholog representations. Irregularities on the genomic level can be identified by anomalous G/C composition. Thus, this software will enable researchers to detect horizontally transferred genes, pseudogenes and insertions/deletions in related microbial genomes. AVAILABILITY: http://bioinfo.mikrobio.med.uni-giessen.de/geco2/GecoMainServlet

15.
16.
17.
We developed a novel method for identifying SNPs widely distributed throughout the coding and non-coding regions of a genome. The method uses large-scale parallel pyrosequencing technology in combination with bioinformatics tools. We used this method to generate approximately 23,000 candidate SNPs throughout the Macaca mulatta genome. We estimate that over 60% of the SNPs will be of high frequency and useful for mapping QTLs, genetic management, and studies of individual relatedness, whereas other less frequent SNPs may be useful as population-specific markers for ancestry identification. We have created a web resource called MamuSNP to view the SNPs and associated information online. This resource will also be useful for researchers using a wide variety of Macaca species in their research.

18.
The MIPS Rice (Oryza sativa) database (MOsDB; http://mips.gsf.de/proj/rice) provides a comprehensive data collection dedicated to the genome information of rice. Rice (O. sativa L.) is one of the most important food crops for over half the world's population and serves as a major model system in cereal genome research. MOsDB integrates data from two publicly available rice genomic sequences, O. sativa L. ssp. indica and O. sativa L. ssp. japonica. Besides regularly updated rice genome sequence information, MOsDB provides an integrated resource for associated analysis data, e.g. internal and external annotation information as well as a complex characterization of all annotated rice genes. The MOsDB web interface supports various search options and allows browsing the database content. MOsDB is continuously expanding to include an increasing range of data types and the growing amount of information on the rice genome.

19.
A report on 'Genomes 2004: International Conference on the Analysis of Microbial and Other Genomes', Hinxton, UK, 14-17 April 2004.

20.

Background

Comparative evolutionary analysis of whole genomes requires not only accurate annotation of gene space, but also proper annotation of the repetitive fraction, which is often the largest component of most if not all genomes larger than 50 kb in size.

Results

Here we present the Rice TE database (RiTE-db) - a genus-wide collection of transposable elements and repeated sequences across 11 diploid species of the genus Oryza and the closely related out-group Leersia perrieri. The database consists of more than 170,000 entries divided into three main types: (i) a classified and curated set of publicly available repeated sequences; (ii) a set of consensus assemblies of highly repetitive sequences obtained from genome sequencing surveys of 12 species; and (iii) a set of full-length TEs, identified and extracted from 12 whole genome assemblies.

Conclusions

This is the first report of a repeat dataset that spans the majority of repeat variability within an entire genus, and one that includes complete elements as well as unassembled repeats. The database allows sequence browsing, downloading, and similarity searches. Because of the strategy adopted, the RiTE-db opens a new path to unprecedented direct comparative studies that span the entire nuclear repeat content of 15 million years of Oryza diversity.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1762-3) contains supplementary material, which is available to authorized users.
