1.

Analysis of 14 BAC sequences from the Aedes aegypti genome: a benchmark for genome annotation and assembly

Lobo NF Campbell KS Thaner D Debruyn B Koo H Gelbart WM Loftus BJ Severson DW Collins FH 《Genome biology》2007,8(5):R88-12

2.

Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome

Bergman CM Pfeiffer BD Rincón-Limas DE Hoskins RA Gnirke A Mungall CJ Wang AM Kronmiller B Pacleb J Park S Stapleton M Wan K George RA de Jong PJ Botas J Rubin GM Celniker SE 《Genome biology》2002,3(12):research0086.1-862

Background

It is widely accepted that comparative sequence data can aid the functional annotation of genome sequences; however, the most informative species and features of genome evolution for comparison remain to be determined.

Results

We analyzed conservation in eight genomic regions (apterous, even-skipped, fushi tarazu, twist, and Rhodopsins 1, 2, 3 and 4) from four Drosophila species (D. erecta, D. pseudoobscura, D. willistoni, and D. littoralis) covering more than 500 kb of the D. melanogaster genome. All D. melanogaster genes (and 78-82% of coding exons) identified in divergent species such as D. pseudoobscura show evidence of functional constraint. Addition of a third species can reveal functional constraint in otherwise non-significant pairwise exon comparisons. Microsynteny is largely conserved, with rearrangement breakpoints, novel transposable element insertions, and gene transpositions occurring in similar numbers. Rates of amino-acid substitution are higher in uncharacterized genes relative to genes that have previously been studied. Conserved non-coding sequences (CNCSs) tend to be spatially clustered with conserved spacing between CNCSs, and clusters of CNCSs can be used to predict enhancer sequences.
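The observation that CNCSs are spatially clustered can be illustrated with a small sketch. The code below is not the authors' method; it is a minimal example, assuming CNCSs are given as (start, end) coordinates on one sequence and that a hypothetical gap threshold (`max_gap`) defines cluster membership. The coordinates and the threshold are invented for illustration.

```python
def cluster_intervals(intervals, max_gap):
    """Group (start, end) intervals into clusters.

    An interval joins the current cluster when the gap between its start
    and the furthest end seen so far in that cluster is at most max_gap.
    """
    clusters = []
    cluster_end = None
    for start, end in sorted(intervals):
        if clusters and start - cluster_end <= max_gap:
            clusters[-1].append((start, end))
            cluster_end = max(cluster_end, end)
        else:
            clusters.append([(start, end)])
            cluster_end = end
    return clusters


# Hypothetical CNCS coordinates (bp) and threshold, for illustration only.
cncs = [(100, 150), (180, 220), (260, 300), (2000, 2050), (2080, 2120)]
clusters = cluster_intervals(cncs, max_gap=100)
for members in clusters:
    print(members[0][0], members[-1][1], len(members))
```

Each resulting cluster could then be treated as a candidate region for follow-up, with the threshold tuned to the spacing seen in real data.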

Conclusions

Our results provide the basis for choosing species whose genome sequences would be most useful in aiding the functional annotation of coding and cis-regulatory sequences in Drosophila. Furthermore, this work shows how decoding the spatial organization of conserved sequences, such as the clustering of CNCSs, can complement efforts to annotate eukaryotic genomes on the basis of sequence conservation alone.

3.

Systematic determination of patterns of gene expression during Drosophila embryogenesis

Tomancak P Beaton A Weiszmann R Kwan E Shu S Lewis SE Richards S Ashburner M Hartenstein V Celniker SE Rubin GM 《Genome biology》2002,3(12):research0088.1-8814

4.

ITEP: An integrated toolkit for exploration of microbial pan-genomes

Matthew N Benedict James R Henriksen William W Metcalf Rachel J Whitaker Nathan D Price 《BMC genomics》2014,15(1):1-11

5.

An integrated gene annotation and transcriptional profiling approach towards the full gene content of the Drosophila genome

Hild M Beckmann B Haas SA Koch B Solovyev V Busold C Fellenberg K Boutros M Vingron M Sauer F Hoheisel JD Paro R 《Genome biology》2003,5(1):R3-17

6.

Heterochromatic sequences in a Drosophila whole-genome shotgun assembly

Hoskins RA Smith CD Carlson JW Carvalho AB Halpern A Kaminker JS Kennedy C Mungall CJ Sullivan BA Sutton GG Yasuhara JC Wakimoto BT Myers EW Celniker SE Rubin GM Karpen GH 《Genome biology》2002,3(12):research0085.1-8516

Background

Most eukaryotic genomes include a substantial repeat-rich fraction termed heterochromatin, which is concentrated in centric and telomeric regions. The repetitive nature of heterochromatic sequence makes it difficult to assemble and analyze. To better understand the heterochromatic component of the Drosophila melanogaster genome, we characterized and annotated portions of a whole-genome shotgun sequence assembly.

Results

WGS3, an improved whole-genome shotgun assembly, includes 20.7 Mb of draft-quality sequence not represented in the Release 3 sequence spanning the euchromatin. We annotated this sequence using the methods employed in the re-annotation of the Release 3 euchromatic sequence. This analysis predicted 297 protein-coding genes and six non-protein-coding genes, including known heterochromatic genes, and regions of similarity to known transposable elements. Bacterial artificial chromosome (BAC)-based fluorescence in situ hybridization analysis was used to correlate the genomic sequence with the cytogenetic map in order to refine the genomic definition of the centric heterochromatin; on the basis of our cytological definition, the annotated Release 3 euchromatic sequence extends into the centric heterochromatin on each chromosome arm.

Conclusions

Whole-genome shotgun assembly produced a reliable draft-quality sequence of a significant part of the Drosophila heterochromatin. Annotation of this sequence defined the intron-exon structures of 30 known protein-coding genes and 267 protein-coding gene models. The cytogenetic mapping suggests that an additional 150 predicted genes are located in heterochromatin at the base of the Release 3 euchromatic sequence. Our analysis suggests strategies for improving the sequence and annotation of the heterochromatic portions of the Drosophila and other complex genomes.

7.

Two novel missense mutations in the myelin protein zero gene causes Charcot-Marie-Tooth type 2 and Déjérine-Sottas syndrome

Geir J Braathen Jette C Sand Michael B Russell 《BMC research notes》2010,3(1):1-5

Background

Gene-list annotations are critical for researchers to explore the complex relationships between genes and functionalities. Currently, the annotations of a gene list are usually summarized by a table or a barplot. As such, potentially biologically important complexities such as one gene belonging to multiple annotation categories are difficult to extract. We have devised explicit and efficient visualization methods that provide intuitive methods for interrogating the intrinsic connections between biological categories and genes.

Findings

We have constructed a data model and now present two novel methods in a Bioconductor package, "GeneAnswers", to simultaneously visualize genes, concepts (a.k.a. annotation categories), and concept-gene connections (a.k.a. annotations): the "Concept-and-Gene Network" and the "Concept-and-Gene Cross Tabulation". These methods have been tested and validated with microarray-derived gene lists.

Conclusions

These new visualization methods can effectively present annotations using Gene Ontology, Disease Ontology, or any other user-defined gene annotations that have been pre-associated with an organism's genome by human curation, automated pipelines, or a combination of the two. The gene-annotation data model and associated methods are available in the Bioconductor package "GeneAnswers" described in this publication.

8.

Sequencing analysis of 20,000 full-length cDNA clones from cassava reveals lineage specific expansions in gene families related to stress response

Tetsuya Sakurai Germán Plata Fausto Rodríguez-Zapata Motoaki Seki Andrés Salcedo Atsushi Toyoda Atsushi Ishiwata Joe Tohme Yoshiyuki Sakaki Kazuo Shinozaki Manabu Ishitani 《BMC plant biology》2007,7(1):1-17

9.

Towards precise classification of cancers based on robust gene functional expression profiles

Zheng Guo Tianwen Zhang Xia Li Qi Wang Jianzhen Xu Hui Yu Jing Zhu Haiyun Wang Chenguang Wang Eric J Topol Qing Wang Shaoqi Rao 《BMC bioinformatics》2005,6(1):1-12

Background

Despite the continuous production of genome sequence for a number of organisms, reliable, comprehensive, and cost-effective gene prediction remains problematic. This is particularly true for genomes for which there is not a large collection of known gene sequences, such as the recently published chicken genome. We used the chicken sequence to test comparative and homology-based gene-finding methods followed by experimental validation as an effective genome annotation method.

Results

We performed experimental evaluation by RT-PCR of three different computational gene finders, Ensembl, SGP2 and TWINSCAN, applied to the chicken genome. A Venn diagram was computed and each component of it was evaluated. The results showed that de novo comparative methods can identify up to about 700 chicken genes with no previous evidence of expression, and can correctly extend about 40% of homology-based predictions at the 5' end.
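Evaluating "each component" of a Venn diagram means splitting the union of the three prediction sets into disjoint regions, one per combination of gene finders. The sketch below shows one way to compute that partition. It is an illustration only: the gene identifiers are hypothetical, and matching predictions by identifier is an assumption (real predictions would first need to be reconciled by genomic overlap).

```python
def venn_regions(sets):
    """Partition the union of named sets into disjoint Venn regions.

    Returns a dict mapping a frozenset of set names to the items that
    occur in exactly those sets and in no others.
    """
    names = sorted(sets)
    universe = set().union(*sets.values())
    regions = {}
    for item in universe:
        key = frozenset(name for name in names if item in sets[name])
        regions.setdefault(key, set()).add(item)
    return regions


# Hypothetical gene identifiers, for illustration only.
predictions = {
    "Ensembl": {"g1", "g2", "g3"},
    "SGP2": {"g2", "g3", "g4"},
    "TWINSCAN": {"g3", "g4", "g5"},
}
regions = venn_regions(predictions)
for finders, genes in sorted(regions.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(finders), sorted(genes))
```

Genes in regions that exclude the homology-based set are the ones a de novo comparative method contributes on its own, which is the kind of subset that would be sampled for experimental testing.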

Conclusions

De novo comparative gene prediction followed by experimental verification is effective at enhancing the annotation of the newly sequenced genomes provided by standard homology-based methods.

10.

A Drosophila full-length cDNA resource

Stapleton M Carlson J Brokstein P Yu C Champe M George R Guarin H Kronmiller B Pacleb J Park S Wan K Rubin GM Celniker SE 《Genome biology》2002,3(12):research0080.1-808

11.

Genome-wide deletion mutant analysis reveals genes required for respiratory growth, mitochondrial genome maintenance and mitochondrial protein synthesis in Saccharomyces cerevisiae

Sandra Merz Benedikt Westermann 《Genome biology》2009,10(9):1-18

12.

FragIdent - Automatic identification and characterisation of cDNA-fragments

Dominik Seelow Heike Goehler Katrin Hoffmann 《BMC genomics》2009,10(1):1-6

Background

The ubiquitin 26S/proteasome system (UPS), a serial cascade process of protein ubiquitination and degradation, is the last step for most cellular proteins. Many genes are involved in this system, but in many species they have not yet been identified. The growing availability of genomic sequence data is increasing the demands on data management and analysis. Genomics data of plants such as Populus trichocarpa, Medicago truncatula, Glycine max and others are now publicly accessible. It is time to integrate information on classes of genes for complex protein systems such as UPS.

Results

We developed a database of higher plants' UPS, named 'plantsUPS'. Both automated search and manual curation were performed in identifying candidate genes. Extensive annotations referring to each gene were generated, including basic gene characterization, protein features, GO (gene ontology) assignment, microarray probe set annotation and expression data, as well as cross-links among different organisms. A chromosome distribution map, multi-sequence alignment, and phylogenetic trees for each species or gene family were also created. A user-friendly web interface and regular updates make plantsUPS valuable to researchers in related fields.

Conclusion

The plantsUPS enables the exploration and comparative analysis of UPS in higher plants. It now archives > 8000 genes from seven plant species distributed in 11 UPS-involved gene families. The plantsUPS is freely available now to all users at http://bioinformatics.cau.edu.cn/plantsUPS.

13.

A comprehensive transcript index of the human genome generated using microarrays and computational approaches

Schadt EE Edwards SW GuhaThakurta D Holder D Ying L Svetnik V Leonardson A Hart KW Russell A Li G Cavet G Castle J McDonagh P Kan Z Chen R Kasarskis A Margarint M Caceres RM Johnson JM Armour CD Garrett-Engele PW Tsinoremas NF Shoemaker DD 《Genome biology》2004,5(10):R73-17

14.

Enrichment of Triticum aestivum gene annotations using ortholog cliques and gene ontologies in other plants

Dan Tulpan Serge Leger Alain Tchagang Youlian Pan 《BMC genomics》2015,16(1)

Background

While the gargantuan multi-nation effort of sequencing T. aestivum nears completion, the annotation process for the vast number of wheat genes and proteins is in its infancy. Previous experimental studies carried out on model plant organisms such as A. thaliana and O. sativa provide a plethora of gene annotations that can be used as potential starting points for wheat gene annotations, provided that solid cross-species gene-to-gene and protein-to-protein correspondences are available.

Results

DNA and protein sequences and corresponding annotations for T. aestivum and 9 other plant species were collected from Ensembl Plants release 22 and curated. Cliques of predicted 1-to-1 orthologs were identified and an annotation enrichment model was defined based on existing gene-GO term associations and phylogenetic relationships among wheat and 9 other plant species. A total of 13 cliques of size 10 were identified, which represent putative functionally equivalent genes and proteins in the 10 plant species. Eighty-five new and more specific GO terms were associated with wheat genes in the 13 cliques of size 10, which represents a 65% increase over the 130 previously known GO terms. Similar expression patterns for 4 genes from Arabidopsis, barley, maize and rice in cliques of size 10 provide experimental evidence to support our model. Overall, based on cliques of size 3 or larger, our model enriched the existing gene-GO term associations for 7,838 (8%) wheat genes, of which 2,139 had no previous annotation.
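The idea of an ortholog clique can be made concrete with a short sketch: treat each gene as a node, connect two genes when they are predicted 1-to-1 orthologs, and look for groups in which every pair is connected. The code below is a generic Bron-Kerbosch enumeration with pivoting, not the authors' pipeline. The species labels, gene identifiers and ortholog pairs are hypothetical.

```python
def build_graph(pairs):
    """Build an undirected adjacency map from predicted ortholog pairs."""
    adj = {}
    for a, b in pairs:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def maximal_cliques(adj):
    """Yield every maximal clique (Bron-Kerbosch with pivoting)."""
    def expand(r, p, x):
        if not p and not x:
            yield frozenset(r)
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    yield from expand(set(), set(adj), set())


# Hypothetical 1-to-1 ortholog predictions, as (species, gene) nodes.
pairs = [
    (("wheat", "Ta1"), ("rice", "Os1")),
    (("wheat", "Ta1"), ("barley", "Hv1")),
    (("rice", "Os1"), ("barley", "Hv1")),
    (("wheat", "Ta2"), ("rice", "Os2")),
]
cliques = [c for c in maximal_cliques(build_graph(pairs)) if len(c) >= 3]
for clique in cliques:
    print(sorted(clique))
```

In such a scheme, GO terms attached to the non-wheat members of a clique become candidates for transfer to the wheat member. The size filter of 3 mirrors the minimum clique size mentioned in the abstract.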

Conclusions

Our novel comparative genomics approach enriches existing T. aestivum gene annotations based on cliques of predicted 1-to-1 orthologs, phylogenetic relationships and existing gene ontologies from 9 other plant species.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1496-2) contains supplementary material, which is available to authorized users.

15.

Complete reannotation of the Arabidopsis genome: methods, tools, protocols and the final release

Brian J Haas Jennifer R Wortman Catherine M Ronning Linda I Hannick Roger K Smith Jr Rama Maiti Agnes P Chan Chunhui Yu Maryam Farzad Dongying Wu Owen White Christopher D Town 《BMC biology》2005,3(1):1-19

Background

Since the initial publication of its complete genome sequence, Arabidopsis thaliana has become more important than ever as a model for plant research. However, the initial genome annotation was submitted by multiple centers using inconsistent methods, making the data difficult to use for many applications.

Results

Over the course of three years, TIGR has completed its effort to standardize the structural and functional annotation of the Arabidopsis genome. Using both manual and automated methods, Arabidopsis gene structures were refined and gene products were renamed and assigned to Gene Ontology categories. We present an overview of the methods employed, tools developed, and protocols followed, summarizing the contents of each data release with special emphasis on our final annotation release (version 5).

Conclusion

Over the entire period, several thousand new genes and pseudogenes were added to the annotation. Approximately one third of the originally annotated gene models were significantly refined, yielding improved gene structure annotations, and every protein-coding gene was manually inspected and classified using Gene Ontology terms.

16.

EGASP: the human ENCODE Genome Annotation Assessment Project

Guigó R Flicek P Abril JF Reymond A Lagarde J Denoeud F Antonarakis S Ashburner M Bajic VB Birney E Castelo R Eyras E Ucla C Gingeras TR Harrow J Hubbard T Lewis SE Reese MG 《Genome biology》2006,7(Z1):S2.1-S231

17.

Digital expression profiling of novel diatom transcripts provides insight into their biological functions (total citations: 1; self-citations: 0; citations by others: 1)

Uma Maheswari Kamel Jabbari Jean-Louis Petit Betina M Porcel Andrew E Allen Jean-Paul Cadoret Alessandra De Martino Marc Heijde Raymond Kaas Julie La Roche Pascal J Lopez Véronique Martin-Jézéquel Agnès Meichenin Thomas Mock Micaela Schnitzler Parker Assaf Vardi E Virginia Armbrust Jean Weissenbach Michaël Katinka Chris Bowler 《Genome biology》2010,11(8):1-19

18.

CodingQuarry: highly accurate hidden Markov model gene prediction in fungal genomes using RNA-seq transcripts

Alison C Testa James K Hane Simon R Ellwood Richard P Oliver 《BMC genomics》2015,16(1)

19.

The Yak genome database: an integrative database for studying yak biology and high-altitude adaption (total citations: 1; self-citations: 0; citations by others: 1)

Quanjun Hu Tao Ma Kun Wang Ting Xu Jianquan Liu Qiang Qiu 《BMC genomics》2012,13(1):1-5

20.

A BAC-based integrated linkage map of the silkworm Bombyx mori (total citations: 3; self-citations: 0; citations by others: 3)

Yamamoto K Nohata J Kadono-Okuda K Narukawa J Sasanuma M Sasanuma S Minami H Shimomura M Suetsugu Y Banno Y Osoegawa K de Jong PJ Goldsmith MR Mita K 《Genome biology》2008,9(1):R21-14

Background

In 2004, draft sequences of the model lepidopteran Bombyx mori were reported using whole-genome shotgun sequencing. Because of relatively shallow genome coverage, the silkworm genome remains fragmented, hampering annotation and comparative genome studies. For a more complete genome analysis, we developed extended scaffolds combining physical maps with improved genetic maps.

Results

We mapped 1,755 single nucleotide polymorphism (SNP) markers from bacterial artificial chromosome (BAC) end sequences onto 28 linkage groups using a recombining male backcross population, yielding an average inter-SNP distance of 0.81 cM (about 270 kilobases). We constructed 6,221 contigs by fingerprinting clones from three BAC libraries digested with different restriction enzymes, and assigned a total of 724 single copy genes to them by BLAST (basic local alignment search tool) search of the BAC end sequences and high-density BAC filter hybridization using expressed sequence tags as probes. We assigned 964 additional expressed sequence tags to linkage groups by restriction fragment length polymorphism analysis of a nonrecombining female backcross population. Altogether, 361.1 megabases of BAC contigs and singletons were integrated with a map containing 1,688 independent genes. A test of synteny using Oxford grid analysis with more than 500 silkworm genes revealed six versus 20 silkworm linkage groups containing eight or more orthologs of Apis versus Tribolium, respectively.

Conclusion

The integrated map contains approximately 10% of predicted silkworm genes and has an estimated 76% genome coverage by BACs. This provides a new resource for improved assembly of whole-genome shotgun data, gene annotation and positional cloning, and will serve as a platform for comparative genomics and gene discovery in Lepidoptera and other insects.