Similar literature
20 similar documents found
1.

Background

Most eukaryotic genomes include a substantial repeat-rich fraction termed heterochromatin, which is concentrated in centric and telomeric regions. The repetitive nature of heterochromatic sequence makes it difficult to assemble and analyze. To better understand the heterochromatic component of the Drosophila melanogaster genome, we characterized and annotated portions of a whole-genome shotgun sequence assembly.

Results

WGS3, an improved whole-genome shotgun assembly, includes 20.7 Mb of draft-quality sequence not represented in the Release 3 sequence spanning the euchromatin. We annotated this sequence using the methods employed in the re-annotation of the Release 3 euchromatic sequence. This analysis predicted 297 protein-coding genes and six non-protein-coding genes, including known heterochromatic genes, and regions of similarity to known transposable elements. Bacterial artificial chromosome (BAC)-based fluorescence in situ hybridization analysis was used to correlate the genomic sequence with the cytogenetic map in order to refine the genomic definition of the centric heterochromatin; on the basis of our cytological definition, the annotated Release 3 euchromatic sequence extends into the centric heterochromatin on each chromosome arm.

Conclusions

Whole-genome shotgun assembly produced a reliable draft-quality sequence of a significant part of the Drosophila heterochromatin. Annotation of this sequence defined the intron-exon structures of 30 known protein-coding genes and 267 protein-coding gene models. The cytogenetic mapping suggests that an additional 150 predicted genes are located in heterochromatin at the base of the Release 3 euchromatic sequence. Our analysis suggests strategies for improving the sequence and annotation of the heterochromatic portions of the Drosophila and other complex genomes.

2.
3.
Syngenta claims ownership of rice - but will give data away

4.

Background

Transposable elements are found in the genomes of nearly all eukaryotes. The recent completion of the Release 3 euchromatic genomic sequence of Drosophila melanogaster by the Berkeley Drosophila Genome Project has provided precise sequence for the repetitive elements in the Drosophila euchromatin. We have used this genomic sequence to describe the euchromatic transposable elements in the sequenced strain of this species.

Results

We identified 85 known and eight novel families of transposable element varying in copy number from one to 146. A total of 1,572 full and partial transposable elements were identified, comprising 3.86% of the sequence. More than two-thirds of the transposable elements are partial. The density of transposable elements increases an average of 4.7 times in the centromere-proximal regions of each of the major chromosome arms. We found that transposable elements are preferentially found outside genes; only 436 of 1,572 transposable elements are contained within the 61.4 Mb of sequence that is annotated as being transcribed. A large proportion of transposable elements is found nested within other elements of the same or different classes. Lastly, an analysis of structural variation from different families reveals distinct patterns of deletion for elements belonging to different classes.
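The claim that transposable elements fall preferentially outside genes can be checked with a rough back-of-the-envelope calculation. The sketch below compares the observed share of elements inside transcribed sequence with the share expected if elements were spread uniformly. It assumes a total Release 3 euchromatic length of 116.9 Mb, a figure taken from entry 9 of this listing rather than from this abstract, and the names used are illustrative only.

```python
# Figures quoted in the abstract above.
TOTAL_TES = 1572            # full and partial transposable elements
TES_IN_TRANSCRIBED = 436    # elements inside annotated transcribed sequence
TRANSCRIBED_MB = 61.4       # Mb annotated as transcribed

# Assumption: total Release 3 euchromatic length, taken from entry 9 in this
# listing, not from this abstract.
RELEASE3_MB = 116.9


def enrichment(observed_fraction: float, expected_fraction: float) -> float:
    """Ratio of observed to expected fraction (a value below 1 means depletion)."""
    return observed_fraction / expected_fraction


# Share of all elements that lie inside transcribed sequence.
observed = TES_IN_TRANSCRIBED / TOTAL_TES
# Share of the sequence that is transcribed, i.e. the uniform expectation.
expected = TRANSCRIBED_MB / RELEASE3_MB
ratio = enrichment(observed, expected)

print(f"observed fraction in transcribed sequence: {observed:.3f}")
print(f"expected under uniform placement:          {expected:.3f}")
print(f"observed / expected:                       {ratio:.2f}")
```

Under these assumptions, roughly 28% of elements lie in transcribed sequence against roughly 53% expected by length alone, which is consistent with the depletion the abstract describes. This is not the authors' own statistical test.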

Conclusions

This analysis is an initial characterization of the transposable elements in the Release 3 euchromatic genomic sequence of D. melanogaster, from which comparisons to the transposable elements of other organisms can begin to be made. These data have been made available on the Berkeley Drosophila Genome Project website for future analyses.

5.
6.
7.
8.

Background

Thellungiella halophila (also known as Thellungiella salsuginea) is a model halophyte with a small plant size, short life cycle, and small genome. It easily undergoes genetic transformation by the floral dipping method used with its close relative, Arabidopsis thaliana. Thellungiella genes exhibit high sequence identity (approximately 90% at the cDNA level) with Arabidopsis genes. Furthermore, Thellungiella not only shows tolerance to extreme salinity stress, but also to chilling, freezing, and ozone stress, supporting the use of Thellungiella as a good genomic resource in studies of abiotic stress tolerance.

Results

We constructed a full-length enriched Thellungiella (Shan Dong ecotype) cDNA library from various tissues and whole plants subjected to environmental stresses, including high salinity, chilling, freezing, and abscisic acid treatment. We randomly selected about 20 000 clones and sequenced them from both ends to obtain a total of 35 171 sequences. CAP3 software was used to assemble the sequences and cluster them into 9569 nonredundant cDNA groups. We named these cDNAs "RTFL" (RIKEN Thellungiella Full-Length) cDNAs. Information on functional domains and Gene Ontology (GO) terms for the RTFL cDNAs was obtained using InterPro. The 8289 genes assigned to InterPro IDs were classified according to the GO terms using Plant GO Slim. Categorical comparison between the whole Arabidopsis genome and Thellungiella genes showing low identity to Arabidopsis genes revealed that the population of Thellungiella transport genes is approximately 1.5 times the size of the corresponding population of Arabidopsis genes. This suggests that these genes regulate a unique ion transportation system in Thellungiella.

Conclusion

The number of Thellungiella halophila (Thellungiella salsuginea) expressed sequence tags (ESTs) was 9388 in July 2008; as a result of this effort, the number of ESTs has increased to approximately four times that value. Our sequences will thus contribute to correct future annotation of the Thellungiella genome sequence. The full-length enriched cDNA clones will enable the construction of overexpressing mutant plants by introduction of the cDNAs driven by a constitutive promoter, the complementation of Thellungiella mutants, and the determination of promoter regions in the Thellungiella genome.

9.

Background

The Drosophila melanogaster genome was the first metazoan genome to have been sequenced by the whole-genome shotgun (WGS) method. Two issues relating to this achievement were widely debated in the genomics community: how correct is the sequence with respect to base-pair (bp) accuracy and frequency of assembly errors? And, how difficult is it to bring a WGS sequence to the accepted standard for finished sequence? We are now in a position to answer these questions.

Results

Our finishing process was designed to close gaps, improve sequence quality and validate the assembly. Sequence traces derived from the WGS and draft sequencing of individual bacterial artificial chromosomes (BACs) were assembled into BAC-sized segments. These segments were brought to high quality, and then joined to constitute the sequence of each chromosome arm. Overall assembly was verified by comparison to a physical map of fingerprinted BAC clones. In the current version of the 116.9 Mb euchromatic genome, called Release 3, the six euchromatic chromosome arms are represented by 13 scaffolds with a total of 37 sequence gaps. We compared Release 3 to Release 2; in autosomal regions of unique sequence, the error rate of Release 2 was one in 20,000 bp.
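For readers used to quality scores, the reported Release 2 error rate of one in 20,000 bp can be put on the Phred scale, Q = -10 log10(p). The minimal sketch below does that conversion. Expressing the figure as a Phred-style value is an illustration added here, not something the abstract states, and the function name is hypothetical.

```python
import math


def phred_quality(error_rate: float) -> float:
    """Convert a per-base error probability to a Phred-scaled quality value."""
    if not 0.0 < error_rate <= 1.0:
        raise ValueError("error_rate must be in the interval (0, 1]")
    return -10.0 * math.log10(error_rate)


# Error rate quoted in the abstract for Release 2 (autosomal unique sequence).
release2_error_rate = 1 / 20_000
release2_q = phred_quality(release2_error_rate)

print(f"error rate {release2_error_rate:.0e} corresponds to about Q{release2_q:.1f}")
```

One error in 20,000 bp works out to roughly Q43, since log10(20,000) is about 4.30.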

Conclusions

The WGS strategy can efficiently produce a high-quality sequence of a metazoan genome while generating the reagents required for sequence finishing. However, the initial method of repeat assembly was flawed. The sequence we report here, Release 3, is a reliable resource for molecular genetic experimentation and computational analysis.

10.

Background

It is widely accepted that comparative sequence data can aid the functional annotation of genome sequences; however, the most informative species and features of genome evolution for comparison remain to be determined.

Results

We analyzed conservation in eight genomic regions (apterous, even-skipped, fushi tarazu, twist, and Rhodopsins 1, 2, 3 and 4) from four Drosophila species (D. erecta, D. pseudoobscura, D. willistoni, and D. littoralis) covering more than 500 kb of the D. melanogaster genome. All D. melanogaster genes (and 78-82% of coding exons) identified in divergent species such as D. pseudoobscura show evidence of functional constraint. Addition of a third species can reveal functional constraint in otherwise non-significant pairwise exon comparisons. Microsynteny is largely conserved, with rearrangement breakpoints, novel transposable element insertions, and gene transpositions occurring in similar numbers. Rates of amino-acid substitution are higher in uncharacterized genes relative to genes that have previously been studied. Conserved non-coding sequences (CNCSs) tend to be spatially clustered with conserved spacing between CNCSs, and clusters of CNCSs can be used to predict enhancer sequences.

Conclusions

Our results provide the basis for choosing species whose genome sequences would be most useful in aiding the functional annotation of coding and cis-regulatory sequences in Drosophila. Furthermore, this work shows how decoding the spatial organization of conserved sequences, such as the clustering of CNCSs, can complement efforts to annotate eukaryotic genomes on the basis of sequence conservation alone.

11.
12.
13.

Background

Dystroglycan (Dg) is a transmembrane protein that is part of the Dystrophin Glycoprotein Complex (DGC), which connects the extracellular matrix to the actin cytoskeleton. The C-terminal end of Dg contains a number of putative SH3, SH2 and WW domain binding sites. The most C-terminal PPXY motif has been established as a binding site for the Dystrophin (Dys) WW domain. However, our previous studies indicate that both Dystroglycan PPXY motifs, WWbsI and WWbsII, can bind Dystrophin protein in vitro.

Results

We now find that both WW binding sites are important for maintaining full Dg function in the establishment of oocyte polarity in Drosophila. If either WW binding site is mutated, the Dg protein can still be active. However, simultaneous mutations in both WW binding sites abolish the Dg activities in both overexpression and loss-of-function oocyte polarity assays in vivo. Additionally, sequence comparisons of WW binding sites in 12 species of Drosophila, as well as in humans, reveal a high level of conservation. This preservation throughout evolution supports the idea that both WW binding sites are functionally required.

Conclusion

Based on these results, we propose that the presence of the two WW binding sites in Dystroglycan secures the essential interaction between Dg and Dys and might further provide additional regulation for the cytoskeletal interactions of this complex.

14.

Background

In eukaryotic cells, oxidative phosphorylation (OXPHOS) uses the products of both nuclear and mitochondrial genes to generate cellular ATP. Interspecies comparative analysis of these genes, which appear to be under strong functional constraints, may shed light on the evolutionary mechanisms that act on a set of genes correlated by function and subcellular localization of their products.

Results

We have identified and annotated the Drosophila melanogaster, D. pseudoobscura and Anopheles gambiae orthologs of 78 nuclear genes encoding mitochondrial proteins involved in oxidative phosphorylation by a comparative analysis of their genomic sequences and organization. We have also identified 47 genes in these three dipteran species each of which shares significant sequence homology with one of the above-mentioned OXPHOS orthologs, and which are likely to have originated by duplication during evolution. Gene structure and intron length are essentially conserved in the three species, although gain or loss of introns is common in A. gambiae. In most tissues of D. melanogaster and A. gambiae the expression level of the duplicate gene is much lower than that of the original gene, and in D. melanogaster at least, its expression is almost always strongly testis-biased, in contrast to the soma-biased expression of the parent gene.

Conclusions

Quickly achieving an expression pattern different from the parent genes may be required for new OXPHOS gene duplicates to be maintained in the genome. This may be a general evolutionary mechanism for originating phenotypic changes that could lead to species differentiation.

15.
DNA Research, 2008, 15(6): 333-346
A large collection of full-length cDNAs is essential for the correct annotation of genomic sequences and for the functional analysis of genes and their products. We obtained a total of 39 936 soybean cDNA clones (GMFL01 and GMFL02 clone sets) in a full-length-enriched cDNA library which was constructed from soybean plants that were grown under various developmental and environmental conditions. Sequencing from 5′ and 3′ ends of the clones generated 68 661 expressed sequence tags (ESTs). The EST sequences were clustered into 22 674 scaffolds involving 2580 full-length sequences. In addition, we sequenced 4712 full-length cDNAs. After removing overlaps, we have so far obtained 6570 new full-length soybean cDNA sequences. Our data indicated that 87.7% of the soybean cDNA clones contain complete coding sequences in addition to 5′- and 3′-untranslated regions. All of the obtained data confirmed that our collection of soybean full-length cDNAs covers a wide variety of genes. Comparative analysis between the derived soybean sequences and data from Arabidopsis, rice or other legumes revealed that our collection includes some specific genes, a large part of which are annotated as having unknown functions. The large set of soybean full-length cDNA clones reported in this study will serve as a useful resource for gene discovery in soybean and will also aid precise annotation of the soybean genome.
Key words: EST, full-length cDNA, functional annotation, legume, soybean

16.

Background

Infection of plants by pathogens and the subsequent disease development involves substantial changes in the biochemistry and physiology of both partners. Analysis of genes that are expressed during these interactions represents a powerful strategy to obtain insights into the molecular events underlying these changes. We have employed expressed sequence tag (EST) analysis to identify rice genes involved in defense responses against infection by the blast fungus Magnaporthe oryzae and fungal genes involved in infectious growth within the host during a compatible interaction.

Results

A cDNA library was constructed with RNA from rice leaves (Oryza sativa cv. Hwacheong) infected with M. oryzae strain KJ201. To enrich for fungal genes, a subtraction library was constructed by PCR-based suppression subtractive hybridization, with RNA from infected rice leaves as the tester and RNA from uninfected rice leaves as the driver. A total of 4,148 clones from the two libraries were sequenced to generate 2,302 non-redundant ESTs. Of these, 712 and 1,562 ESTs could be identified as encoding fungal and rice genes, respectively. To predict gene function, Gene Ontology (GO) analysis was applied, with 31% and 32% of rice and fungal ESTs being assigned to GO terms, respectively. One hundred uniESTs were found to be specific to the fungal infection ESTs. More than 80 full-length fungal cDNA sequences were used to validate ab initio annotated gene models of the M. oryzae genome sequence.

Conclusion

This study shows the power of ESTs to refine genome annotation and functional characterization. Results of this work have advanced our understanding of the molecular mechanisms underpinning fungal-plant interactions and formed the basis for new hypotheses.

17.

Background

Since the initial publication of its complete genome sequence, Arabidopsis thaliana has become more important than ever as a model for plant research. However, the initial genome annotation was submitted by multiple centers using inconsistent methods, making the data difficult to use for many applications.

Results

Over the course of three years, TIGR has completed its effort to standardize the structural and functional annotation of the Arabidopsis genome. Using both manual and automated methods, Arabidopsis gene structures were refined and gene products were renamed and assigned to Gene Ontology categories. We present an overview of the methods employed, tools developed, and protocols followed, summarizing the contents of each data release with special emphasis on our final annotation release (version 5).

Conclusion

Over the entire period, several thousand new genes and pseudogenes were added to the annotation. Approximately one third of the originally annotated gene models were significantly refined, yielding improved gene structure annotations, and every protein-coding gene was manually inspected and classified using Gene Ontology terms.

18.
19.

Background

Discovering the functions of all genes is a central goal of contemporary biomedical research. Despite considerable effort, we are still far from achieving this goal in any metazoan organism. Collectively, the growing body of high-throughput functional genomics data provides evidence of gene function, but remains difficult to interpret.

Results

We constructed the first network of functional relationships for Drosophila melanogaster by integrating most of the available, comprehensive sets of genetic interaction, protein-protein interaction, and microarray expression data. The complete integrated network covers 85% of the currently known genes, which we refined to a high confidence network that includes 20,000 functional relationships among 5,021 genes. An analysis of the network revealed a remarkable concordance with prior knowledge. Using the network, we were able to infer a set of high-confidence Gene Ontology biological process annotations on 483 of the roughly 5,000 previously unannotated genes. We also show that this approach is a means of inferring annotations on a class of genes that cannot be annotated based solely on sequence similarity. Lastly, we demonstrate the utility of the network through reanalyzing gene expression data to both discover clusters of coregulated genes and compile a list of candidate genes related to specific biological processes.
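The size of the high-confidence network gives a feel for how sparse it is. The sketch below computes the mean number of relationships per gene and the network density from the two counts in the abstract. It assumes an undirected network without self-links or duplicate links, and it treats 20,000 as an exact count although the abstract's figure may be rounded. The helper names are illustrative.

```python
# Counts quoted in the abstract for the high-confidence network.
GENES = 5021
RELATIONSHIPS = 20_000      # possibly a rounded figure in the abstract

# Annotation figures quoted in the abstract.
NEWLY_ANNOTATED = 483
PREVIOUSLY_UNANNOTATED = 5000   # "roughly 5,000" in the abstract


def mean_degree(n_nodes: int, n_edges: int) -> float:
    """Average number of edges per node in an undirected graph."""
    return 2 * n_edges / n_nodes


def density(n_nodes: int, n_edges: int) -> float:
    """Fraction of all possible undirected node pairs that are linked."""
    possible_pairs = n_nodes * (n_nodes - 1) / 2
    return n_edges / possible_pairs


avg_links = mean_degree(GENES, RELATIONSHIPS)
net_density = density(GENES, RELATIONSHIPS)
annotated_share = NEWLY_ANNOTATED / PREVIOUSLY_UNANNOTATED

print(f"mean relationships per gene: {avg_links:.2f}")
print(f"network density:             {net_density:.5f}")
print(f"share newly annotated:       {annotated_share:.1%}")
```

Under these assumptions each gene has about eight relationships on average, fewer than 0.2% of possible gene pairs are linked, and a little under 10% of the previously unannotated genes gain an inferred annotation.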

Conclusions

Here we present the first genome-wide functional gene network in D. melanogaster. The network enables the exploration, mining, and reanalysis of experimental data, as well as the interpretation of new data. The inferred annotations provide testable hypotheses for previously uncharacterized genes.

20.

Background

Large-scale RNA interference (RNAi) approaches are very valuable for systematically investigating biological processes in cell culture or in tissues of organisms such as Drosophila. A notorious pitfall of all RNAi technologies is the potential for false positives caused by nonspecific knock-down of genes other than the intended target gene. The ultimate proof of RNAi specificity is a rescue by a construct immune to RNAi, typically originating from a related species.

Methodology/Principal Findings

We show that primary sequence divergence in areas targeted by Drosophila melanogaster RNAi hairpins in five non-melanogaster species is sufficient to identify orthologs for 81% of the genes that are predicted to be RNAi refractory. We use clones from a genomic fosmid library of Drosophila pseudoobscura to demonstrate the rescue of RNAi phenotypes in Drosophila melanogaster muscles. Four out of five fosmid clones we tested harbour cross-species functionality for the gene assayed, and three out of the four rescue an RNAi phenotype in Drosophila melanogaster.

Conclusions/Significance

The Drosophila pseudoobscura fosmid library is designed for seamless cross-species transgenesis and can be readily used to demonstrate specificity of RNAi phenotypes in a systematic manner.
