Similar articles
1.
The only natural mechanism of malaria transmission in sub-Saharan Africa is the mosquito, generally Anopheles gambiae. Blocking malaria parasite transmission by stopping the development of Plasmodium in the insect vector would provide a useful alternative to the current methods of malaria control. Toward this end, it is important to understand the molecular basis of the malaria parasite refractory phenotype in An. gambiae mosquito strains. We have selected and sequenced six bacterial artificial chromosome (BAC) clones from the Pen-1 region, which is the major quantitative trait locus involved in Plasmodium encapsulation. The sequence and annotation of five overlapping BAC clones plus one adjacent, but not contiguous, clone, totaling 585 kb of genomic sequence from the centromeric end of the Pen-1 region of the PEST strain, were compared to the genome sequence of the same strain produced by the whole genome shotgun technique. This project identified 23 putative mosquito genes plus putative copies of the retrotransposable elements BEL12 and TRANSIBN1_AG in the six BAC clones. Nineteen of the predicted genes are most similar to their Drosophila melanogaster homologs while one is more closely related to vertebrate genes. Comparison of these new BAC sequences plus previously published BAC sequences to the cognate region of the assembled genome sequence identified three retrotransposons present in one sequence version but not the other. One of these elements, Indy, has not been previously described. These observations provide evidence for the recent active transposition of these elements and demonstrate the plasticity of the Anopheles genome. The BAC sequences strongly support the public whole genome shotgun assembly and automatic annotation while also demonstrating the benefit of complementary genome sequences and of human curation. Importantly, the data demonstrate the differences between the genome sequence of an individual mosquito and the hypothetical, average genome sequence generated by whole genome shotgun assembly.

2.
Gene and SNP annotation are among the first and most important steps in analyzing a genome. As the number of sequenced genomes continues to grow, a key question is: how does the quality of the assembled sequence affect the annotations? We compared the gene and SNP annotations for two different Bos taurus genome assemblies built from the same data but with significant improvements in the later assembly. The same annotation software was used for annotating both sequences. While some annotation differences are expected even between high-quality assemblies such as these, we found that a staggering 40% of the genes (>9,500) varied significantly between assemblies, due in part to the availability of new gene evidence but primarily to genome mis-assembly events and local sequence variations. For instance, although the later assembly is generally superior, 660 protein coding genes in the earlier assembly are entirely missing from the later genome's annotation, and approximately 3,600 (15%) of the genes have complex structural differences between the two assemblies. In addition, 12–20% of the predicted proteins in both assemblies have relatively large sequence differences when compared to their RefSeq models, and 6–15% of bovine dbSNP records are unrecoverable in the two assemblies. Our findings highlight the consequences of genome assembly quality on gene and SNP annotation and argue for continued improvements in any draft genome sequence. We also found that tracking a gene between different assemblies of the same genome is surprisingly difficult, due to the numerous changes, both small and large, that occur in some genes. As a side benefit, our analyses helped us identify many specific loci for improvement in the Bos taurus genome assembly.

3.
4.
Bioinformatics challenges of new sequencing technology   Total citations: 8 (self-citations: 0; citations by others: 8)
New DNA sequencing technologies can sequence up to one billion bases in a single day at low cost, putting large-scale sequencing within the reach of many scientists. Many researchers are forging ahead with projects to sequence a range of species using the new technologies. However, these new technologies produce read lengths as short as 35-40 nucleotides, posing challenges for genome assembly and annotation. Here we review the challenges and describe some of the bioinformatics systems that are being proposed to solve them. We specifically address issues arising from using these technologies in assembly projects, both de novo and for resequencing purposes, as well as efforts to improve genome annotation in the fragmented assemblies produced by short read lengths.

5.
The human genome reference assembly is crucial for aligning and analyzing sequence data, and for genome annotation, among other roles. However, the models and analysis assumptions that underlie the current assembly need revising to fully represent human sequence diversity. Improved analysis tools and updated data reporting formats are also required.

6.
The Anopheles gambiae genome sequence has been analyzed to find ATP-binding cassette protein genes based on deduced protein similarity to known family members. A nonredundant collection of 44 putative genes was identified, including five genes not detected by the original Anopheles genome project machine annotation. These genes encode at least one member of all the human and Drosophila melanogaster ATP-binding protein subgroups. Like D. melanogaster, A. gambiae has subgroup ABCH genes encoding proteins different from the ABC proteins found in other complex organisms. The largest Anopheles subgroup is the ABCC genes, which includes one member that can potentially encode ten different isoforms of the protein by differential splicing. As with Drosophila, the second largest Anopheles group is the ABCG subgroup, with 12 genes compared to 15 genes in D. melanogaster but only 5 genes in the human genome. In contrast, fewer ABCA and ABCB genes were identified in the mosquito genome than in the human or Drosophila genomes. Gene duplication is very evident in the Anopheles ABC genes, with two groups of four genes, one group with three genes and three groups with two head-to-tail duplicated genes. These characteristics argue that A. gambiae is actively using gene duplication as a mechanism to drive genetic variation in this important gene group.

7.
8.
Methylobacterium sp. strain GXF4 is an isolate from grapevine. Here we present the sequence, assembly, and annotation of its genome, which may shed light on its role as a grapevine xylem inhabitant. To our knowledge, this is the first genome announcement of a plant xylem-associated strain of the genus Methylobacterium.

9.
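Advances in microbial genome research   Total citations: 5 (self-citations: 1; citations by others: 5)
This review covers the basic methods of whole-genome sequencing of microorganisms, data collection and assembly, the filling of sequence gaps, and whole-genome sequence annotation. It also briefly outlines the current state and the significance of microbial genome research.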

10.
11.
MOTIVATION: Contigs-Assembly and Annotation Tool-Box (CAAT-Box) is a software package developed for the computational part of a genome project where the sequence is obtained by a shotgun strategy. CAAT-Box contains new tools to predict links between contigs by using similarity searches with other whole genome sequences. Most importantly, it allows annotation of a genome to commence during the finishing phase using a gene-oriented strategy. For this purpose, CAAT-Box creates an Individual Protein file (IPF) for each ORF of an assembly. The nucleotide sequence reported in an IPF corresponds to the sequence of the ORF with 500 additional bases before the ORF and 200 bases after. For annotation, additional information such as Blast results can be added or linked to the IPFs, as well as automatic and/or manual annotations. When a new assembly is performed, CAAT-Box creates new IPFs according to the old IPF panel. CAAT-Box recognizes the modified IPFs, which are the only ones used for a new automatic analysis after each assembly. Using this strategy, the user works with a group of IPFs independently of the closure phase progression. The IPFs are accessible through a web server and can therefore be modified and commented on by different groups. RESULT: CAAT-Box was used to obtain and to annotate several complete genomes, such as those of Listeria monocytogenes and Streptococcus agalactiae. AVAILABILITY: The program may be obtained from the authors and is freely available to non-profit organisations.
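The flanking rule stated in the abstract above (the ORF plus 500 bases before and 200 bases after) can be illustrated with a minimal sketch. This is not CAAT-Box code: the function name `ipf_window`, the 0-based half-open coordinates, the forward-strand-only handling, and the clipping at contig ends are all assumptions made for illustration.

```python
def ipf_window(contig: str, orf_start: int, orf_end: int,
               upstream: int = 500, downstream: int = 200) -> str:
    """Return the ORF together with its flanking bases.

    Hypothetical helper, not part of CAAT-Box. Coordinates are assumed
    to be 0-based, half-open, and on the forward strand. Flanks that
    would run past a contig end are clipped (an assumption; the
    abstract does not say how such cases are handled).
    """
    if not 0 <= orf_start < orf_end <= len(contig):
        raise ValueError("ORF coordinates fall outside the contig")
    left = max(0, orf_start - upstream)            # clip at contig start
    right = min(len(contig), orf_end + downstream)  # clip at contig end
    return contig[left:right]


# Usage: a toy contig with a 9-base ORF starting at position 600.
contig = "A" * 600 + "ATGAAATAA" + "C" * 300
window = ipf_window(contig, 600, 609)
print(len(window))  # 709 = 500 upstream + 9 ORF + 200 downstream
```

Keeping a fixed window around each ORF means the extracted sequence stays comparable from one assembly to the next, which is consistent with the abstract's description of tracking modified IPFs, although the actual comparison logic is not described there.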

12.
HM Gan, TH Chew, YL Tay, SF Lye, A Yahya. Journal of Bacteriology 2012, 194(17): 4759-4760
Hydrogenophaga sp. strain PBC is an effective degrader of 4-aminobenzenesulfonate isolated from textile wastewater. Here we present the assembly and annotation of its genome, which may provide further insights into its metabolic potential. This is the first announcement of the draft genome sequence of a strain from the genus Hydrogenophaga.

13.
The ascomycetous yeast Wickerhamomyces anomalus (formerly Pichia anomala and Hansenula anomala) exhibits antimicrobial activities and flavoring features that are responsible for its frequent association with food, beverage and feed products. However, limited information on the genetic background of this yeast and its multiple capabilities is currently available. Here, we present the draft genome sequence of the neotype strain W. anomalus DSM 6766. On the basis of pyrosequencing, a de novo assembly of this strain resulted in a draft genome sequence with a total size of 25.47 Mbp. An automatic annotation using RAPYD generated 11,512 protein-coding sequences. This annotation provided the basis to analyse metabolic capabilities, phylogenetic relationships, as well as biotechnologically important features, and yielded novel candidate genes of W. anomalus DSM 6766 coding for proteins participating in antimicrobial activities.

14.
AphidBase: a database for aphid genomic resources   Total citations: 1 (self-citations: 0; citations by others: 1)

15.
Marsano RM, Caizzi R. Gene 2005, 357(2): 115-121
The advanced status of assembly of the nematoceran Anopheles gambiae genomic sequence allowed us to perform a genome-wide analysis looking for the presence of Long Terminal Repeats (LTRs) in the range of 10 kb by means of the LTR_STRUC tool. More than three hundred sequences were retrieved, and 210 were treated as putative complete retrotransposons that were individually analysed with respect to known retrotransposons of A. gambiae and D. melanogaster. The results show that the vast majority of the retrotransposons analysed belong to the Ty3/gypsy class and only 8% to the Ty1/copia class. In addition, phylogenetic analysis allowed us to characterize in more detail the relationship of a large BEL-Pao lineage in which a single family was shown to harbour an additional env gene.

16.
The karyotype of the African malaria mosquito Anopheles gambiae contains two pairs of autosomes and a pair of sex chromosomes. The Y chromosome, constituting approximately 10% of the genome, remains virtually unexplored, despite the recent completion of the A. gambiae genome project. Here we report the identification and characterization of Y chromosome sequences of total length approaching 150 kb. We developed 11 Y-specific PCR markers that consistently yielded male-specific products in specimens from both laboratory colony and natural populations. The markers are characterized by low sequence polymorphism in samples collected across Africa and by presence in more than one copy on the Y. Screening of the A. gambiae BAC library using these markers allowed detection of 90 Y-linked BAC clones. Analysis of the BAC sequences and other Y-derived fragments showed massive accumulation of a few transposable elements. Nevertheless, more complex sequences are apparently present on the Y; these include portions of an approximately 48-kb-long unmapped AAAB01008227 scaffold from the whole genome shotgun assembly. Anopheles Y appears not to harbor any of the genes identified in Drosophila Y. However, experiments suggest that one of the ORFs from the AAAB01008227 scaffold represents a fragment of a gene with male-specific expression.

17.
18.
This article documents the whole genome sequence information of the Indian Zaprionus indianus, a member of the fruit fly family Drosophilidae. The sequences were generated on an Illumina platform, and the reads and whole genome sequence were submitted to the NCBI SRA and BioProject databases, respectively. This is the first Indian Z. indianus whole genome (draft) submitted to the sequence repository with SRA reads. The details of methodology, assembly statistics and functional annotation are presented in this work.

19.
RATT: Rapid Annotation Transfer Tool   Total citations: 1 (self-citations: 0; citations by others: 1)
Second-generation sequencing technologies have made large-scale sequencing projects commonplace. However, making use of these datasets often requires gene function to be ascribed genome wide. Although tool development has kept pace with the changes in sequence production, for tasks such as mapping, de novo assembly or visualization, genome annotation remains a challenge. We have developed a method to rapidly provide accurate annotation for new genomes using previously annotated genomes as a reference. The method, implemented in a tool called RATT (Rapid Annotation Transfer Tool), transfers annotations from a high-quality reference to a new genome on the basis of conserved synteny. We demonstrate that a Mycobacterium tuberculosis genome or a single 2.5 Mb chromosome from a malaria parasite can be annotated in less than five minutes with only modest computational resources. RATT is available at http://ratt.sourceforge.net.

20.