Similar articles
1.
Proteomic analysis of the EhV-86 virion (total citations: 1; self-citations: 0; citations by others: 1)

Background

Emiliania huxleyi virus 86 (EhV-86) is the type species of the genus Coccolithovirus within the family Phycodnaviridae. The fully sequenced 407,339 bp genome is predicted to encode 473 protein coding sequences (CDSs) and is the largest Phycodnaviridae sequenced to date. The majority of EhV-86 CDSs exhibit no similarity to proteins in the public databases.

Results

Proteomic analysis by 1-DE and then LC-MS/MS determined that the virion of EhV-86 is composed of at least 28 proteins, 23 of which are predicted to be membrane proteins. Besides the major capsid protein, putative function can be assigned to 4 other components of the virion: two lectin proteins, a thioredoxin and a serine/threonine protein kinase.

Conclusion

This study represents the first steps toward the identification of the protein components that make up the EhV-86 virion. Aside from the major capsid protein, whose function in the virion is well known and defined, the nature of the other proteins suggests roles involved with viral budding, caspase activation, signalling, anti-oxidation, virus adsorption and host range determination.

2.

Background

Complete genome annotation is a necessary tool as Anopheles gambiae researchers probe the biology of this potent malaria vector.

Results

We reannotate the A. gambiae genome by synthesizing comparative and ab initio sets of predicted coding sequences (CDSs) into a single set using an exon-gene-union algorithm followed by an open-reading-frame-selection algorithm. The reannotation predicts 20,970 CDSs supported by at least two lines of evidence, and it lowers the proportion of CDSs lacking start and/or stop codons to only approximately 4%. The reannotated CDS set includes a set of 4,681 novel CDSs not represented in the Ensembl annotation but with EST support, and another set of 4,031 Ensembl-supported genes that undergo major structural and, therefore, probably functional changes in the reannotated set. The quality and accuracy of the reannotation was assessed by comparison with end sequences from 20,249 full-length cDNA clones, and evaluation of mass spectrometry peptide hit rates from an A. gambiae shotgun proteomic dataset confirms that the reannotated CDSs offer a high quality protein database for proteomics. We provide a functional proteomics annotation, ReAnoXcel, obtained by analysis of the new CDSs through the AnoXcel pipeline, which allows functional comparisons of the CDS sets within the same bioinformatic platform. CDS data are available for download.

Conclusion

Comprehensive A. gambiae genome reannotation is achieved through a combination of comparative and ab initio gene prediction algorithms.

3.

Background

Laribacter hongkongensis is associated with community-acquired gastroenteritis and traveler's diarrhea. In this study, we performed an in-depth annotation of the genes in its genome related to the various steps in the infective process, drug resistance and mobile genetic elements.

Results

For acid and bile resistance, L. hongkongensis possessed a urease gene cassette, two arc gene clusters and bile salt efflux systems. For intestinal colonization, it possessed a putative adhesin of the autotransporter family homologous to those of diffusely adherent Escherichia coli (E. coli) and enterotoxigenic E. coli. To evade host defense, it possessed superoxide dismutase and catalases. For lipopolysaccharide biosynthesis, it possessed the same set of genes that encode enzymes for synthesizing lipid A, two Kdo units and heptose units as E. coli, but different genes for its symmetrical acylation pattern, and nine genes for polysaccharide side chain biosynthesis. It contained a number of CDSs that encode putative cell surface-acting cytotoxins (RTX toxin and hemolysins), intracellular cytotoxins (patatin-like proteins) and enzymes for invasion (outer membrane phospholipase A). It contained a broad variety of antibiotic resistance-related genes, including genes related to β-lactam (n = 10) and multidrug efflux (n = 54). It also contained eight prophages, 17 other phage-related CDSs and 26 CDSs for transposases.

Conclusions

The L. hongkongensis genome possessed genes for acid and bile resistance, intestinal mucosa colonization, evasion of host defense, and cytotoxicity and invasion. A broad variety of antibiotic resistance or multidrug resistance genes, a high number of prophages, other phage-related CDSs and CDSs for transposases were also identified.

4.

Background

Xanthomonas campestris pathovar campestris (Xcc) is the causal agent of black rot disease of crucifers worldwide. The molecular genetic diversity and host specificity of Xcc are poorly understood.

Results

We constructed a microarray based on the complete genome sequence of Xcc strain 8004 and investigated the genetic diversity and host specificity of Xcc by array-based comparative genome hybridization analyses of 18 virulent strains. The results demonstrate that a genetic core comprising 3,405 of the 4,186 coding sequences (CDSs) spotted on the array is conserved and that a flexible gene pool of 730 CDSs is absent/highly divergent (AHD). The results also revealed that 258 of the 304 proved/presumed pathogenicity genes are conserved and 46 are AHD. The conserved pathogenicity genes include mainly the genes involved in type I, II and III secretion systems, the quorum sensing system, extracellular enzymes and polysaccharide production, as well as many other proved pathogenicity genes, while the AHD CDSs contain the genes encoding the type IV secretion system (T4SS) and type III effectors. An Xcc T4SS-deletion mutant displayed the same virulence as wild type. Furthermore, three avirulence genes (avrXccC, avrXccE1 and avrBs1) were identified. avrXccC and avrXccE1 conferred avirulence on the hosts mustard cultivar Guangtou and Chinese cabbage cultivar Zhongbai-83, respectively, and avrBs1 conferred hypersensitive response on the nonhost pepper ECW10R.

Conclusion

About 80% of the Xcc CDSs, including 258 proved/presumed pathogenicity genes, are conserved in different strains. Xcc T4SS is not involved in pathogenicity. An efficient strategy to identify avr genes determining host specificity from the AHD genes was developed.
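
The "about 80%" figure can be checked against the counts given in the Results. The short Python sketch below only redoes that arithmetic; the variable names are illustrative and do not come from the paper.

```python
# Counts as reported in the abstract's Results section.
cds_on_array = 4186   # CDSs spotted on the microarray
cds_conserved = 3405  # CDSs in the conserved genetic core
cds_ahd = 730         # CDSs called absent/highly divergent (AHD)

pathogenicity_total = 304
pathogenicity_conserved = 258
pathogenicity_ahd = 46

conserved_fraction = cds_conserved / cds_on_array
ahd_fraction = cds_ahd / cds_on_array
# CDSs on the array that the abstract assigns to neither class.
unclassified = cds_on_array - cds_conserved - cds_ahd

print(f"conserved: {conserved_fraction:.1%}")
print(f"AHD: {ahd_fraction:.1%}")
print(f"neither conserved nor AHD: {unclassified}")
```

The conserved and AHD counts do not sum to the number of CDSs on the array, and the abstract does not say how the remaining CDSs were classified.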

5.
Endophytes are associated with the health and growth of plants. In this study, the endophytic bacterial strain RSE1 was isolated from seeds of the super hybrid rice Shenliangyou 5814 (Oryza sativa L.). Strain RSE1 was identified as Paenibacillus polymyxa by polyphasic taxonomy. An antagonism test against the rice false smut pathogen, Ustilaginoidea oryzae CICC 2710, showed that strain RSE1 had effective antagonistic activity against this pathogen. The draft genome of strain RSE1 was sequenced on an Illumina HiSeq 2000, and three CDSs for glucanase genes were annotated and correlated with antagonistic activity. Amplification with primers specific to biocontrol genes of the glucanase family detected the β-1,3-1,4-glucanase gene (gluB). This study lays a scientific foundation for developing and using bacterial biocontrol agents to prevent rice false smut.

6.
Wang X, Chen W, Huang Y, Sun J, Men J, Liu H, Luo F, Guo L, Lv X, Deng C, Zhou C, Fan Y, Li X, Huang L, Hu Y, Liang C, Hu X, Xu J, Yu X. Genome Biology 2011, 12(10): R107-14

Background

Clonorchis sinensis is a carcinogenic human liver fluke that is widespread in Asian countries. Increasing infection rates of this neglected tropical disease are leading to negative economic and public health consequences in affected regions. Experimental and epidemiological studies have shown a strong association between the incidence of cholangiocarcinoma and the infection rate of C. sinensis. To aid research into this organism, we have sequenced its genome.

Results

We combined de novo sequencing with computational techniques to provide new information about the biology of this liver fluke. The assembled genome has a total size of 516 Mb with a scaffold N50 length of 42 kb. Approximately 16,000 reliable protein-coding gene models were predicted. Genes for the complete pathways for glycolysis, the Krebs cycle and fatty acid metabolism were found, but key genes involved in fatty acid biosynthesis are missing from the genome, reflecting the parasitic lifestyle of a liver fluke that receives lipids from the bile of its host. We also identified pathogenic molecules that may contribute to liver fluke-induced hepatobiliary diseases. Large proteins such as multifunctional secreted proteases and tegumental proteins were identified as potential targets for the development of drugs and vaccines.
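
For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal Python sketch of how it is commonly computed. The function name and the toy scaffold lengths are hypothetical and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (lengths in kb), not data from the study.
toy_scaffolds_kb = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds_kb))
```

Read this way, a scaffold N50 of 42 kb means that at least half of the assembled bases lie in scaffolds of 42 kb or longer.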

Conclusions

This study provides valuable genomic information about the human liver fluke C. sinensis and adds to our knowledge on the biology of the parasite. The draft genome will serve as a platform to develop new strategies for parasite control.

7.

Background

The FVB/NJ mouse strain has its origins in a colony of outbred Swiss mice established in 1935 at the National Institutes of Health. Mice derived from this source were selectively bred for sensitivity to histamine diphosphate and the B strain of Friend leukemia virus. This led to the establishment of the FVB/N inbred strain, which was subsequently imported to the Jackson Laboratory and designated FVB/NJ. The FVB/NJ mouse has several distinct characteristics, such as large pronuclear morphology, vigorous reproductive performance, and consistently large litters that make it highly desirable for transgenic strain production and general purpose use.

Results

Using next-generation sequencing technology, we have sequenced the genome of FVB/NJ to approximately 50-fold coverage, and have generated a comprehensive catalog of single nucleotide polymorphisms, small insertion/deletion polymorphisms, and structural variants, relative to the reference C57BL/6J genome. We have examined a previously identified quantitative trait locus for atherosclerosis susceptibility on chromosome 10 and identified several previously unknown candidate causal variants.

Conclusion

The sequencing of the FVB/NJ genome and generation of this catalog has increased the number of known variant sites in FVB/NJ by a factor of four, and will help accelerate the identification of the precise molecular variants that are responsible for phenotypes observed in this widely used strain.

8.
A whole-genome assembly of the domestic cow, Bos taurus (total citations: 4; self-citations: 0; citations by others: 4)

Background

The genome of the domestic cow, Bos taurus, was sequenced using a mixture of hierarchical and whole-genome shotgun sequencing methods.

Results

We have assembled the 35 million sequence reads and applied a variety of assembly improvement techniques, creating an assembly of 2.86 billion base pairs that has multiple improvements over previous assemblies: it is more complete, covering more of the genome; thousands of gaps have been closed; many erroneous inversions, deletions, and translocations have been corrected; and thousands of single-nucleotide errors have been corrected. Our evaluation using independent metrics demonstrates that the resulting assembly is substantially more accurate and complete than alternative versions.

Conclusions

By using independent mapping data and conserved synteny between the cow and human genomes, we were able to construct an assembly with excellent large-scale contiguity in which a large majority (approximately 91%) of the genome has been placed onto the 30 B. taurus chromosomes. We constructed a new cow-human synteny map that expands upon previous maps. We also identified for the first time a portion of the B. taurus Y chromosome.

9.
10.

Background

Cryptococcus neoformans, a basidiomycetous fungus of universal occurrence, is a significant opportunistic human pathogen causing meningitis. Owing to an increase in the number of immunosuppressed individuals along with the emergence of drug-resistant strains, C. neoformans is gaining importance as a pathogen. Although whole-genome sequencing of three varieties of C. neoformans has been completed recently, no global proteomic studies have yet been reported.

Results

We performed a comprehensive proteomic analysis of C. neoformans var. grubii (Serotype A), which is the most virulent variety, in order to provide protein-level evidence for computationally predicted gene models and to refine the existing annotations. We confirmed the protein-coding potential of 3,674 genes from a total of 6,980 predicted protein-coding genes. We also identified 4 novel genes and corrected 104 predicted gene models. In addition, our studies led to the correction of the translational start sites, splice junctions and reading frames used for translation in a number of proteins. Finally, we validated a subset of our novel findings by RT-PCR and sequencing.

Conclusions

Proteogenomic investigation described here facilitated the validation and refinement of computationally derived gene models in the intron-rich genome of C. neoformans, an important fungal pathogen in humans.

11.

Background

Pantoea ananatis is found in a wide range of natural environments, including water, soil, the epi- and endophytic flora of various plant hosts, and the insect gut. Some strains have proven effective as biological control agents and plant-growth promoters, while other strains have been implicated in diseases of a broad range of plant hosts and humans. By analysing the pan-genome of eight sequenced P. ananatis strains isolated from different sources, we identified factors potentially underlying its ability to colonize and interact with hosts in both the plant and animal kingdoms.

Results

The pan-genome of the eight compared P. ananatis strains consisted of a core genome comprised of 3,876 protein coding sequences (CDSs) and a sizeable accessory genome consisting of 1,690 CDSs. We estimate that ~106 unique CDSs would be added to the pan-genome with each additional P. ananatis genome sequenced in the future. The accessory fraction is derived mainly from integrated prophages and codes mostly for proteins of unknown function. Comparison of the translated CDSs on the P. ananatis pan-genome with the proteins encoded on all sequenced bacterial genomes currently available revealed that P. ananatis carries a number of CDSs with orthologs restricted to bacteria associated with distinct hosts, namely plant-, animal- and insect-associated bacteria. These CDSs encode proteins with putative roles in transport and metabolism of carbohydrate and amino acid substrates, adherence to host tissues, protection against plant and animal defense mechanisms and the biosynthesis of potential pathogenicity determinants including insecticidal peptides, phytotoxins and type VI secretion system effectors.
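
The core/accessory split described above can be illustrated with plain set operations. This is a minimal sketch under stated assumptions: the strain names and CDS family identifiers are invented, and it assumes orthologous CDSs have already been grouped into families, which the abstract does not describe.

```python
# Hypothetical presence/absence data: strain name -> set of CDS family IDs.
# Strain and family names are invented for illustration.
strains = {
    "strain_A": {"f1", "f2", "f3", "f4"},
    "strain_B": {"f1", "f2", "f3", "f5"},
    "strain_C": {"f1", "f2", "f6"},
}


def partition_pan_genome(cds_by_strain):
    """Split the pan-genome into core (present in every strain) and accessory."""
    gene_sets = list(cds_by_strain.values())
    pan = set().union(*gene_sets)            # every family seen in any strain
    core = set.intersection(*gene_sets)      # families shared by all strains
    accessory = pan - core                   # everything else
    return pan, core, accessory


pan, core, accessory = partition_pan_genome(strains)
print(len(pan), len(core), len(accessory))
```

By the same logic, the counts in the abstract imply a pan-genome of 3,876 + 1,690 = 5,566 CDSs across the eight strains.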

Conclusions

P. ananatis has an ‘open’ pan-genome typical of bacterial species that colonize several different environments. The pan-genome incorporates a large number of genes encoding proteins that may enable P. ananatis to colonize, persist in and potentially cause disease symptoms in a wide range of plant and animal hosts.

Electronic supplementary material

The online version of this article (doi: 10.1186/1471-2164-15-404) contains supplementary material, which is available to authorized users.

12.

Background

Mongolian gerbils are a good model for mimicking the Helicobacter pylori-associated pathogenesis of the human stomach. In the current study, the gerbil-adapted strain B8 was completely sequenced, annotated and compared to previous genomes, including the 73 supercontigs of the parental strain B128.

Results

The complete genome of H. pylori B8 was manually curated gene by gene, to assign as much function as possible. It consists of a circular chromosome of 1,673,997 bp and of a small plasmid of 6,032 bp carrying nine putative genes. The chromosome contains 1,711 coding sequences, 293 of which are strain-specific, coding mainly for hypothetical proteins, and a large plasticity zone containing a putative type IV secretion system and coding sequences with unknown function. The cag pathogenicity island is rearranged such that the cagA gene is located 13,730 bp downstream of the inverted gene cluster cagB-cag1. Directly adjacent to the cagA gene, there are four hypothetical genes and one variable gene with a different codon usage compared to the rest of the H. pylori B8 genome. This indicates that these coding sequences might be acquired via horizontal gene transfer. The genome comparison of strain B8 to its parental strain B128 delivers 425 unique B8 proteins. Due to the fact that strain B128 was not fully sequenced and only automatically annotated, only 12 of these proteins are definitive singletons that might have been acquired during the gerbil-adaptation process of strain B128.

Conclusion

Our sequence data and its analysis provide new insight into the high genetic diversity of H. pylori strains. We have shown that the gerbil-adapted strain B8 has the potential to build, possibly by a high rate of mutation and recombination, a dynamic pool of genetic variants (e.g. fragmented genes and repetitive regions) required for the adaptation processes. We hypothesize that these variants are essential for the colonization and persistence of strain B8 in the gerbil stomach during inflammation.

13.
Erwinia amylovora causes the economically important disease fire blight that affects rosaceous plants, especially pear and apple. Here we report the complete genome sequence and annotation of strain ATCC 49946. The analysis of the sequence and its comparison with sequenced genomes of closely related enterobacteria revealed signs of pathoadaptation to rosaceous hosts.

Erwinia amylovora, a plant-associated member of the Enterobacteriaceae, causes fire blight, a devastating disease of rosaceous plants, especially pear and apple (6). The complete genome of Ea273 (ATCC 49946), a virulent strain isolated from an infected apple tree in New York State, was sequenced. Total DNA was extracted and prepared in pMAQ1 shotgun libraries. The complete shotgun sequence was obtained by using dye terminator chemistry in ABI 3730 automated sequencers and contains 88,457 reads (11.12-fold coverage), yielding a theoretical coverage of the genome of 99.99%. The sequence was assembled, finished, and annotated as described previously (1, 5), using Artemis (4) to collate data and facilitate annotation.

The genome of E. amylovora consists of a circular chromosome of 3,805,874 bp and two plasmids, AMYP1 (28,243 bp) and AMYP2 (71,487 bp). Coding regions in the chromosome account for 85.1% of the total sequence, with 3,483 identified coding sequences (CDSs). Two hundred fifty-four (7%) of the CDSs do not have any matches in current NCBI databases; 114 (3.3%) correspond to conserved hypothetical proteins. Forty-nine CDSs (1.4%) are similar to genes from mobile elements such as integrases, transposases, and bacteriophages, and 110 CDSs (3.2%) were classified as pseudogenes due to interruptions or truncations of the CDSs. The remaining 2,956 annotated CDSs include, among other categories, genes involved in biosynthesis of the cellular envelope and modifications of surface proteins (299 CDSs [11%]) and genes involved in signal transduction and regulation (228 CDSs [8%]). Seven rRNA operons and 78 tRNA sequences were identified in the chromosome; two new clusters were identified (AMY1550-1575 and AMY2648-2676) that resemble the T3SS-encoding SSR-1 island of Sodalis glossinidius (2), and four clusters that contain genes for biosynthesis of flagella, which based on their location might be regulated independently.

The smaller plasmid, AMYP1, had been reported as pEA29 (3); its sequence is nearly identical to the one reported here. The larger plasmid, AMYP2, renamed pEA72 for consistency in nomenclature, contains 87 predicted CDSs, with two predicted mobile-element-related CDSs and one pseudogene. Among the CDSs with annotated functions are a cluster of genes (AMYP2_49 to AMYP2_62) that encode a putative type IV fimbrial system (pil genes).

The genome of E. amylovora is only 3.8 Mb long, whereas most free-living enterobacteria, including plant pathogens, have genomes of 4.5 Mb to 5.5 Mb. Comparison of the genome of Ea273 with the sequenced genomes of 15 closely related enterobacteria identified 21 lineage-specific regions, which might be considered genomic islands. E. amylovora has many more predicted pseudogenes than other enterobacteria with similar lifestyles. Given its size and the preponderance of pseudogenes, genome reduction may have occurred via mutational inactivation and subsequent deletion, with the following consequence: E. amylovora has fewer genes involved in anaerobic respiration and fermentation than are found in typical related enterobacteria; this likely results in a reduced capacity to live in anaerobic environments.

The genome sequence of E. amylovora has revealed clear signs of pathoadaptation to the rosaceous plant environment. For example, T3SS-related proteins are present that are more similar to proteins of other plant pathogens than to proteins of closely related enterobacteria. These include type III effectors, homologous to those of plant-pathogenic pseudomonads, which confer virulence to E. amylovora in plants, and a sorbitol-metabolizing cluster that may confer a competitive advantage for survival in rosaceous plants. The reduced genome size and erosion or loss of genes involved in anaerobic respiration and nitrate assimilation are remarkable, relative to other plant- and animal-pathogenic members of the Enterobacteriaceae.
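
The quoted theoretical coverage can be related to the sequencing depth through the standard Lander-Waterman (Poisson) approximation, in which the expected fraction of bases covered by at least one read at mean depth c is 1 - e^(-c). The text does not say which formula the authors used, so the Python sketch below is an assumption offered for orientation; the function name is hypothetical.

```python
import math


def expected_covered_fraction(fold_coverage):
    """Expected fraction of bases covered by at least one read at the given
    mean depth, under the Lander-Waterman (Poisson) approximation."""
    return 1.0 - math.exp(-fold_coverage)


# 11.12 is the fold coverage reported in the text.
fraction = expected_covered_fraction(11.12)
print(f"{fraction:.4%}")
```

At 11.12-fold depth this approximation gives a covered fraction just above 99.99%, in line with the reported figure.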

14.

Background

Proteolytic Clostridium botulinum is the causative agent of botulism, a severe neuroparalytic illness. Given the severity of botulism, surprisingly little is known of the population structure, biology, phylogeny or evolution of C. botulinum. The recent determination of the genome sequence of C. botulinum has allowed comparative genomic indexing using a DNA microarray.

Results

Whole genome microarray analysis revealed that 63% of the coding sequences (CDSs) present in reference strain ATCC 3502 were common to all 61 widely-representative strains of proteolytic C. botulinum and the closely related C. sporogenes tested. This indicates a relatively stable genome. There was, however, evidence for recombination and genetic exchange, in particular within the neurotoxin gene and cluster (including transfer of neurotoxin genes to C. sporogenes), and the flagellar glycosylation island (FGI). These two loci appear to have evolved independently from each other, and from the remainder of the genetic complement. A number of strains were atypical; for example, while 10 out of 14 strains that formed type A1 toxin gave almost identical profiles in whole genome, neurotoxin cluster and FGI analyses, the other four strains showed divergent properties. Furthermore, a new neurotoxin sub-type (A5) has been discovered in strains from heroin-associated wound botulism cases. For the first time, differences in glycosylation profiles of the flagella could be linked to differences in the gene content of the FGI.

Conclusion

Proteolytic C. botulinum has a stable genome backbone containing specific regions of genetic heterogeneity. These include the neurotoxin gene cluster and the FGI, each having evolved independently of each other and the remainder of the genetic complement. Analysis of these genetic components provides a high degree of discrimination of strains of proteolytic C. botulinum, and is suitable for clinical and forensic investigations of botulism outbreaks.

15.

Background

Acinetobacter baumannii is an important nosocomial pathogen that can develop multidrug resistance. In this study, we characterized the genome of the A. baumannii strain DMS06669 (isolated from the sputum of a male patient with hospital-acquired pneumonia) and focused on identification of genes relevant to antibiotic resistance.

Methods

Whole-genome analysis of A. baumannii DMS06669, isolated from a patient with hospital-acquired pneumonia, included de novo assembly, gene prediction, functional annotation against public databases, phylogenetic tree construction and identification of antibiotic resistance genes.

Results

After sequencing the A. baumannii DMS06669 genome and performing quality control, de novo genome assembly was carried out, producing 24 scaffolds. Public databases were used for gene prediction and functional annotation to construct a phylogenetic tree of the DMS06669 strain with 21 other A. baumannii strains. A total of 18 possible antibiotic resistance genes, conferring resistance to eight distinct classes of antibiotics, were identified. Eight of these genes have not previously been reported to occur in A. baumannii.

Conclusions

Our results provide important information regarding mechanisms that may contribute to antibiotic resistance in the DMS06669 strain, and have implications for treatment of patients infected with A. baumannii.

16.

Background

Recent phylogenetic analyses have identified Amborella trichopoda, an understory tree species endemic to the forests of New Caledonia, as sister to a clade including all other known flowering plant species. The Amborella genome is a unique reference for understanding the evolution of angiosperm genomes because it can serve as an outgroup to root comparative analyses. A physical map, BAC end sequences and sample shotgun sequences provide a first view of the 870 Mbp Amborella genome.

Results

Analysis of Amborella BAC ends sequenced from each contig suggests that the density of long terminal repeat retrotransposons is negatively correlated with that of protein coding genes. Syntenic, presumably ancestral, gene blocks were identified in comparisons of the Amborella BAC contigs and the sequenced Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera and Oryza sativa genomes. Parsimony mapping of the loss of synteny corroborates previous analyses suggesting that the rate of structural change has been more rapid on lineages leading to Arabidopsis and Oryza compared with lineages leading to Populus and Vitis. The gamma paleohexaploidy event identified in the Arabidopsis, Populus and Vitis genomes is shown to have occurred after the divergence of all other known angiosperms from the lineage leading to Amborella.

Conclusions

When placed in the context of a physical map, BAC end sequences representing just 5.4% of the Amborella genome have facilitated reconstruction of gene blocks that existed in the last common ancestor of all flowering plants. The Amborella genome is an invaluable reference for inferences concerning the ancestral angiosperm and subsequent genome evolution.

17.
The gram-negative anaerobic bacterium Porphyromonas gingivalis is a major causative agent of chronic periodontitis. Porphyromonas gingivalis strains have been classified into virulent and less-virulent strains by mouse subcutaneous soft tissue abscess model analysis. Here, we present the whole genome sequence of P. gingivalis ATCC 33277, which is classified as a less-virulent strain. We identified 2090 protein-coding sequences (CDSs), 4 RNA operons, and 53 tRNA genes in the ATCC 33277 genome. By genomic comparison with the virulent strain W83, we identified 461 ATCC 33277-specific and 415 W83-specific CDSs. Extensive genomic rearrangements were observed between the two strains: 175 regions in which genomic rearrangements have occurred were identified. Thirty-five of those genomic rearrangements were inversions or translocations and 140 were simple insertions, deletions, or replacements. Both strains contained large numbers of mobile elements, such as insertion sequences, miniature inverted-repeat transposable elements (MITEs), and conjugative transposons, which are frequently associated with genomic rearrangements. These findings indicate that the mobile genetic elements have been deeply involved in the extensive genome rearrangement of P. gingivalis and the occurrence of many of the strain-specific CDSs. We also describe here a unique feature of MITE400, which we renamed MITEPgRS (MITE of P. gingivalis with Repeating Sequences).

Key words: Porphyromonas gingivalis, whole genome sequence, genome rearrangement, conjugative transposon, MITE

18.

Background

Knowledge of the origins, distribution, and inheritance of variation in the malaria parasite (Plasmodium falciparum) genome is crucial for understanding its evolution; however the 81% (A+T) genome poses challenges to high-throughput sequencing technologies. We explore the viability of the Roche 454 Genome Sequencer FLX (GS FLX) high throughput sequencing technology for both whole genome sequencing and fine-resolution characterization of genetic exchange in malaria parasites.

Results

We present a scheme to survey recombination in the haploid stage genomes of two sibling parasite clones, using whole genome pyrosequencing that includes a sliding window approach to predict recombination breakpoints. Whole genome shotgun (WGS) sequencing generated approximately 2 million reads, with an average read length of approximately 300 bp. De novo assembly using a combination of WGS and 3 kb paired end libraries resulted in contigs ≤ 34 kb. More than 8,000 of the 24,599 SNP markers identified between parents were genotyped in the progeny, resulting in a marker density of approximately 1 marker/3.3 kb and allowing for the detection of previously unrecognized crossovers (COs) and many non crossover (NCO) gene conversions throughout the genome.
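
The abstract mentions a sliding window approach for predicting recombination breakpoints but gives no details. The Python sketch below shows one simple way such a scan could work, assuming each SNP marker in a progeny clone has already been assigned to parent "A" or "B". The function names, window size and toy data are all hypothetical, and this should not be read as the authors' actual method.

```python
from collections import Counter


def window_calls(alleles, window=5):
    """Majority parental allele in each full-length sliding window.

    Use an odd window so that a two-parent vote can never tie.
    """
    calls = []
    for start in range(len(alleles) - window + 1):
        counts = Counter(alleles[start:start + window])
        calls.append(counts.most_common(1)[0][0])
    return calls


def predict_breakpoints(positions, alleles, window=5):
    """Return (left_bp, right_bp) intervals where the window majority switches."""
    calls = window_calls(alleles, window)
    half = window // 2
    breakpoints = []
    for i in range(1, len(calls)):
        if calls[i] != calls[i - 1]:
            # Window i is centred on marker i + half, so the switch lies
            # between the centre markers of windows i - 1 and i.
            breakpoints.append((positions[i - 1 + half], positions[i + half]))
    return breakpoints


# Toy data: 12 markers spaced 3.3 kb apart, one isolated miscall at index 2,
# and a true parental switch between markers 5 and 6.
positions = [i * 3300 for i in range(12)]
alleles = list("AABAAABBBBBB")
print(predict_breakpoints(positions, alleles))
```

The majority vote that suppresses isolated miscalls would also smooth away short non-crossover conversion tracts, so a real analysis would need to handle those separately.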

Conclusions

By sequencing the 23 Mb genomes of two haploid progeny clones derived from a genetic cross at more than 30× coverage, we captured high resolution information on COs, NCOs and genetic variation within the progeny genomes. This study is the first to resequence progeny clones to examine fine structure of COs and NCOs in malaria parasites.

19.
20.

Background

Despite the continuous production of genome sequence for a number of organisms, reliable, comprehensive, and cost effective gene prediction remains problematic. This is particularly true for genomes for which there is not a large collection of known gene sequences, such as the recently published chicken genome. We used the chicken sequence to test comparative and homology-based gene-finding methods followed by experimental validation as an effective genome annotation method.

Results

We performed experimental evaluation by RT-PCR of three different computational gene finders, Ensembl, SGP2 and TWINSCAN, applied to the chicken genome. A Venn diagram was computed and each component of it was evaluated. The results showed that de novo comparative methods can identify up to about 700 chicken genes with no previous evidence of expression, and can correctly extend about 40% of homology-based predictions at the 5' end.

Conclusions

De novo comparative gene prediction followed by experimental verification is effective at enhancing the annotation of the newly sequenced genomes provided by standard homology-based methods.
