Similar literature
 Found 20 similar articles (search time: 203 ms)
1.
Next-generation pyrosequencing of high G + C content genomes still poses problems for automated sequencing and assembly processes, which necessitates cost- and time-intensive manual work in order to finish such genomes completely. The sequencing of the high G + C actinomycete Actinoplanes sp. SE50/110 was performed with standard pyrosequencing technology (454 Life Sciences) and revealed a high number of gaps. The reasons for the introduction of gaps were analyzed on a previously known 41 kb long DNA reference sequence from Actinoplanes sp. SE50/110, hosting the acarbose biosynthesis gene cluster. Mapping of the sequencing results on the reference gene cluster sequence revealed a fragmentation into 30 contiguous sequences of different lengths. The gaps between these sequences were characterized by extremely low read coverage, which correlated strongly and negatively with the G + C content in the gap regions. Furthermore, the gap sequences contained strong stem-loop structures which hindered the amplification of these sequences during the emulsion PCR. Being significantly underrepresented or absent in the subsequent sequencing process, these sequences lead to weakly covered or uncovered genomic regions, which force the assembly algorithm to output multiple contiguous sequences instead of one finished genome. However, by applying a different pyrosequencing protocol, it was possible to sequence the complete acarbose biosynthesis gene cluster. The changes to the protocol include longer read length and the addition of chemicals to the amplification chemistry, which reduce the self-annealing of DNA fragments during the amplification process and enable the complete reconstruction of high G + C content genomes without manual intervention.
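
The relationship described in this abstract (read coverage falling as G + C content rises) can be explored with a windowed G + C calculation and a correlation coefficient. The following is a minimal sketch under stated assumptions: the function names, the window size and the toy sequence and coverage values are all hypothetical and are not taken from the study.

```python
from math import sqrt


def gc_fraction(seq):
    """Return the fraction of G and C bases in seq (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window, step):
    """Return (start, GC fraction) for each full window along seq."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Hypothetical toy data, not taken from the study.
toy_reference = "ATATATAT" + "GCGCGCGC" + "ATGCATGC"
toy_coverage = [30, 2, 15]  # invented mean read depth per window

windows = windowed_gc(toy_reference, window=8, step=8)
gc_values = [gc for _, gc in windows]
r = pearson(gc_values, toy_coverage)
print(windows)
print(f"r = {r:.3f}")  # strongly negative for this invented data
```

With real data, the coverage per window would come from a read-mapping tool, and the window size would be chosen to match the read length.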

2.
Component C (acarviosyl-1,4-Glc-1,1-Glc) is an acarbose analog of highly similar structure that can be formed in large amounts during acarbose fermentation, making acarbose purification highly difficult. By choosing osmolality level as the key fermentation parameter of acarbose-producing Actinoplanes sp. A56, this paper established an effective and simplified osmolality-shift strategy to improve acarbose production and concurrently reduce component C formation. First, the effects of various osmolality levels on acarbose fermentation were investigated in a 50-L fermenter. It was found that an osmolality of 400–500 mOsm/kg was favorable for acarbose biosynthesis, but exerted a negative influence on the metabolic activity of Actinoplanes sp. A56, resulting in a marked net decrease in acarbose and a sharp increase in component C formation during the later stages of fermentation (144–168 h). Based on this finding, an osmolality-shift fermentation strategy (0–48 h: 250–300 mOsm/kg; 49–120 h: 450–500 mOsm/kg; 121–168 h: 250–300 mOsm/kg) was carried out. Compared with the osmolality-stat (450–500 mOsm/kg) fermentation process, the final accumulation of component C decreased from 498.2 ± 27.1 to 307.2 ± 9.5 mg/L, and the maximum acarbose yield increased from 3,431.9 ± 107.7 to 4,132.8 ± 111.4 mg/L.

3.

Background

There is a need to characterize genomes of the foodborne pathogen, Salmonella enterica serovar Enteritidis (SE) and identify genetic information that could be ultimately deployed for differentiating strains of the organism, a need that is yet to be addressed mainly because of the high degree of clonality of the organism. In an effort to achieve the first characterization of the genomes of SE of Canadian origin, we carried out massively parallel sequencing of the nucleotide sequence of 11 SE isolates obtained from poultry production environments (n = 9), a clam and a chicken, assembled finished genomes and investigated diversity of the SE genome.

Results

The median genome size was 4,678,683 bp. A total of 4,833 chromosomal genes defined the pan genome of our field SE isolates, consisting of 4,600 genes present in all the genomes (the core genome) and 233 genes absent in at least one genome (the accessory genome). Genome diversity was demonstrable by the presence of 1,360 loci showing single nucleotide polymorphisms (SNPs) in the core genome, which were used to portray the genetic distances among the SE isolates by means of a phylogenetic tree. The accessory genome consisted mostly of previously identified SE prophage sequences as well as two apparently full-sized novel prophages, namely a 28 kb sequence provisionally designated as SE-OLF-10058 (3) prophage and a 43 kb sequence provisionally designated as SE-OLF-10012 prophage.
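
The pan, core and accessory genome definitions used here map directly onto set operations: the pan genome is the union of all gene sets, the core genome is their intersection, and the accessory genome is the difference. The sketch below illustrates this with invented isolate and gene names; it is not the authors' pipeline, which the abstract does not describe.

```python
def partition_pan_genome(genomes):
    """Split gene content into pan, core and accessory sets.

    genomes: dict mapping an isolate name to the set of gene
    identifiers found in that isolate.
    """
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)            # present in at least one genome
    core = set.intersection(*gene_sets)      # present in every genome
    accessory = pan - core                   # absent from at least one genome
    return pan, core, accessory


# Hypothetical toy isolates and gene names, for illustration only.
toy_genomes = {
    "isolate_1": {"geneA", "geneB", "geneC"},
    "isolate_2": {"geneA", "geneB", "geneD"},
    "isolate_3": {"geneA", "geneB"},
}

pan, core, accessory = partition_pan_genome(toy_genomes)
print(len(pan), len(core), len(accessory))
```

By construction the core and accessory sets partition the pan genome, which matches the reported counts (4,600 + 233 = 4,833).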

Conclusions

The number of SNPs identified in the relatively large core genome of SE is a reflection of substantial diversity that could be exploited for strain differentiation, as shown by the development of an informative phylogenetic tree. Prophage sequences can also be exploited for SE strain differentiation and lineage tracking. This work has laid the groundwork for further studies to develop a readily adoptable laboratory test for the subtyping of SE.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-713) contains supplementary material, which is available to authorized users.

4.

Background

Next Generation DNA Sequencing (NGS) and genome mining of actinomycetes and other microorganisms is currently one of the most promising strategies for the discovery of novel bioactive natural products, potentially revealing novel chemistry and enzymology involved in their biosynthesis. This approach also allows rapid insights into the biosynthetic potential of microorganisms isolated from unexploited habitats and ecosystems, which in many cases may prove difficult to culture and manipulate in the laboratory. Streptomyces leeuwenhoekii (formerly Streptomyces sp. strain C34) was isolated from the hyper-arid high-altitude Atacama Desert in Chile and shown to produce novel polyketide antibiotics.

Results

Here we present the de novo sequencing of the S. leeuwenhoekii linear chromosome (8 Mb) and two extrachromosomal replicons, the circular pSLE1 (86 kb) and the linear pSLE2 (132 kb), all in single contigs, obtained by combining Pacific Biosciences SMRT (PacBio) and Illumina MiSeq technologies. We identified the biosynthetic gene clusters for chaxamycin, chaxalactin, hygromycin A and desferrioxamine E, metabolites all previously shown to be produced by this strain (J Nat Prod, 2011, 74:1965) and an additional 31 putative gene clusters for specialised metabolites. As well as gene clusters for polyketides and non-ribosomal peptides, we also identified three gene clusters encoding novel lasso-peptides.

Conclusions

The S. leeuwenhoekii genome contains 35 gene clusters apparently encoding the biosynthesis of specialised metabolites, most of them completely novel and uncharacterised. This project has served to evaluate the current state of NGS for efficient and effective genome mining of high GC actinomycetes. The PacBio technology now permits the assembly of actinomycete replicons into single contigs with >99 % accuracy. The assembled Illumina sequence permitted not only the correction of omissions found in GC homopolymers in the PacBio assembly (exacerbated by the high GC content of actinomycete DNA) but it also allowed us to obtain the sequences of the termini of the chromosome and of a linear plasmid that were not assembled by PacBio. We propose an experimental pipeline that uses the Illumina assembled contigs, in addition to just the reads, to complement the current limitations of the PacBio sequencing technology and assembly software.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1652-8) contains supplementary material, which is available to authorized users.

5.
Solute receptors (binding proteins) are indispensable components of canonical ATP-binding cassette importers in prokaryotes. Here, we report on the characterization and crystal structures in the closed and open conformations of AcbH, the solute receptor of the putative carbohydrate transporter AcbFG which is encoded in the acarbose (acarviosyl-1,4-maltose) biosynthetic gene cluster from Actinoplanes sp. SE50/110. Binding assays identified AcbH as a high-affinity monosaccharide-binding protein with a dissociation constant (Kd) for β-d-galactopyranose of 9.8 ± 1.0 nM. Neither galactose-containing di- and trisaccharides, such as lactose and raffinose, nor monosaccharides including d-galacturonic acid, l-arabinose, d-xylose and l-rhamnose competed with [14C]galactose for binding to AcbH. Moreover, AcbH does not bind d-glucose, which is a common property of all but one d-galactose-binding proteins characterized to date. Strikingly, determination of the X-ray structure revealed that AcbH is structurally homologous to maltose-binding proteins rather than to glucose-binding proteins. Two helices are inserted in the substrate-binding pocket, which reduces the cavity size and allows the exclusive binding of monosaccharides, specifically β-d-galactopyranose, in the 4C1 conformation. Site-directed mutagenesis of three residues from the binding pocket (Arg82, Asp361 and Arg362) that interact with the axially oriented O4-H hydroxyl of the bound galactopyranose and subsequent functional analysis indicated that these residues are crucial for galactose binding. To our knowledge, this is the first report of the tertiary structure of a solute receptor with exclusive affinity for β-d-galactopyranose. The putative role of a galactose import system in the context of acarbose metabolism in Actinoplanes sp. is discussed.

6.
Abstract

The α-glucosidase inhibitor acarbose produced by Actinoplanes sp. SE50/110 is a pseudotetrasaccharide, which consists of an unsaturated cyclitol (carba-sugar), 4-amino-4,6-dideoxyglucose and maltose. The cyclitol (valienol) and the 4-amino-4,6-dideoxyglucose are linked via an N-glycosidic (imino) bond, forming the so-called acarviosyl moiety, which is primarily responsible for the inhibitory effect on α-glucosidases. The gene cluster encoding the biosynthetic genes for the synthesis of acarbose (acb-genes) was sequenced and 25 open reading frames belonging to the acb-gene cluster were identified. Based on the analysis of the enzymes encoded by the acb-cluster, the biosynthesis and ecological role of acarbose are described. The gene cluster includes genes which encode: proteins for the synthesis of the cyclitol; the enzymes for the synthesis of dTDP-4-amino-4,6-dideoxyglucose; glycosyltransferases for the condensation reactions; ATP-dependent exporters and importers; extracellular starch degrading enzymes; and intracellular acarbose modifying enzymes. Acarbose has a dual role for the producer: it inhibits α-glucosidic enzymes of competitors and functions as a carbophor for the uptake of glucose or starch molecules.

7.

Background

Information transfer systems in Archaea, including many components of the DNA replication machinery, are similar to those found in eukaryotes. Functional assignments of archaeal DNA replication genes have been primarily based upon sequence homology and biochemical studies of replisome components, but few genetic studies have been conducted thus far. We have developed a tractable genetic system for knockout analysis of genes in the model halophilic archaeon, Halobacterium sp. NRC-1, and used it to determine which DNA replication genes are essential.

Results

Using a directed in-frame gene knockout method in Halobacterium sp. NRC-1, we examined nineteen genes predicted to be involved in DNA replication. Preliminary bioinformatic analysis of the large haloarchaeal Orc/Cdc6 family, related to eukaryotic Orc1 and Cdc6, showed five distinct clades of Orc/Cdc6 proteins conserved in all sequenced haloarchaea. Of ten orc/cdc6 genes in Halobacterium sp. NRC-1, only two were found to be essential, orc10, on the large chromosome, and orc2, on the minichromosome, pNRC200. Of the three replicative-type DNA polymerase genes, two were essential: the chromosomally encoded B family, polB1, and the chromosomally encoded euryarchaeal-specific D family, polD1/D2 (formerly called polA1/polA2 in the Halobacterium sp. NRC-1 genome sequence). The pNRC200-encoded B family polymerase, polB2, was non-essential. Accessory genes for DNA replication initiation and elongation factors, including the putative replicative helicase, mcm, the eukaryotic-type DNA primase, pri1/pri2, the DNA polymerase sliding clamp, pcn, and the flap endonuclease, rad2, were all essential. Targeted genes were classified as non-essential if knockouts were obtained and essential based on statistical analysis and/or by demonstrating the inability to isolate chromosomal knockouts except in the presence of a complementing plasmid copy of the gene.

Conclusion

The results showed that ten out of nineteen eukaryotic-type DNA replication genes are essential for Halobacterium sp. NRC-1, consistent with their requirement for DNA replication. The essential genes code for two of ten Orc/Cdc6 proteins, two out of three DNA polymerases, the MCM helicase, two DNA primase subunits, the DNA polymerase sliding clamp, and the flap endonuclease.

8.

Background

Plasmodium chabaudi chabaudi can be considered as a rodent model of human malaria parasites in the genetic analysis of important characters such as drug resistance and immunity. Despite the availability of some genome sequence data, an extensive genetic linkage map is needed for mapping the genes involved in certain traits.

Methods

The inheritance of 672 Amplified Fragment Length Polymorphism (AFLP) markers from two parental clones (AS and AJ) of P. c. chabaudi was determined in 28 independent recombinant progeny clones. These AFLP markers and 42 previously mapped Restriction Fragment Length Polymorphism (RFLP) markers (used as chromosomal anchors) were organized into linkage groups using Map Manager software.

Results

A total of 614 AFLP markers formed linkage groups assigned to 10 of 14 chromosomes, and 12 other linkage groups not assigned to known chromosomes. The genetic length of the genome was estimated to be about 1676 centiMorgans (cM). The mean map unit size was estimated to be 13.7 kb/cM. This was slightly less than previous estimates for the human malaria parasite, Plasmodium falciparum.
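
As a quick consistency check on the two estimates above, multiplying the genetic length by the mean map unit size gives the physical span they jointly imply. This is a sketch only: the function name is hypothetical, and the abstract does not state which physical genome size the authors used to derive 13.7 kb/cM.

```python
def implied_physical_size_kb(genetic_length_cm, kb_per_cm):
    """Physical span (kb) implied by a genetic length and a kb/cM ratio."""
    return genetic_length_cm * kb_per_cm


# Values reported in the abstract.
size_kb = implied_physical_size_kb(1676, 13.7)
print(f"{size_kb:.1f} kb (about {size_kb / 1000:.1f} Mb)")
```

The product is roughly 23 Mb, which is the physical size consistent with the two reported figures.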

Conclusion

The P. c. chabaudi genetic linkage map presented here is the most extensive and highly resolved so far available for this species. It can be used in conjunction with the genome databases of P. c. chabaudi, P. falciparum and Plasmodium yoelii to identify genes underlying important phenotypes such as drug resistance and strain-specific immunity.

9.

Background

Lactobacillus hokkaidonensis is an obligate heterofermentative lactic acid bacterium, which was isolated from Timothy grass silage in Hokkaido, a subarctic region of Japan. This bacterium is expected to be useful as a silage starter culture in cold regions because of its remarkable psychrotolerance; it can grow at temperatures as low as 4°C. To elucidate its genetic background, particularly in relation to the source of psychrotolerance, we constructed the complete genome sequence of L. hokkaidonensis LOOC260T using PacBio single-molecule real-time sequencing technology.

Results

The genome of LOOC260T comprises one circular chromosome (2.28 Mbp) and two circular plasmids: pLOOC260-1 (81.6 kbp) and pLOOC260-2 (41.0 kbp). We identified diverse mobile genetic elements, such as prophages, integrated and conjugative elements, and conjugative plasmids, which may reflect adaptation to plant-associated niches. Comparative genome analysis also detected unique genomic features, such as genes involved in pentose assimilation and NADPH generation.

Conclusions

This is the first complete genome in the L. vaccinostercus group, which is poorly characterized, so the genomic information obtained in this study provides insight into the genetics and evolution of this group. We also found several factors that may contribute to the ability of L. hokkaidonensis to grow at cold temperatures. The results of this study will facilitate further investigation of the cold-tolerance mechanism of L. hokkaidonensis.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1435-2) contains supplementary material, which is available to authorized users.

10.
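[Objective] To identify regulators of acarbose biosynthesis in Actinoplanes sp. SE50/110 and to improve acarbose yield. [Methods] First, DNA affinity chromatography was used to capture regulatory proteins that bind the two bidirectional promoter regions of the acarbose biosynthetic gene cluster. The genes encoding these regulatory proteins were then overexpressed or knocked out in the acarbose-producing strain QQ-2 for in vivo functional verification. In parallel, soluble proteins were obtained by heterologous expression in Escherichia coli BL21(DE3), and their ability to bind the promoter regions was verified by electrophoretic mobility shift assays (EMSA). [Results] DNA affinity chromatography followed by protein mass spectrometry captured nine regulatory proteins that bind the bidirectional promoters PWV and PAB. When each of these nine regulatory genes was overexpressed or deleted in QQ-2, overexpression of ACPL_1889 increased acarbose yield by 25%, whereas its deletion decreased yield by 22%; overexpression of ACPL_5445 and ACPL_3989 decreased acarbose yield by 12% and 39%, respectively, whereas deletion of these two genes increased yield by 15% and 8%, respectively. Analysis of the transcript levels of the acarbose biosynthetic genes showed that overexpression of ACPL_1889 raised the transcript levels of acbA, acbB, acbW and acbV, whereas its deletion lowered the transcript levels of these four genes; knockout of ACPL_5445 raised the transcript levels of all four genes; overexpression of ACPL_3989 lowered the transcript levels of all four genes, whereas its knockout raised the transcript levels of acbW and acbA by about 100-fold and 40-fold, respectively. In EMSA, both ACPL_1889 and ACPL_3989 bound the promoter regions of the acb gene cluster. Finally, combining overexpression of the positive regulator gene with knockout of the negative regulator genes increased acarbose yield by 32%. [Conclusion] This study identified nine regulatory proteins that bind the promoter regions of the acarbose biosynthetic gene cluster, and in vivo and in vitro experiments showed that ACPL_1889 is a positive regulator of acarbose biosynthesis, whereas ACPL_5445 and ACPL_3989 are negative regulators. These findings lay a foundation for elucidating the transcriptional regulation of acarbose biosynthesis, and engineering of these regulatory genes significantly increased acarbose yield.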

11.

Background

The large number of genetic linkage maps representing Brassica chromosomes constitute a potential platform for studying crop traits and genome evolution within Brassicaceae. However, the alignment of existing maps remains a major challenge. The integration of these genetic maps will enhance genetic resolution, and provide a means to navigate between sequence-tagged loci, and with contiguous genome sequences as these become available.

Results

We report the first genome-wide integration of Brassica maps based on an automated pipeline which involved collation of genome-wide genotype data for sequence-tagged markers scored on three extensively used amphidiploid Brassica napus (2n = 38) populations. Representative markers were selected from consolidated maps for each population, and skeleton bin maps were generated. The skeleton maps for the three populations were then combined to generate an integrated map for each linkage group (LG), comparing two different approaches, one encapsulated in JoinMap and the other in MergeMap. The BnaWAIT_01_2010a integrated genetic map was generated using JoinMap, and includes 5,162 genetic markers mapped onto 2,196 loci, with a total genetic length of 1,792 cM. The map density of one locus every 0.82 cM, corresponding to 515 kbp, represents at least a three-fold increase in locus and marker density over the original maps. Within the B. napus integrated map we identified 103 conserved collinearity blocks relative to Arabidopsis, including five previously unreported blocks. The BnaWAIT_01_2010a map was used to investigate the integrity and conservation of order proposed for genome sequence scaffolds generated from the constituent A genome of Brassica rapa.
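
The reported map density follows from dividing the total genetic length by the number of loci. The sketch below reproduces that arithmetic; the function name is hypothetical, and the implied physical span is simply the product of two reported figures, not a genome size stated in the abstract.

```python
def map_density_cm_per_locus(total_cm, n_loci):
    """Average genetic distance (cM) between mapped loci."""
    return total_cm / n_loci


# Values reported in the abstract.
density = map_density_cm_per_locus(1792, 2196)
print(f"one locus every {density:.2f} cM")

# Physical span implied by 2,196 loci at one locus per 515 kbp.
implied_span_kbp = 2196 * 515
print(f"implied span: about {implied_span_kbp / 1000:.0f} Mbp")
```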

Conclusions

Our results provide a comprehensive genetic integration of the B. napus genome from a range of sources, which we anticipate will provide valuable information for rapeseed and Canola research.

12.

Background

Cynomolgus macaques (Macaca fascicularis) are a valuable resource for linkage studies of genetic disorders, but their microsatellite markers are not sufficient. In genetic studies, a prerequisite for mapping genes is development of a genome-wide set of microsatellite markers in target organisms. A whole genome sequence and its annotation also facilitate identification of markers for causative mutations. The aim of this study is to establish hundreds of microsatellite markers and to develop an integrative cynomolgus macaque genome database with a variety of datasets including marker and gene information that will be useful for further genetic analyses in this species.

Results

We investigated the level of polymorphisms in cynomolgus monkeys for 671 microsatellite markers that are covered by our established Bacterial Artificial Chromosome (BAC) clones. Four hundred and ninety-nine (74.4%) of the markers were found to be polymorphic using standard PCR analysis. The average number of alleles and average expected heterozygosity at these polymorphic loci in ten cynomolgus macaques were 8.20 and 0.75, respectively.
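
Expected heterozygosity at a locus is commonly computed from allele frequencies as one minus the sum of squared frequencies. The sketch below uses that plain form with invented allele calls; the abstract does not say which estimator the authors used (a small-sample correction is also common), so treat this as an assumption.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Return 1 - sum(p_i^2) over the allele frequencies at one locus."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical allele calls (fragment sizes) for ten diploid animals,
# i.e. 20 allele copies at a single invented locus.
toy_alleles = ["152"] * 10 + ["156"] * 6 + ["160"] * 4
he = expected_heterozygosity(toy_alleles)
print(f"He = {he:.2f}")

# Share of polymorphic markers reported in the abstract.
polymorphic_pct = 100 * 499 / 671
print(f"{polymorphic_pct:.1f}% polymorphic")
```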

Conclusion

BAC clones and novel microsatellite markers were assigned to the rhesus genome sequence and linked with our cynomolgus macaque cDNA database (QFbase). Our novel microsatellite marker set and genomic database will be valuable integrative resources in analyzing genetic disorders in cynomolgus macaques.

13.
Ramoplanins produced by Actinoplanes are a new structural class of lipopeptides and are currently in phase III clinical trials for the prevention of vancomycin-resistant enterococcal infections. The depsipeptide structures of ramoplanins are synthesized by non-ribosomal peptide synthetases (NRPS). Romo-orf17, a stand-alone NRPS, is responsible for the recruitment of Thr into the linear NRPS pathways for which the corresponding adenylation domain is absent. Here, systematic gene inactivation and complementation were carried out in an Actinoplanes sp. using homologous recombination and site-specific integration methods. A hybrid gene coding for the N-terminal region of the stand-alone NRPS and the A-PCP domains of a heterologous NRPS restored production of ramoplanins. The results elucidate the unusual N-terminal region, which is essential for the biosynthesis of ramoplanins.

14.

Background

Proteolytic Clostridium botulinum is the causative agent of botulism, a severe neuroparalytic illness. Given the severity of botulism, surprisingly little is known of the population structure, biology, phylogeny or evolution of C. botulinum. The recent determination of the genome sequence of C. botulinum has allowed comparative genomic indexing using a DNA microarray.

Results

Whole genome microarray analysis revealed that 63% of the coding sequences (CDSs) present in reference strain ATCC 3502 were common to all 61 widely-representative strains of proteolytic C. botulinum and the closely related C. sporogenes tested. This indicates a relatively stable genome. There was, however, evidence for recombination and genetic exchange, in particular within the neurotoxin gene and cluster (including transfer of neurotoxin genes to C. sporogenes), and the flagellar glycosylation island (FGI). These two loci appear to have evolved independently from each other, and from the remainder of the genetic complement. A number of strains were atypical; for example, while 10 out of 14 strains that formed type A1 toxin gave almost identical profiles in whole genome, neurotoxin cluster and FGI analyses, the other four strains showed divergent properties. Furthermore, a new neurotoxin sub-type (A5) has been discovered in strains from heroin-associated wound botulism cases. For the first time, differences in glycosylation profiles of the flagella could be linked to differences in the gene content of the FGI.

Conclusion

Proteolytic C. botulinum has a stable genome backbone containing specific regions of genetic heterogeneity. These include the neurotoxin gene cluster and the FGI, each having evolved independently of each other and the remainder of the genetic complement. Analysis of these genetic components provides a high degree of discrimination of strains of proteolytic C. botulinum, and is suitable for clinical and forensic investigations of botulism outbreaks.

15.

Background

Understanding how DNA sequence polymorphism relates to variation in gene expression is essential to connecting genotypic differences with phenotypic differences among individuals. Addressing this question requires linking population genomic data with gene expression variation.

Results

Using whole genome expression data and recent light shotgun genome sequencing of six Drosophila simulans genotypes, we assessed the relationship between expression variation in males and females and nucleotide polymorphism across thousands of loci. By examining sequence polymorphism in gene features, such as untranslated regions and introns, we find that genes showing greater variation in gene expression between genotypes also have higher levels of sequence polymorphism in many gene features. Accordingly, X-linked genes, which have lower sequence polymorphism levels than autosomal genes, also show less expression variation than autosomal genes. We also find that sex-specifically expressed genes show higher local levels of polymorphism and divergence than both sex-biased and unbiased genes, and that they appear to have simpler regulatory regions.

Conclusion

The gene-feature-based analyses and the X-to-autosome comparisons suggest that sequence polymorphism in cis-acting elements is an important determinant of expression variation. However, this relationship varies among the different categories of sex-biased expression, and trans factors might contribute more to male-specific gene expression than cis effects. Our analysis of sex-specific gene expression also shows that female-specific genes have been overlooked in analyses that only point to male-biased genes as having unusual patterns of evolution and that studies of sexually dimorphic traits need to recognize that the relationship between genetic and expression variation at these traits is different from the genome as a whole.

16.

Background

The Mycoplasma mycoides cluster consists of five species or subspecies that are ruminant pathogens. One subspecies, Mycoplasma mycoides subspecies mycoides Small Colony (MmmSC), is the causative agent of contagious bovine pleuropneumonia. Its very close relative, Mycoplasma mycoides subsp. capri (Mmc), is a more ubiquitous pathogen in small ruminants, causing mastitis, arthritis, keratitis, pneumonia and septicaemia, and is also found as a saprophyte in the ear canal. To understand the genetics underlying these phenotypic differences, we compared the MmmSC PG1 type strain genome, which was already available, with the genome of an Mmc field strain (95010) that was sequenced in this study. We also compared the 95010 genome with the recently published genome of another Mmc strain (GM12) to evaluate Mmc strain diversity.

Results

The MmmSC PG1 genome is 1,212 kbp and that of Mmc 95010 is ca. 58 kbp shorter. Most of the sequences present in PG1 but not 95010 are highly repeated Insertion Sequences (three types of IS) and large duplicated DNA fragments. The 95010 genome contains five types of IS, present in fewer copies than in PG1, and two copies of an integrative conjugative element. These mobile genetic elements have played a key role in genome plasticity, leading to inversions of large DNA fragments. Comparison of the two genomes suggested a marked decay of the PG1 genome that seems to be correlated with a greater number of IS. The repertoire of gene families encoding surface proteins is smaller in PG1. Several genes involved in polysaccharide metabolism and protein degradation are also absent from, or degraded in, PG1.

Conclusions

The genome of MmmSC PG1 is larger than that of Mmc 95010, its very close relative, but has less coding capacity. This is the result of large genetic rearrangements due to mobile elements that have also led to marked gene decay. This is consistent with a non-adaptive genomic complexity theory, allowing duplications or pseudogenes to be maintained in the absence of adaptive selection that would lead to purifying selection and genome streamlining over longer evolutionary times. These findings also suggest that MmmSC only recently adapted to its bovine host.

17.

Background

The Drosophila melanogaster genome was the first metazoan genome to have been sequenced by the whole-genome shotgun (WGS) method. Two issues relating to this achievement were widely debated in the genomics community: how correct is the sequence with respect to base-pair (bp) accuracy and frequency of assembly errors? And, how difficult is it to bring a WGS sequence to the accepted standard for finished sequence? We are now in a position to answer these questions.

Results

Our finishing process was designed to close gaps, improve sequence quality and validate the assembly. Sequence traces derived from the WGS and draft sequencing of individual bacterial artificial chromosomes (BACs) were assembled into BAC-sized segments. These segments were brought to high quality, and then joined to constitute the sequence of each chromosome arm. Overall assembly was verified by comparison to a physical map of fingerprinted BAC clones. In the current version of the 116.9 Mb euchromatic genome, called Release 3, the six euchromatic chromosome arms are represented by 13 scaffolds with a total of 37 sequence gaps. We compared Release 3 to Release 2; in autosomal regions of unique sequence, the error rate of Release 2 was one in 20,000 bp.
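
An error rate such as "one in 20,000 bp" is often expressed on the Phred scale, where Q = -10 log10(p) for a per-base error probability p. The conversion below is a sketch for orientation only; the abstract does not itself report a Phred value, and the function name is hypothetical.

```python
from math import log10


def phred_quality(error_rate):
    """Convert a per-base error probability to a Phred-scaled quality."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in (0, 1]")
    return -10 * log10(error_rate)


# Error rate reported for Release 2 in autosomal unique sequence.
release2_error_rate = 1 / 20_000
q = phred_quality(release2_error_rate)
print(f"Q = {q:.1f}")
```

On this scale, one error in 20,000 bp corresponds to a quality of roughly Q43.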

Conclusions

The WGS strategy can efficiently produce a high-quality sequence of a metazoan genome while generating the reagents required for sequence finishing. However, the initial method of repeat assembly was flawed. The sequence we report here, Release 3, is a reliable resource for molecular genetic experimentation and computational analysis.

18.

Background

The recent determination of the complete nucleotide sequence of several Mycobacterium tuberculosis (MTB) genomes allows the use of comparative genomics as a tool for dissecting the nature and consequence of genetic variability within this species. The multiple alignment of the genomes of clinical strains (CDC1551, F11, Haarlem and C), along with the genomes of laboratory strains (H37Rv and H37Ra), provides new insights on the mechanisms of adaptation of this bacterium to the human host.

Findings

The genetic variation found in six M. tuberculosis strains does not involve significant genomic rearrangements. Most of the variation results from deletion and transposition events preferentially associated with insertion sequences and genes of the PE/PPE family but not with genes implicated in virulence. Using a Perl-based software islandsanalyser, which creates a representation of the genetic variation in the genome, we identified differences in the patterns of distribution and frequency of the polymorphisms across the genome. The identification of genes displaying strain-specific polymorphisms and the extrapolation of the number of strain-specific polymorphisms to an unlimited number of genomes indicates that the different strains contain a limited number of unique polymorphisms.

Conclusion

The comparison of multiple genomes demonstrates that the M. tuberculosis genome is currently undergoing an active process of gene decay, analogous to the adaptation process of obligate bacterial symbionts. This observation opens new perspectives into the evolution and the understanding of the pathogenesis of this bacterium.

19.

Background

Entomopathogenic associations between nematodes in the genera Steinernema and Heterorhabditis with their cognate bacteria from the bacterial genera Xenorhabdus and Photorhabdus, respectively, are extensively studied for their potential as biological control agents against invasive insect species. These two highly coevolved associations are the result of convergent evolution. Given the natural abundance of bacteria, nematodes and insects, it is surprising that only these two associations, with no intermediate forms, are widely studied in the entomopathogenic context. Discovering analogous systems involving novel bacterial and nematode species would shed light on the evolutionary processes involved in the transition from free-living organisms to obligatory partners in entomopathogenicity.

Results

We report the complete genome sequence of a new member of the enterobacterial genus Serratia that forms a putative entomopathogenic complex with Caenorhabditis briggsae. Analysis of the 5.04 Mb chromosomal genome predicts 4599 protein coding genes, seven sets of ribosomal RNA genes, 84 tRNA genes and a 64.8 kb plasmid encoding 74 genes. Comparative genomic analysis with three previously sequenced Serratia strains, S. marcescens DB11, S. proteamaculans 568 and Serratia sp. AS12, revealed that these four representatives of the genus share a core set of ~3100 genes and extensive structural conservation. The newly identified species shares a more recent common ancestor with S. marcescens, with 99 % sequence identity in rDNA sequence and orthology across 85.6 % of predicted genes. Of the 39 genes/operons implicated in virulence, symbiosis, recolonization, immune evasion and bioconversion, 21 (53.8 %) were present in Serratia while 33 (84.6 %) and 35 (89 %) were present in Xenorhabdus and Photorhabdus EPN bacteria, respectively.

Conclusion

The majority of unique sequences in Serratia sp. SCBI (South African Caenorhabditis briggsae Isolate) are found in ~29 genomic islands of 5 to 65 genes and are enriched in putative functions that are biologically relevant to an entomopathogenic lifestyle, including non-ribosomal peptide synthetases, bacteriocins, fimbrial biogenesis, ushering proteins, toxins, secondary metabolite secretion and multiple drug resistance/efflux systems. By revealing the early stages of adaptation to this lifestyle, the Serratia sp. SCBI genome underscores the fact that in EPN formation the composite end result (killing, bioconversion, cadaver protection and recolonization) can be achieved by dissimilar mechanisms. This genome sequence will enable further study of the evolution of entomopathogenic nematode-bacteria complexes.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1697-8) contains supplementary material, which is available to authorized users.

20.
Sequence and structure of Brassica rapa chromosome A3

Background

The species Brassica rapa includes important vegetable and oil crops. It also serves as an excellent model system to study polyploidy-related genome evolution because of its paleohexaploid ancestry and its close evolutionary relationships with Arabidopsis thaliana and other Brassica species with larger genomes. Therefore, its genome sequence will be used to accelerate both basic research on genome evolution and applied research across the cultivated Brassica species.

Results

We have determined and analyzed the sequence of B. rapa chromosome A3. We obtained 31.9 Mb of sequences, organized into nine contigs, which incorporated 348 overlapping BAC clones. Annotation revealed 7,058 protein-coding genes, with an average gene density of 4.6 kb per gene. Analysis of chromosome collinearity with the A. thaliana genome identified conserved synteny blocks encompassing the whole of the B. rapa chromosome A3 and sections of four A. thaliana chromosomes. The frequency of tandem duplication of genes differed between the conserved genome segments in B. rapa and A. thaliana, indicating differential rates of occurrence/retention of such duplicate copies of genes. Analysis of 'ancestral karyotype' genome building blocks enabled the development of a hypothetical model for the derivation of the B. rapa chromosome A3.

Conclusions

We report the near-complete chromosome sequence from a dicotyledonous crop species. This provides an example of the complexity of genome evolution following polyploidy. The high degree of contiguity afforded by the clone-by-clone approach provides a benchmark for the performance of whole genome shotgun approaches presently being applied in B. rapa and other species with complex genomes.
