Variation among Genome Sequences of H37Rv Strains of Mycobacterium tuberculosis from Multiple Laboratories
Authors: Thomas R Ioerger, Yicheng Feng, Krishna Ganesula, Xiaohua Chen, Karen M Dobos, Sarah Fortune, William R Jacobs Jr, Valerie Mizrahi, Tanya Parish, Eric Rubin, Chris Sassetti, James C Sacchettini
Abstract: The publication of the complete genome sequence for Mycobacterium tuberculosis H37Rv in 1998 has had a great impact on the research community. Nonetheless, it is suspected that genetic differences have arisen in stocks of H37Rv that are maintained in different laboratories. To assess the consistency of the genome sequences among H37Rv strains in use and the extent to which they have diverged from the original strain sequenced, we carried out whole-genome sequencing on six strains of H37Rv from different laboratories. Polymorphisms at 73 sites were shared among the lab strains, though 72 of these were also shared with H37Ra and are likely to be due to sequencing errors in the original H37Rv reference sequence. An updated H37Rv genome sequence should be valuable to the tuberculosis research community as well as the broader microbial research community. In addition, several polymorphisms unique to individual strains and several shared polymorphisms were identified and shown to be consistent with the known provenance of these strains. Aside from nucleotide substitutions and insertions/deletions, multiple IS6110 transposition events were observed, supporting the theory that they play a significant role in plasticity of the M. tuberculosis genome. This genome-wide catalog of genetic differences can help explain any phenotypic differences that might be found, including a frameshift mutation in the mycocerosic acid synthase gene which causes two of the strains to be deficient in biosynthesis of the surface glycolipid phthiocerol dimycocerosate (PDIM). The resequencing of these six lab strains represents a fortuitous “in vitro evolution” experiment that demonstrates how the M. tuberculosis genome continues to evolve even in a controlled environment.

Publication of the whole genome sequence of the H37Rv strain of Mycobacterium tuberculosis by Stewart Cole and colleagues in 1998 provided a breakthrough in tuberculosis (TB) research (8), leading to insights into the biology, metabolism, and evolution of this infectious pathogen. Large protein families related to fatty acid and polyketide biosynthesis, regulation (e.g., sigma factors and two-component sensor systems), drug efflux pumps and transporters, and the PE_PGRS proteins (a large duplicated family unique to the M. tuberculosis group of mycobacteria) were identified. In addition, transposons, prophage-like elements, and other repetitive and/or mobile genetic elements were identified (18). This genomic information has played an essential role in interpreting gene expression studies, modeling persistence, and identifying essential proteins as putative targets for drug discovery. However, to date the functions of only half of the genes (1,756/4,066) have been determined or predicted, and the rest remain annotated as “hypothetical proteins” (6).

The H37Rv strain was initially selected for sequencing because it is a widely used laboratory strain that has retained its virulence. H37Rv was derived from a clinical isolate, H37, obtained from a patient with pulmonary tuberculosis in 1905. H37Rv falls in the T clade (5) and single-nucleotide polymorphism (SNP) cluster group SCG-6b (12). The virulence of H37Rv can be demonstrated in a number of animal models. For example, SCID mice infected with H37Rv typically have a mean time to death of 30 to 35 days, depending on the dose and route of inoculation (13).

An avirulent strain, H37Ra, was also derived from H37 by culturing on solid egg medium and selecting for resistance to lysis (42). The strain was found not to cause disease in guinea pigs (43) or in mice (27). It has a colony morphology (smooth) different from that of H37Rv (rough) and several other phenotypic differences (14, 29). The H37Rv (ATCC 25618) and H37Ra (ATCC 25177) strains are maintained at the Trudeau Institute in New York (3), although unfortunately, the original H37 clinical isolate has been lost. Strain ATCC 27294 (TMC 102) is also frequently used as a representative of H37Rv in studies and treated equivalently in the literature. ATCC 25618 and ATCC 27294 were both isolated from the same patient in different years, and both are fully drug susceptible.

The complete genome of H37Ra has been sequenced by Zheng et al. (48), who found 272 polymorphisms compared to the genome sequence determined by Cole et al. (8) for H37Rv. However, a subset of the polymorphic sites were found to match CDC1551, and upon resequencing of 85 such sites in H37Rv, 79 were determined to be errors in the H37Rv reference sequence. In addition, H37Ra has insertions of IS6110 at two novel sites and a loss of one, compared to the 16 sites in H37Rv. The 130 genuine H37Ra-specific polymorphisms found were divided into those in coding regions, those in upstream regulatory regions, and those in noncoding, nonregulatory intergenic regions in order to assess potential relevance to virulence. Polymorphisms in the promoter regions of sigC, nrdH (glutaredoxin-like electron transporter), and pabB (para-amino benzoate synthase), as well as nonsynonymous substitutions in mazG (regulator of stringent response), phoP (two-component sensor regulating biosynthesis of cell surface lipid antigens), pks12 (polyketide synthase involved in biosynthesis of mycoketides), and nrp (nonribosomal peptide synthetase potentially involved in phthiocerol dimycocerosate [PDIM] biosynthesis), were highlighted as possible causes of the loss of virulence. H37Ra does not synthesize a number of cell surface antigens, including sulfolipid-1, trehalose mycolates, and PDIM (7). The roles of mutations in phoP and sigM, both of which regulate expression of genes involved in biosynthesis of cell surface antigens, have been subsequently investigated, though neither seems to be solely responsible for the avirulence of H37Ra (17, 35). Multiple mutations in PPE and PE_PGRS genes are also observed in H37Ra, and there has been speculation about the role of these genes in virulence (39). However, the RvD2 region (an 8-kb region present in H37Ra but deleted in H37Rv, including an IS6110 insertion element, mmpL14, and several hypothetical genes) is known not to be responsible for differences in virulence (25).

Because of its importance as a model strain used in laboratory studies, it is essential to determine how consistent different stocks of H37Rv in different laboratories are with the reference genome sequence and with each other. Different stocks could accumulate independent polymorphisms over time, and such inconsistencies could make results of studies obtained with H37Rv cultures from different labs difficult to compare, particularly if they affect virulence, drug tolerance, metabolism, or cell wall constitution. Furthermore, sequencing errors in the original genome sequence are possible. To evaluate differences among currently used variants of H37Rv, we resequenced the complete genomes of six extant H37Rv strains (two samples of ATCC 25618 and four of ATCC 27294) using Illumina sequencing technology. We compared differences among them and differences from the reference sequences for H37Rv and H37Ra available from GenBank. The results of this study identify a common set of 73 polymorphisms shared among all six sequenced strains relative to the H37Rv reference strain. Most (72) of these are shared with H37Ra and likely correspond to sequencing errors in the original H37Rv genome sequence. However, there are several sites where additional polymorphisms are shared among a subset of strains, and several strains have a small number of unique polymorphisms. Furthermore, examination of insertion sites of the IS6110 transposable element reveals several changes that have occurred among these strains. These results illustrate the ongoing evolution of this strain and divergence from the sequenced reference strain of H37Rv and highlight the importance of understanding the genetic differences unique to the stock used in each laboratory.
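
The comparison described above sorts polymorphic sites into three groups: sites shared by every resequenced strain (and, when also present in H37Ra, likely reference errors), sites shared by a subset of strains, and sites unique to one strain. The following Python sketch illustrates one way such a classification could be done with set operations. It is an illustration only, not the authors' pipeline: the function name `classify_sites`, the strain labels, and all positions and alleles are hypothetical toy values.

```python
from collections import Counter


def classify_sites(strain_variants, comparison_variants):
    """Group variant sites by how many strains carry them.

    strain_variants: dict mapping strain name -> set of (position, allele)
        differences from the reference (hypothetical input format).
    comparison_variants: set of (position, allele) differences seen in an
        independent genome (here standing in for H37Ra).
    """
    n_strains = len(strain_variants)
    # Each strain holds a set, so a site's count equals the number of strains carrying it.
    counts = Counter(site for sites in strain_variants.values() for site in sites)
    shared_by_all = {site for site, c in counts.items() if c == n_strains}
    return {
        # In every strain AND in the independent genome: the reference is the likely outlier.
        "likely_reference_error": shared_by_all & comparison_variants,
        # In every strain but not in the independent genome.
        "shared_by_all_only": shared_by_all - comparison_variants,
        # In more than one strain but not all: consistent with shared provenance.
        "shared_by_subset": {site for site, c in counts.items() if 1 < c < n_strains},
        # In exactly one strain.
        "unique": {site for site, c in counts.items() if c == 1},
    }


# Toy data: positions and alleles are invented for illustration.
strains = {
    "lab_A": {(100, "G"), (200, "T"), (300, "C")},
    "lab_B": {(100, "G"), (200, "T"), (300, "C"), (400, "A")},
    "lab_C": {(100, "G"), (200, "T"), (500, "insC")},
}
h37ra_sites = {(100, "G"), (900, "A")}

result = classify_sites(strains, h37ra_sites)
for category, sites in result.items():
    print(category, sorted(sites))
```

In this toy input, the site at position 100 is carried by all three strains and by the comparison genome, so it lands in the "likely reference error" group, mirroring the reasoning applied to the 72 sites shared with H37Ra.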