Similar literature
1.
2.
3.
Vancomycin-resistant enterococci acquire high-level resistance to glycopeptide antibiotics through the synthesis of peptidoglycan terminating in d-alanyl-d-lactate. A key enzyme in this process is a d-alanyl-d-alanine ligase homologue, VanA or VanB, which preferentially catalyzes the synthesis of the depsipeptide d-alanyl-d-lactate. We report the overexpression, purification, and enzymatic characterization of DdlN, a VanA and VanB homologue encoded by a gene of the vancomycin-producing organism Amycolatopsis orientalis C329.2. Evaluation of kinetic parameters for the synthesis of peptides and depsipeptides revealed a close relationship between VanA and DdlN in that depsipeptide formation was kinetically preferred at physiologic pH; however, the DdlN enzyme demonstrated a narrower substrate specificity and commensurately increased affinity for d-lactate in the C-terminal position over VanA. The results of these functional experiments also reinforce the results of previous studies that demonstrated that glycopeptide resistance enzymes from glycopeptide-producing bacteria are potential sources of resistance enzymes in clinically relevant bacteria.

The origin of antibiotic resistance determinants is of significant interest for several reasons, including the prediction of the emergence and spread of resistance patterns, the design of new antimicrobial agents, and the identification of potential reservoirs for resistance elements. Antibiotic resistance can occur either through spontaneous mutation in the target or by the acquisition of external genetic elements such as plasmids or transposons which carry resistance genes (7). 
The origins of these acquired genes are varied, but it has long been recognized that potential reservoirs are antibiotic-producing organisms which naturally harbor antibiotic resistance genes to protect themselves from the actions of toxic compounds (6).

High-level resistance to glycopeptide antibiotics such as vancomycin and teicoplanin in vancomycin-resistant enterococci (VRE) is conferred by the presence of three genes, vanH, vanA (or vanB), and vanX, which, along with auxiliary genes necessary for inducible gene expression, are found on transposons integrated into plasmids or the bacterial genome (1, 20). These three genes are essential to resistance and serve to change the C-terminal peptide portion of the peptidoglycan layer from d-alanyl-d-alanine (d-Ala-d-Ala) to d-alanyl-d-lactate (d-Ala-d-Lac). This change results in the loss of a critical hydrogen bond between vancomycin and the d-Ala-d-Ala terminus and in a 1,000-fold decrease in binding affinity between the antibiotic and the peptidoglycan layer, which is the basis for the bactericidal action of this class of compounds (5). The vanH gene encodes a d-lactate dehydrogenase which provides the requisite d-Lac (3, 5), while the vanX gene encodes a highly specific dd-peptidase which cleaves only d-Ala-d-Ala produced endogenously while leaving d-Ala-d-Lac intact (19, 21). The final gene, vanA or vanB, encodes an ATP-dependent d-Ala-d-Lac ligase (4, 8, 10). This enzyme has sequence homology with the chromosomal d-Ala-d-Ala ligases, which are essential for peptidoglycan synthesis but which generally lack the ability to synthesize d-Ala-d-Lac (9).

We have recently cloned vanH, vanA, and vanX homologues from two glycopeptide antibiotic-synthesizing organisms: Amycolatopsis orientalis C329.2, which produces vancomycin, and Streptomyces toyocaensis NRRL 15009, which produces A47934 (14). In addition, the vanH-vanA-vanX gene cluster was identified in several other glycopeptide producers. 
We have also demonstrated that the VanA homologue from S. toyocaensis NRRL 15009 can synthesize d-Ala-d-Lac in vitro and in the glycopeptide-sensitive host Streptomyces lividans (15, 16). We now report the expression of the A. orientalis C329.2 VanA homologue DdlN in Escherichia coli, its purification, and its enzymatic characterization. These data reinforce the striking similarity between vancomycin resistance elements in VRE and glycopeptide-producing organisms and support the possibility of a common origin for these enzymes.

Expression, purification, and specificity of DdlN.

DdlN was overexpressed in E. coli under the control of the bacteriophage T7 promoter. The construct gave good yields of highly purified enzyme following a four-step purification procedure (Table 1; Fig. 1). Like other dd-ligases, DdlN behaved like a dimer in solution (not shown).

TABLE 1

Purification of DdlN from E. coli BL21 (DE3)/pETDdlN
Sample | Protein (mg) | Activity (nmol/min) | Sp act (nmol/min/mg) | Recovery (%) | Purification (fold)
Lysate | 124 | 843 | 6.82 | 100 |
Ammonium sulfate (20–50% saturation) | 67.6 | 780 | 11.5 | 92 | 1.7
Sephacryl S200 | 11.6 | 825 | 71.4 | 98 | 11
Q Sepharose | 2.8 | 742 | 265 | 88 | 39
Phenyl Superose | 0.4 | 299 | 748 | 35 | 110
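
The derived columns of Table 1 follow from the protein and activity columns by simple arithmetic. The sketch below is only an illustration of that bookkeeping, not part of the original study: the function name `purification_summary` and the `STEPS` list are hypothetical, and the recomputed figures can differ slightly from the published ones because the table entries are rounded.

```python
# Protein (mg) and activity (nmol/min) transcribed from Table 1.
STEPS = [
    ("Lysate", 124.0, 843.0),
    ("Ammonium sulfate (20-50% saturation)", 67.6, 780.0),
    ("Sephacryl S200", 11.6, 825.0),
    ("Q Sepharose", 2.8, 742.0),
    ("Phenyl Superose", 0.4, 299.0),
]


def purification_summary(steps):
    """Return (name, specific activity, recovery %, fold purification) per step.

    The first entry of `steps` is treated as the starting material (lysate).
    """
    _, protein0, activity0 = steps[0]
    base_specific = activity0 / protein0
    rows = []
    for name, protein_mg, activity in steps:
        specific = activity / protein_mg          # nmol/min/mg
        recovery = 100.0 * activity / activity0   # percent of lysate activity
        fold = specific / base_specific           # relative to lysate
        rows.append((name, specific, recovery, fold))
    return rows


if __name__ == "__main__":
    for name, specific, recovery, fold in purification_summary(STEPS):
        print(f"{name:40s} {specific:8.1f} {recovery:6.1f} {fold:7.1f}")
```

Specific activity is activity divided by protein, recovery is activity relative to the lysate, and fold purification is specific activity relative to the lysate.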
FIG. 1. Purification of DdlN from E. coli BL21 (DE3)/pETDdlN. Proteins were separated on an SDS–11% polyacrylamide gel and stained with Coomassie blue. Lane 1, molecular mass markers (masses are noted at the left in kilodaltons); lane 2, whole-cell lysate; lane 3, ammonium sulfate fraction (20 to 50% saturation); lane 4, Sephacryl S200; lane 5, Q Sepharose; lane 6, phenyl Superose.

The amino acid substrate specificity of DdlN was assessed by incubation of 14C-d-Ala with all 20 common amino acids in the d configuration. Purified DdlN catalyzed the synthesis of d-Ala-d-Ala in addition to that of several other mixed dipeptides, including d-Ala-d-Met and d-Ala-d-Phe (Fig. 2). Thus, DdlN exhibits a substrate specificity which is similar to that of VanA (4), with the capacity to synthesize not only d-Ala-d-Ala but also mixed dipeptides with bulky side chains in the C-terminal position.

FIG. 2. Substrate specificity of DdlN. Autoradiogram from thin-layer chromatography analysis of DdlN substrate specificity. All reaction mixtures contained 2.5 mM d-Ala and 1 mM ATP, and the radiolabel was 14C-d-Ala, except where noted. Lane 1, d-Ala; lane 2, d-Lac with 14C-d-Lac label; lane 3, d,l-methionine; lane 4, d,l-phenylalanine; lane 5, d-Hbut; lane 6, d-hydroxyvalerate. Letters indicate the following: A, d-Ala-d-Lac; B, d-Lac; C, d-Ala-d-Met; D, d-Ala-d-Phe; E, d-Ala-d-Hbut; F, d-Ala-d-hydroxyvalerate.

Importantly, DdlN is a depsipeptide synthase with the ability to synthesize d-Ala-d-Lac, d-Ala-d-hydroxybutyrate (Hbut), and d-Ala-d-hydroxyvalerate (Fig. 2). However, unlike with VanA (5), d-hydroxycaproate and d-phenyllactate are not substrates (not shown). Thus, DdlN is a broad-spectrum d-Ala-d-X ligase with depsipeptide synthase activity.

Characterization of d-Ala-d-X ligase activity.

Following the initial assessment of the specificity of the enzyme, several substrates were selected for quantitative analysis by evaluation of their steady-state kinetic parameters (Table 2). DdlN has two amino acid (or hydroxy acid) Km values. Steady-state kinetic plots indicated that, like other dd-ligases, the N-terminal Km (Km1) was significantly lower (higher specificity) than the C-terminal Km (Km2). Since the former value is expected to be independent of the C-terminal substrate, only Km2 values were determined and are reported here.

TABLE 2

Characterization of steady-state parameters of DdlN and VanA
Ligase | Substrate | Km2 (mM) | kcat (min^-1) | kcat/Km2 (M^-1 s^-1)
DdlN | d-Ala | 21 ± 2 | 229 ± 7 | 1.8 × 10^2
 | d-Lac | 0.4 ± 0.05 | 55 ± 1 | 2.3 × 10^3
 | d-Hbut | 2.5 ± 0.3 | 32 ± 2 | 2.1 × 10^2
 | ATP^a | 1.2 ± 0.2 | 71 ± 5 | 0.98 × 10^2
DdlM^b | d-Ala | 166 ± 27 | |
 | d-Lac | 1.08 ± 0.10 | |
VanA^c | d-Ala | 38 | 295 | 1.3 × 10^2
 | d-Lac | 7.1 | 94 | 2.2 × 10^2
 | d-Hbut | 0.60 | 108 | 3.0 × 10^3

^a Determined in the presence of 10 mM d-Lac.
^b Data from reference 16.
^c Data from reference 5.

DdlN showed good d-Ala-d-Ala ligase activity but with a very high and physiologically questionable Km2 (21 mM). On the other hand, d-Ala-d-Lac synthesis was excellent, with a 4-fold decrease in kcat, compared to d-Ala-d-Ala synthesis, which was offset by a 52-fold drop in Km that resulted in a >12-fold increase in specificity (kcat/Km2). d-Hbut was also a good substrate, with a kcat/Km2 comparable to that of d-Ala.

Steady-state kinetic parameters for d-Ala-d-X formation showed trends similar to those found with both VanA and DdlN. For example, the kcat values between VanA and DdlN were virtually the same for most substrates. There were significant differences, however. For instance, while the Km2 values for d-Ala were very high for all three enzymes, DdlN does have greater affinity for d-Ala, with a 1.8- and 7.9-fold lower Km2 than those of VanA and DdlM, respectively. Additionally, the Km2 for d-Lac was 17.8- and 2.7-fold lower than those for VanA and DdlM. Thus, DdlN has a more restrictive specificity for the C-terminal residue than VanA, which is compensated for by a higher affinity for the critical substrate d-Lac.
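
The fold changes quoted above can be checked directly from the Table 2 entries. The following is a minimal sketch, assuming only the unit conversions implied by the column headers (kcat in min^-1, Km2 in mM, kcat/Km2 in M^-1 s^-1); the names `specificity_constant` and `DDLN` are hypothetical and introduced only for illustration.

```python
def specificity_constant(kcat_per_min, km_mM):
    """Convert kcat (min^-1) and Km (mM) into kcat/Km in M^-1 s^-1."""
    kcat_per_s = kcat_per_min / 60.0   # min^-1 -> s^-1
    km_molar = km_mM / 1000.0          # mM -> M
    return kcat_per_s / km_molar


# (kcat in min^-1, Km2 in mM) for DdlN, transcribed from Table 2.
DDLN = {
    "d-Ala": (229.0, 21.0),
    "d-Lac": (55.0, 0.4),
    "d-Hbut": (32.0, 2.5),
}

if __name__ == "__main__":
    for substrate, (kcat, km) in DDLN.items():
        print(f"{substrate:7s} kcat/Km2 = {specificity_constant(kcat, km):8.0f} M^-1 s^-1")

    kcat_ala, km_ala = DDLN["d-Ala"]
    kcat_lac, km_lac = DDLN["d-Lac"]
    print("fold decrease in kcat (d-Ala vs d-Lac):", round(kcat_ala / kcat_lac, 1))
    print("fold drop in Km2 (d-Ala vs d-Lac):", round(km_ala / km_lac, 1))
    ratio = specificity_constant(kcat_lac, km_lac) / specificity_constant(kcat_ala, km_ala)
    print("fold increase in kcat/Km2:", round(ratio, 1))
```

With these inputs the ratios come out at roughly 4-fold for kcat, about 52-fold for Km2, and a little over 12-fold for kcat/Km2, in line with the values stated in the text.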

pH dependence of peptide versus that of depsipeptide synthesis activity.

The partitioning of the syntheses of d-Ala-d-Ala and d-Ala-d-Hbut in VanA and other depsipeptide-competent dd-ligases has been shown to be pH dependent (17). Determination of the pH dependence of DdlN in synthesizing peptide versus depsipeptide (Fig. 3) directly paralleled the results obtained with VanA in similar experiments. At lower pHs (<7), d-Ala-d-Hbut synthesis predominates and is exclusive at a pH of <6 (Fig. 3). At pH 7.5, levels of synthesis of d-Ala-d-Hbut and d-Ala-d-Ala are relatively equal, while at a pH greater than 8, the capacity to synthesize peptide overtakes the capacity to synthesize depsipeptide, although the latter is never abolished.

FIG. 3. pH dependence of partitioning of the syntheses of peptide and depsipeptide by DdlN. (A) Autoradiogram of a thin-layer chromatography separation of the products of reaction mixtures containing 14C-d-Ala, unlabeled d-Ala, and d-Hbut. (B) Quantification of reaction products following phosphorimage analysis. Filled circles, d-Ala-d-Hbut; open circles, d-Ala-d-Ala.

The partitioning of the formation of peptide versus depsipeptide as a function of pH by DdlM is comparable to that by VanA and depsipeptide-competent mutants of DdlB (17), which show essentially exclusively depsipeptide formation at lower pHs and increasing peptide formation as the pH increases. This implies a potential role for the protonated ammonium group of d-Ala2 in second-substrate recognition and suggests a mechanism for the discrimination between d-Ala and d-Lac at physiologic pH. The structural basis for this distinction remains obscure for DdlB and VanA or DdlN.

Concluding remarks.

Resistance to vancomycin and other glycopeptides is mediated through the synthesis of a peptidoglycan which does not terminate with the canonical d-Ala-d-Ala dipeptide. Thus, enterococci which exhibit the VanC phenotype, which consists of low-level, noninducible resistance to vancomycin only, have peptidoglycan terminating in d-Ala-d-Ser (19). On the other hand, bacteria which are constitutively resistant to high concentrations of glycopeptides, such as lactic acid bacteria and VRE exhibiting the VanA or VanB phenotype (high-level inducible resistance to vancomycin), incorporate the depsipeptide d-Ala-d-Lac into their cell walls (2, 12, 13). The enzymes responsible for the intracellular synthesis of d-Ala-d-Lac not surprisingly have significant amino acid sequence similarity with d-Ala-d-Ala ligases, which are responsible for d-Ala-d-Ala synthesis in all bacteria with a cell wall (9).

The d-Ala-d-Lac synthases can be subdivided into two groups based on sequence homology: those found in the constitutively resistant lactic acid bacteria and those found in glycopeptide-producing organisms and VanA or VanB VRE (9, 14). The former have more similarity with exclusive d-Ala-d-Ala ligases. Indeed, single point mutations in d-Ala-d-Ala ligases which yield sequences more similar to those of lactic acid bacterium d-Ala-d-Lac ligases are sufficient to induce significant depsipeptide synthase activity in these enzymes (17). Similarly, mutational studies of the d-Ala-d-Lac ligase from Leuconostoc mesenteroides have demonstrated that the converse also holds (18). On the other hand, the molecular basis for depsipeptide synthesis by the VanA or VanB ligases is unknown, in large part due to the lack of protein structural information on which to base mutational studies, unlike the situation with d-Ala-d-Ala ligases, where the E. 
coli DdlB structure serves as a template for mechanistic research (11).

Significantly, a major difference in the VanA or VanB ligases and other dd-ligases lies in the amino acid sequence of the ω-loop region, which closes off the active site of DdlB (11) and has been shown to contribute amino acid residues with the capacity to control the syntheses of d-Ala-d-Ala and d-Ala-d-Lac, notably, Tyr216 (17, 18). Until recently, the VanA and VanB ligases were exceptional in amino acid structure and had no known homologues. The sequencing of resistance genes from glycopeptide-producing bacteria has uncovered enzymes with >60% homology to VanA or VanB and which are virtually superimposable in the critical ω-loop region (14, 15). One of these, DdlM from S. toyocaensis NRRL 15009, has been shown to have d-Ala-d-Lac ligase ability (15, 16), although no rigorous analysis of this activity has been performed. The results presented here demonstrate that DdlN from the vancomycin producer A. orientalis C329.2 not only is a d-Ala-d-Lac ligase but also has significant functional homology with VanA. It is not known at present if, like S. toyocaensis NRRL 15009 (16), A. orientalis C329.2 also possesses a d-Ala-d-Ala-exclusive ligase, though the presence of a vanX gene (14) suggests that it may.

These studies demonstrate that DdlN cloned from a vancomycin-producing bacterium is a d-Ala-d-Lac ligase which has not only amino acid sequence homology with the dd-ligases from VRE but also functional homology. Thus, VanA, VanB, DdlN, and DdlM have likely evolved from similar origins. The fact that a vanH-vanA-vanX gene cluster can be found in other glycopeptide producers as well (14) suggests that the genes now found in VRE may have originated in glycopeptide-producing bacteria. 
Our finding that overexpressed, purified DdlN shows many enzymatic characteristics similar (though not identical) to those of VanA suggests that the genes from glycopeptide-producing bacteria can be important in elucidating biochemical and protein structural aspects of the VRE proteins.

4.
5.
6.
7.
Modern genomewide association studies are characterized by the problem of “missing heritability.” Epistasis, or genetic interaction, has been suggested as a possible explanation for the relatively small contribution of single significant associations to the fraction of variance explained. Of particular concern to investigators of genetic interactions is how to best represent and define epistasis. Previous studies have found that the use of different quantitative definitions for genetic interaction can lead to different conclusions when constructing genetic interaction networks and when addressing evolutionary questions. We suggest that instead, multiple representations of epistasis, or epistatic “subtypes,” may be valid within a given system. Selecting among these epistatic subtypes may provide additional insight into the biological and functional relationships among pairs of genes. In this study, we propose maximum-likelihood and model selection methods in a hypothesis-testing framework to choose epistatic subtypes that best represent functional relationships for pairs of genes on the basis of fitness data from both single and double mutants in haploid systems. We gauge the performance of our method with extensive simulations under various interaction scenarios. Our approach performs reasonably well in detecting the most likely epistatic subtype for pairs of genes, as well as in reducing bias when estimating the epistatic parameter (ɛ). We apply our approach to two available data sets from yeast (Saccharomyces cerevisiae) and demonstrate through overlap of our identified epistatic pairs with experimentally verified interactions and functional links that our results are likely of biological significance in understanding interaction mechanisms. 
We anticipate that our method will improve detection of epistatic interactions and will help to unravel the mysteries of complex biological systems.

Understanding the nature of genetic interactions is crucial to obtaining a more complete picture of complex biological systems and their evolution. The discovery of genetic interactions has been the goal of many researchers studying a number of model systems, including but not limited to Saccharomyces cerevisiae, Caenorhabditis elegans, and Escherichia coli (You and Yin 2002; Burch et al. 2003; Burch and Chao 2004; Tong et al. 2004; Drees et al. 2005; Sanjuán et al. 2005; Segre et al. 2005; Pan et al. 2006; Zhong and Sternberg 2006; Jasnos and Korona 2007; St. Onge et al. 2007; Decourty et al. 2008). Recently, high-throughput experimental approaches, such as epistatic mini-array profiles (E-MAPs) and genetic interaction analysis technology for E. coli (GIANT-coli), have enabled the study of epistasis on a large scale (Schuldiner et al. 2005, 2006; Collins et al. 2006, 2007; Typas et al. 2008). However, it remains unclear whether the computational and statistical methods currently in use to identify these interactions are indeed the most appropriate.

The study of genetic interaction, or “epistasis,” has had a long and somewhat convoluted history. Bateson (1909) first used the term epistasis to describe the ability of a gene at one locus to “mask” the mutational influence of a gene at another locus (Cordell 2002). The term “epistacy” was later coined by Fisher (1918) to denote the statistical deviation of multilocus genotype values from an additive linear model for the value of a phenotype (Phillips 1998, 2008).

These origins are the basis for the two main current interpretations of epistasis. 
The first, as introduced by Bateson (1909), is the “biological,” “physiological,” or “compositional” form of epistasis, concerned with the influence of an individual's genetic background on an allele's effect on phenotype (Cheverud and Routman 1995; Phillips 1998, 2008; Cordell 2002; Moore and Williams 2005). The second interpretation, attributed to Fisher, is “statistical” epistasis, which in its linear regression framework places the phenomenon of epistasis in the context of a population (Wagner et al. 1998; Wade et al. 2001; Wilke and Adami 2001; Moore and Williams 2005; Phillips 2008). Each of these approaches is equally valid in studying genetic interactions; however, confusion still exists about how to best reconcile the methods and results of the two (Phillips 1998, 2008; Cordell 2002; Moore and Williams 2005; Liberman and Feldman 2006; Aylor and Zeng 2008).

Aside from the distinction between the statistical and the physiological definitions of epistasis, inconsistencies exist when studying solely physiological epistasis. For categorical traits, physiological epistasis is clear as a “masking” effect. When noncategorical or numerical traits are measured, epistasis is defined as the deviation of the phenotype of the multiple mutant from that expected under independence of the underlying genes.

The “expectation” of the phenotype under independence, that is, in the absence of epistasis, is not defined consistently between studies. For clarity, consider epistasis between pairs of genes and, without loss of generality, consider fitness as the phenotype. The first commonly used definition of independence, originating from additivity, defines the effect of two independent mutations to be equal to the sum of the individual mutational effects. A second, motivated by the use of fitness as a phenotype, defines the effect of the two mutations as the product of the individual effects (Elena and Lenski 1997; Desai et al. 2007; Phillips 2008). 
A third definition of independence has been referred to as “minimum,” where alleles at two loci are independent if the double mutant has the same fitness as the less-fit single mutant. Mani et al. (2008) claim that this has been used when identifying pairwise epistasis by searching for synthetic lethal double mutants (Tong et al. 2001, 2004; Pan et al. 2004, 2006; Davierwala et al. 2005). A fourth is the “Log” definition presented by Mani et al. (2008) and Sanjuan and Elena (2006). The less-frequently used “scaled ɛ” (Segre et al. 2005) measure of epistasis takes the multiplicative definition of independence with a scaling factor.

These different definitions of independence are partly due to distinct measurement “scales.” For some traits, a multiplicative definition of independence may be necessary to identify epistasis between two genes, whereas for other traits, additivity may be appropriate (Falconer and Mackay 1995; Wade et al. 2001; Mani et al. 2008; Phillips 2008). An interaction found under one independence definition may not necessarily be found under another, leading to different biological conclusions (Mani et al. 2008).

Mani et al. (2008) suggest that there may be an “ideal” definition of independence for all gene pairs for identifying functional relationships. However, it is plausible that different representations of independence for two genes may reflect different biological properties of the relationship (Kupper and Hogan 1978; Rothman et al. 1980). “Two categories of general interest [the additive and multiplicative definitions, respectively] are those in which etiologic factors act interchangeably in the same step in a multistep process, or alternatively act at different steps in the process” (Rothman et al. 1980, p. 468). In some cases, the discovery of epistasis may merely be an artifact of using an incorrect null model (Kupper and Hogan 1978). 
It may be necessary to represent “independence” differently, resulting in different statistical measures of interactions, for different pairs of genes depending on their functions.

Previous studies have suggested that different pairs of loci may have different modes of interaction and have attempted to subclassify genetic interactions into regulatory hierarchies and mutually exclusive “interaction subtypes” to elucidate underlying biological properties (Avery and Wasserman 1992; Drees et al. 2005; St. Onge et al. 2007). We suggest that epistatic relationships can be divided into several subtypes, or forms, corresponding to the aforementioned definitions of independence. As a particular gene pair may deviate from independence according to several criteria, we do not claim that these subtypes are necessarily mutually exclusive. We attempt to select the most likely epistatic subtype that is the best statistical representation of the relationship between two genes. To further subclassify interactions, epistasis among deleterious mutations can take one of two commonly used forms: positive (equivalently alleviating, antagonistic, or buffering) epistasis, where the phenotype of the double mutant is less severe than expected under independence, and negative (equivalently aggravating, synergistic, or synthetic), where the phenotype is more severe than expected (Segre et al. 2005; Collins et al. 2006; Desai et al. 2007; Mani et al. 2008).

Another objective of such distinctions is to reduce the bias of the estimator of the epistatic parameter (ɛ), which measures the extent and direction of epistasis for a given gene pair. Mani et al. (2008), assuming that the overall distribution of ɛ should be centered around 0, find that inaccurately choosing a definition of independence can result in increased bias when estimating ɛ. 
For example, using the minimum definition results in the most severe bias when single mutants have moderate fitness effects, and the additive definition results in the largest positive bias when at least one gene has an extreme fitness defect (Mani et al. 2008). Therefore, it is important to select an optimal estimator for ɛ for each pair of genes from among the subtypes of epistatic interactions.

Epistasis may be important to consider in genomic association studies, as a gene with a weak main effect may be identified only through its interaction with another gene or other genes (Frankel and Schork 1996; Culverhouse et al. 2002; Moore 2003; Cordell 2009; Moore and Williams 2009). Epistasis has also been studied extensively in the context of the evolution of sex and recombination. The mutational deterministic hypothesis proposes that the evolution of sex and recombination would be favored by negative epistatic interactions (Feldman et al. 1980; Kondrashov 1994); many other studies have also examined the importance of the form of epistasis (Elena and Lenski 1997; Otto and Feldman 1997; Burch and Chao 2004; Keightley and Otto 2006; Desai et al. 2007; MacCarthy and Bergman 2007). Indeed, according to Mani et al. (2008, p. 3466), “the choice of definition [of epistasis] alters conclusions relevant to the adaptive value of sex and recombination.”

Given fitness data from single and double mutants in haploid organisms, we implement a likelihood method to determine the subtype that is the best statistical representation of the epistatic interaction for pairs of genes. We use maximum-likelihood estimation and the Bayesian information criterion (BIC) (Schwarz 1978) with a likelihood-ratio test to select the most appropriate null or epistatic model for each putative interaction. We conduct extensive simulations to gauge the performance of our method and demonstrate that it performs reasonably well under various interaction scenarios. 
We apply our method to two data sets with fitness measurements obtained from yeast (Jasnos and Korona 2007; St. Onge et al. 2007), whose authors assume only multiplicative epistasis for all interactions. By examining functional links and experimentally validated interactions among epistatic pairs, we demonstrate that our results are biologically meaningful. Studying a random selection of genes, we find that minimum epistasis is more prevalent than both additive and multiplicative epistasis and that the overall distribution of ɛ is not significantly different from zero (as Jasnos and Korona 2007 suggest). For genes in a particular pathway, we advise selecting among fewer epistatic subtypes. We believe that our method of epistatic subtype classification will aid in understanding genetic interactions and their properties.
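
The additive, multiplicative, and minimum definitions of independence described in this introduction can be made concrete with a short sketch. It is an illustration only, not the authors' likelihood method. It assumes fitness values are scaled so that the wild type has fitness 1 and that ɛ is the observed double-mutant fitness minus the fitness expected under independence; the function names are hypothetical.

```python
def expected_fitness(w_x, w_y, definition):
    """Expected double-mutant fitness under a given definition of independence.

    Assumes fitness is scaled so that the wild type has fitness 1.
    """
    if definition == "additive":
        # Fitness defects (1 - w) add: 1 - [(1 - w_x) + (1 - w_y)].
        return w_x + w_y - 1.0
    if definition == "multiplicative":
        return w_x * w_y
    if definition == "minimum":
        # The double mutant is as fit as the less-fit single mutant.
        return min(w_x, w_y)
    raise ValueError(f"unknown definition: {definition}")


def epsilon(w_xy, w_x, w_y, definition):
    """Deviation of the observed double-mutant fitness from expectation."""
    return w_xy - expected_fitness(w_x, w_y, definition)


if __name__ == "__main__":
    # Hypothetical fitness values, chosen only for illustration.
    w_x, w_y, w_xy = 0.8, 0.9, 0.75
    for definition in ("additive", "multiplicative", "minimum"):
        print(f"{definition:15s} epsilon = {epsilon(w_xy, w_x, w_y, definition):+.3f}")
```

For these made-up values the same double mutant shows positive ɛ under the additive and multiplicative definitions but negative ɛ under the minimum definition, which is the kind of disagreement between definitions that the text describes.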

St. Onge et al. (2007) data set:

St. Onge et al. (2007) examined 26 nonessential genes known to confer resistance to MMS, constructed double-deletion strains for 323 double-mutant strains (all but two of the total possible pairs), and assumed the multiplicative form of epistasis for all interactions (see Methods: Analysis of experimental data). Following these authors, we focus on single- and double-mutant fitnesses measured in the presence of MMS. (For results in the absence of MMS, see File S1 and File S1_2.)

Using the resampling method described in Analysis of experimental data and File S1, 222 gene pairs pass the cutoff of having epistasis inferred in at least 900 of 1000 replicates. This does not include 5 synthetic lethal gene pairs. Hypothesis testing and a multiple-testing procedure (for 222 simultaneous hypotheses) are necessary to determine the final epistatic pairs.

To select one among the three multiple-testing procedures, we follow St. Onge et al. (2007) and examine gene pairs that share specific functional links (see Analysis of experimental data). The Bonferroni method is likely too conservative, yielding only 25 significantly epistatic pairs with only one functional link among them; alternatively, the pFDR procedure appears to be too lenient in rejecting independence for all 222 pairs. Therefore, we use the FDR procedure (although the number of functional links is not significant) and detect 193 epistatic pairs, of which 5 (2.6%) are synthetic lethals, 19 (9.8%) have additive epistasis, 33 (17.1%) have multiplicative epistasis, and 136 (70.5%) have minimum epistasis (File S1_1). We find 29 gene pairs with positive (alleviating) epistasis and 159 gene pairs with negative (aggravating) epistasis.
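
The contrast between a conservative and a more permissive multiple-testing procedure can be sketched as follows. This is a hedged illustration: the text does not say which FDR procedure was used, so the Benjamini-Hochberg step-up rule is assumed here, the pFDR procedure is not shown, and the p-values are invented for the example.

```python
def bonferroni(pvalues, alpha=0.05):
    """Reject H0 for p-values at or below alpha divided by the number of tests."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (assumed stand-in for 'FDR')."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k whose sorted p-value is <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values for ten gene pairs.
    pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print("Bonferroni rejections:", sum(bonferroni(pvals)))
    print("Benjamini-Hochberg rejections:", sum(benjamini_hochberg(pvals)))
```

On this toy input the Bonferroni correction rejects fewer hypotheses than the step-up procedure, mirroring the qualitative pattern reported for the 222 tested pairs.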

TABLE 2

Summary of gene pairs with the indicated epistatic subtypes, inferred using the FDR procedure with the BIC method that considers all three epistatic subtypes and their corresponding null models
Epistatic subtype | Statistic | Study S | Study J
All | Count | 193 (100%) | 352 (100%)
 | Mean ɛ | −0.060 | −0.001
 | Median ɛ | −0.096 | −0.059
Additive | Count | 19 (9.8%) | 35 (9.9%)
 | Mean ɛ | 0.115* | 0.193***
 | Median ɛ | 0.131 | 0.188
Multiplicative | Count | 33 (17.1%) | 63 (17.9%)
 | Mean ɛ | 0.048 | 0.017
 | Median ɛ | −0.166 | −0.115
Minimum | Count | 136 (70.5%) | 254 (72.2%)
 | Mean ɛ | −0.111*** | −0.032**
 | Median ɛ | −0.091 | −0.065

Numbers are the counts of each type, and percentages are given of the total number of epistatic pairs. The mean and median of the epistatic parameter (ɛ) are given for each subtype, with “*” indicating that the mean of ɛ is significantly different from 0 (*, P-value ≤0.05; **, P-value ≤0.01; ***, P-value ≤0.001). Study S refers to the St. Onge et al. (2007) data set, and study J refers to the Jasnos and Korona (2007) data set. (For study S, five of the epistatic pairs are synthetic lethals and are not shown; as a result, percentages do not sum to 100%.)

To further validate the use of our method and the FDR procedure, we assess by Fisher's exact test the significance of an enrichment of both Biological Process and all GO Slim term links among epistatic pairs, neither of which are significant (Gene Ontology Consortium 2000; www.yeastgenome.org; Stark et al. 2006; Table S4). Although some of the previously unidentified interactions that we identify could be false positives, many are likely to be new discoveries.

TABLE 3

Comparison of validation measures for each data set for different variations of the FDR and BIC procedures, considering only a subset of epistatic subtypes with their corresponding null models: all epistatic subtypes (A, P, and M); only the additive and multiplicative subtypes (A and P); and only the additive (A), only the multiplicative (P), or only the minimum (M) subtype (see text for details)
Subtypes considered in BIC procedure
Measure | A, P, M | A, P | A | P | M
Study J
No. found (636) | 352 | 273 | 263 | 231 | 329
Functional links (25) | 19 (0.0255)* | 13 (0.2320) | 11 (0.4689) | 10 (0.4227) | 15 (0.2619)
GO Slim terms (Biological Process) (115) | 69 (0.1573) | 50 (0.4874) | 55 (0.0736) | 44 (0.3534) | 68 (0.04902)*
GO Slim terms (all) (369) | 224 (0.0009)* | 172 (0.01654)* | 160 (0.1297) | 146 (0.0273)* | 213 (0.0003)*
Experimentally identified (3) | 3 | 2 | 1 | 2 | 3
Study S
No. found (323) | 193 | 192 | 247 | 171 | 243
Functional links (36) | 21 (0.6450) | 29 (0.0041)* | 34 (0.0031)* | 29 (0.0003)* | 24 (0.9256)
GO Slim terms (Biological Process) (283) | 174 (0.0657) | 174 (0.03656)* | 223 (0.0010)* | 153 (0.1825) | 213 (0.5534)
GO Slim terms (all) (307) | 185 (0.2866) | 182 (0.6926) | 237 (0.1472) | 162 (0.6997) | 231 (0.5908)
Experimentally identified (29) | 17 | 22 | 24 | 23 | 21

Numbers in parentheses after each count indicate P-values by Fisher's exact test. “*” indicates significance. Study J refers to the Jasnos and Korona (2007) data set, and study S refers to the St. Onge et al. (2007) data set measured in the presence of MMS. Numbers in parentheses in the row labels indicate the total number of tested pairs and the total number of each type of link found in each complete data set.

The epistatic subtypes we consider are not necessarily mutually exclusive. To more fully assess the assumptions of our method, we also consider several of the possible subsets of the epistatic subtypes (and their corresponding null models) in our procedure. As the minimum epistatic subtype was the most frequently selected in this data set, we first do not include the minimum null model or the minimum epistatic model in our procedure (i.e., we select from among four rather than six models for a pair; Table S4). However, there are a significant number of epistatic pairs with functional links only when the minimum epistatic subtype is not included (also see Table S4 and Table S5). It is not immediately clear which epistatic subtypes are the most appropriate for these data, although including the minimum subtype may not be appropriate (Mani et al. 2008) (see discussion).

Although it may be best to consider fewer epistatic subtypes for this specific data set, we report our results including all three epistatic subtypes and their corresponding null models (St. Onge et al. (2007), although we identify 105 epistatic pairs not identified by the original authors (Figure S4, Table S4). St. Onge et al. (2007) find that epistatic pairs with a functional link have a positively shifted distribution of epistasis. We find no such shift in epistasis values (Figure S5). We also demonstrate [described in application to simulated data: Bias and variance of the epistatic parameter (ɛ)] that our method seems to reduce bias of the epistatic parameter (ɛ) (Table S3).] 
When considering only a subset of the epistatic subtypes, however, we find the mean of ɛ to be positive and significantly different from zero (results not shown). See File S1, Figure S6, and Figure S7 for additional discussion of the epistatic pairs we identify.
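
The table note above reports enrichment P-values from Fisher's exact test. As a minimal sketch of how a one-sided version of that test can be computed for a 2×2 table (for example, epistatic versus non-epistatic pairs, with versus without a functional link), the following uses only the Python standard library. The function name and the example counts are hypothetical and are not taken from the study; the authors' exact test settings are not stated here.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a), where X is the hypergeometric count in the
    top-left cell when all row and column totals are held fixed.
    """
    row1 = a + b            # e.g. number of epistatic pairs
    col1 = a + c            # e.g. number of pairs with a functional link
    total = a + b + c + d   # all tested pairs
    upper = min(row1, col1)
    tail = sum(
        comb(col1, x) * comb(total - col1, row1 - x)
        for x in range(a, upper + 1)
    )
    return tail / comb(total, row1)


# Hypothetical counts, for illustration only.
p_value = fisher_exact_greater(3, 1, 1, 3)
print(f"one-sided P = {p_value:.4f}")
```

A small P-value would indicate that functional links are over-represented among the pairs in the first row relative to the second.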

Jasnos and Korona (2007) data set:

The Jasnos and Korona (2007) data set includes 758 yeast gene deletions known to cause growth defects and reports fitnesses for only a sparse subset of all possible gene pairs (≈0.2% of the possible pairwise genotypes, or 639 pairs). Because the authors do not identify epistatic pairs in a hypothesis-testing framework, we cannot explicitly compare our conclusions with theirs.

To validate our method, we examine gene pairs that have specific functional links (see methods: Analysis of experimental data). When defining a functional link using GO terms (Gene Ontology Consortium 2000) with <30 genes associated with them, only 1 of 639 tested gene pairs has a functional link. When the threshold of associated genes is raised to 50 and 100, the number of tested pairs with functional links rises only to 3 and 9, respectively. Because of the large number of random genes and the sparse number of gene pairs in this data set, we follow Tong et al. (2004) and select GO terms that have ≤200 genes associated with them. Twenty-five of 639 tested pairs then have a functional link.

Only the FDR multiple-testing procedure results in a significant enrichment of functional links among epistatic pairs (File S1). With the FDR procedure we find 352 significant epistatic pairs, of which 35 (9.9%) have additive epistasis, 63 (17.9%) have multiplicative epistasis, and 254 (72.2%) have minimum epistasis (File S1_3). These proportions of inferred subtypes suggest that the authors' original restriction to multiplicative epistasis may be inappropriate. We find 141 gene pairs with positive epistasis and 211 gene pairs with negative epistasis.

We do not find a significant number of epistatic pairs with shared GO Slim Biological Process terms (see Analysis of experimental data), but do when considering all shared GO Slim terms (St. Onge et al.
(2007) data set, we also consider some of the possible subsets of the three epistatic subtypes (and their corresponding null models) in our model selection procedure (Table S5). In contrast to the St. Onge et al. (2007) data set, using all three epistatic subtypes results in a significant number of epistatic pairs with functional links; this measure is not significant when using any of the other subsets of the subtypes. This suggests that our proposed method with three epistatic subtypes may indeed be the most appropriate for data sets with randomly selected genes.

We examined the distribution of the estimated values of the epistatic parameter (ɛ) for all pairs with significant epistasis. Jasnos and Korona (2007), in assuming only multiplicative epistasis, conclude that epistasis is predominantly positive. However, we find that the estimated mean of epistasis is not significantly different from zero (two-sided t-test, P-value = 0.9578; Figure 1 and File S1).

Figure 1. Distribution of the epistasis values (ɛ) for significant epistatic pairs in the Jasnos and Korona (2007) data set, determined using the FDR procedure and the BIC method including all three epistatic subtypes and their corresponding null models. Mean of ɛ is −0.0009, with a standard deviation of 0.3177; median value is −0.0587. A similar plot is shown in Figure 3 of Jasnos and Korona (2007).
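
The excerpt names three epistatic subtypes (additive, multiplicative, and minimum) without restating their definitions. The sketch below assumes the commonly used null expectations for double-mutant fitness discussed in the literature on epistasis definitions (e.g., Mani et al. 2008): the sum of single-mutant fitnesses minus one, their product, and their minimum. These formulas, the function name, and the example fitness values are assumptions for illustration and are not quoted from the paper. The second half only re-derives the subtype proportions and the pair coverage from counts stated in the text.

```python
from math import comb


def epsilon(w_x: float, w_y: float, w_xy: float, model: str) -> float:
    """Deviation of the observed double-mutant fitness w_xy from a null model.

    Fitnesses are relative to wild type (wild type = 1). The three null
    expectations below are assumed standard forms, not the paper's own text.
    """
    if model == "additive":
        expected = w_x + w_y - 1.0
    elif model == "multiplicative":
        expected = w_x * w_y
    elif model == "minimum":
        expected = min(w_x, w_y)
    else:
        raise ValueError(f"unknown model: {model}")
    return w_xy - expected


# Hypothetical fitness values: the sign of epsilon depends on the null model.
for model in ("additive", "multiplicative", "minimum"):
    print(model, round(epsilon(0.9, 0.8, 0.72, model), 3))

# Counts reported in the text for the Jasnos and Korona (2007) data set.
subtype_counts = {"additive": 35, "multiplicative": 63, "minimum": 254}
n_significant = sum(subtype_counts.values())
subtype_percent = {
    name: round(100 * count / n_significant, 1)
    for name, count in subtype_counts.items()
}

# Fraction of all possible pairs among 758 deletions covered by 639 tested pairs.
coverage_percent = 100 * 639 / comb(758, 2)
print(n_significant, subtype_percent, round(coverage_percent, 2))
```

The same measured pair can be classified as positive, neutral, or negative epistasis depending on the null model chosen, which is why the choice of subtypes matters for the conclusions drawn above.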

8.
Peptidoglycan from Deinococcus radiodurans was analyzed by high-performance liquid chromatography and mass spectrometry. The monomeric subunit was: N-acetylglucosamine–N-acetylmuramic acid–l-Ala–d-Glu-(γ)–l-Orn-[(δ)Gly-Gly]–d-Ala–d-Ala. Cross-linkage was mediated by (Gly)2 bridges, and glycan strands were terminated in (1→6)anhydro-muramic acid residues. Structural relations with the phylogenetically close Thermus thermophilus are discussed.

The gram-positive bacterium Deinococcus radiodurans is remarkable because of its extreme resistance to ionizing radiation (14). Phylogenetically the closest relatives of Deinococcus are the extreme thermophiles of the genus Thermus (4, 11). In 16S rRNA phylogenetic trees, the genera Thermus and Deinococcus group together as one of the older branches in bacterial evolution (11). Both microorganisms have complex cell envelopes with outer membranes, S-layers, and ornithine-Gly-containing mureins (7, 12, 19, 20, 22, 23). However, Deinococcus and Thermus differ in their response to the Gram reaction, having positive and negative reactions, respectively (4, 14). The murein structure for Thermus thermophilus HB8 has been recently elucidated (19). Here we report the murein structure of Deinococcus radiodurans with similar detail.

D. radiodurans Sark (23) was used in the present study. Cultures were grown in Luria-Bertani medium (13) at 30°C with aeration. Murein was purified and subjected to amino acid and high-performance liquid chromatography (HPLC) analyses as previously described (6, 9, 10, 19). For further analysis muropeptides were purified, lyophilized, and desalted as reported elsewhere (6, 19). Purified muropeptides were subjected to plasma desorption linear time-of-flight mass spectrometry (PDMS) as described previously (1, 5, 16, 19). Positive and negative ion mass spectra were obtained on a short linear ²⁵²californium time-of-flight instrument (BioIon AB, Uppsala, Sweden).
The acceleration voltage was between 17 and 19 kV, and spectra were accumulated for 1 to 10 million fission events. Calibration of the mass spectra was done in the positive ion mode with H+ and Na+ ions and in the negative ion mode with H− and CN− ions. Calculated m/z values are based on average masses.

Amino acid analysis of muramidase (Cellosyl; Hoechst, Frankfurt am Main, Germany)-digested sacculi (50 μg) revealed Glu, Orn, Ala, and Gly as the only amino acids in the muramidase-solubilized material. Less than 3% of the total Orn remained in the muramidase-insoluble fraction, indicating an essentially complete solubilization of murein.

Muramidase-digested murein samples (200 μg) were analyzed by HPLC as described in reference 19. The muropeptide pattern (Fig. 1) was relatively simple, with five dominating components (DR5 and DR10 to DR13 [Fig. 1]). The muropeptides resolved by HPLC were collected, desalted, and subjected to PDMS. The results are presented in Table 1 compared with the m/z values calculated for best-matching muropeptides made up of N-acetylglucosamine (GlucNAc), N-acetylmuramic acid (MurNAc), and the amino acids detected in the murein. The more likely structures are shown in Fig. 1. According to the m/z values, muropeptides DR1 to DR7 and DR9 were monomers; DR8, DR10, and DR11 were dimers; and DR12 and DR13 were trimers. The best-fitting structures for DR3 to DR8, DR11, and DR13 coincided with muropeptides previously characterized in T. thermophilus HB8 (19) and had identical retention times in comparative HPLC runs. The minor muropeptide DR7 (Fig. 1) was the only one detected with a d-Ala–d-Ala dipeptide and most likely represents the basic monomeric subunit. The composition of the major cross-linked species DR11 and DR13 confirmed that cross-linking is mediated by (Gly)2 bridges, as proposed previously (20).

FIG. 1. HPLC muropeptide elution patterns of murein purified from D. radiodurans. Muramidase-digested murein samples were subjected to HPLC analysis, and the A204 of the eluate was recorded. The most likely structures for each muropeptide as deduced by PDMS are shown. The position of residues in brackets is the most likely one as deduced from the structures of other muropeptides but could not be formally demonstrated. R = GlucNAc–MurNAc–l-Ala–d-Glu-(γ)→.

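TABLE 1

Calculated and measured m/z values for the molecular ions of the major muropeptides from D. radiodurans

Muropeptide^a | Ion | Calculated m/z | Measured m/z | Δm^b | Error (%)^c | NAG^d | NAM^e | Glu | Orn | Ala | Gly | Abundance (mol%)
DR1 | [M+H]+ | 699.69 | 700.1 | 0.41 | 0.06 | 1 | 1 | 1 | 0 | 1 | 0 | 12.0
DR2 | [M+H]+ | 927.94 | 928.3 | 0.36 | 0.04 | 1 | 1 | 1 | 1 | 1 | 2 | 5.7
DR3 | [M+Na]+ | 1,006.97 | 1,007.5 | 0.53 | 0.05 | 1 | 1 | 1 | 1 | 1 | 3 | 3.0
DR4 | [M+Na]+ | 963.95 | 964.6 | 0.65 | 0.07 | 1 | 1 | 1 | 1 | 2 | 1 | 2.5
DR5 | [M+H]+ | 999.02 | 999.8 | 0.78 | 0.08 | 1 | 1 | 1 | 1 | 2 | 2 | 27.7
 | [M−H]− | 997.00 | 997.3 | 0.30 | 0.03 | | | | | | |
DR6 | [M+Na]+ | 1,078.5 | 1,078.8 | 0.75 | 0.07 | 1 | 1 | 1 | 1 | 2 | 3 | 2.4
DR7 | [M+H]+ | 1,070.09 | 1,071.0 | 0.90 | 0.08 | 1 | 1 | 1 | 1 | 3 | 2 | 2.2
DR8 | [M+Na]+ | 1,520.53 | 1,521.6 | 1.08 | 0.07 | 1 | 1 | 2 | 2 | 4 | 4 | 2.2
DR9 | [M+Na]+ | 701.64 | 702.1 | 0.46 | 0.03 | 1 | 1^f | 1 | 0 | 1 | 0 | 5.0
DR10 | [M+H]+ | 1,907.94 | 1,907.8 | 0.14 | 0.01 | 2 | 2 | 2 | 2 | 3 | 4 | 10.1
 | [M−H]− | 1,905.92 | 1,906.6 | 0.68 | 0.04 | | | | | | |
DR11 | [M+H]+ | 1,979.01 | 1,979.1 | 0.09 | 0.01 | 2 | 2 | 2 | 2 | 4 | 4 | 19.1
 | [M−H]− | 1,977.00 | 1,977.3 | 0.30 | 0.02 | | | | | | |
DR12 | [M+H]+ | 2,887.93 | 2,886.5 | −1.43 | −0.05 | 3 | 3 | 3 | 3 | 5 | 6 | 4.4
 | [M−H]− | 2,885.91 | 2,885.8 | −0.11 | −0.01 | | | | | | |
DR13 | [M+H]+ | 2,959.00 | 2,957.8 | −1.20 | −0.04 | 3 | 3 | 3 | 3 | 6 | 6 | 3.6
 | [M−H]− | 2,956.99 | 2,955.9 | −1.09 | −0.04 | | | | | | |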
^a DR5 and DR10 to DR13 were analyzed in both the positive and negative ion modes. Muropeptides DR1 to DR4 and DR6 to DR9 were analyzed in the positive mode only due to the small amounts of sample available.
^b Mass difference between measured and calculated quasimolecular ion values.
^c [(Measured mass − calculated mass)/calculated mass] × 100.
^d N-Acetylglucosamine.
^e N-Acetylmuramitol.
^f (1→6)Anhydro-N-acetylmuramic acid.

Structural assignments of muropeptides DR1, DR2, DR8 to DR10, and DR12 deserve special comments. The low m/z value measured for DR1 (700.1) fitted very well with the value calculated for GlucNAc–MurNAc–l-Ala–d-Glu (699.69). Even smaller was the mass deduced for DR9 from the m/z value of the molecular ion of the sodium adduct (702.1) (Fig. 2). The mass difference between DR1 and DR9 (19.9 mass units) was very close indeed to the calculated difference between N-acetylmuramitol and the (1→6)anhydro form of MurNAc (20.04 mass units). Therefore, DR9 was identified as GlucNAc–(1→6)anhydro-MurNAc–l-Ala–d-Glu (Fig. 1). Muropeptides with (1→6)anhydro muramic acid have been identified in mureins from diverse origins (10, 15, 17, 19), indicating that it might be a common feature among peptidoglycan-containing microorganisms.

FIG. 2. Positive-ion linear PDMS of muropeptide DR9. Muropeptide DR9 was purified, desalted by HPLC, and subjected to PDMS to determine the molecular mass. The masses for the dominant molecular ions are indicated.

The measured m/z value for the [M+Na]+ ion of DR8 was 1,521.6, very close to the mass calculated for a cross-linked dimer without one disaccharide moiety (1,520.53) (Fig. 1; Table 1). Such muropeptides, also identified in T. thermophilus HB8 and other bacteria (18, 19), are most likely generated by the enzymatic cleavage of MurNAc–l-Ala amide bonds in murein by an N-acetylmuramyl–l-alanine amidase (21). In particular, DR8 could derive from DR11.
The difference between measured m/z values for DR8 and DR11 was 478.7, which fits with the mass contribution of a disaccharide moiety (480.5) within the mass accuracy of the instrument.

The m/z values for muropeptides DR2, DR10, and DR12 supported the argument for structures in which the two d-Ala residues from the d-Ala–d-Ala C-terminal dipeptide were lost, leaving Orn as the C-terminal amino acid.

The position of one Gly residue in muropeptides DR2, DR8, and DR10 to DR13 could not be formally demonstrated. One of the Gly residues could be at either the N- or the C-terminal positions. However, the N-terminal position seems more likely. The structure of the basic muropeptide (DR7), with a (Gly)2 acylating the δ-NH2 group of Orn, suggests that major muropeptides should present a (Gly)2 dipeptide. The scarcity of DR3 and DR6, which unambiguously have Gly as the C-terminal amino acid (Fig. 1), supports our assumption.

Molar proportions for each muropeptide were calculated as proposed by Glauner et al. (10) and are shown in Table 1. For calculations the structures of DR10 to DR13 were assumed to be those shown in Fig. 1. The degree of cross-linkage calculated was 47.2%. Trimeric muropeptides were rather abundant (8 mol%) and made a substantial contribution to total cross-linkage. However, higher-order oligomers were not detected, in contrast with other gram-positive bacteria, such as Staphylococcus aureus, which is rich in such oligomers (8). The proportion of muropeptides with (1→6)anhydro-muramic acid (5 mol%) corresponded to a mean glycan strand length of 20 disaccharide units, which is in the range of values published for other bacteria (10, 17).

The results of our study indicate that mureins from D. radiodurans and T. thermophilus HB8 (19) are certainly related in their basic structures but have distinct muropeptide compositions.
In accordance with the phylogenetic proximity of Thermus and Deinococcus (11), both mureins are built up from the same basic monomeric subunit (DR7 in Fig. 1), are cross-linked by (Gly)2 bridges, and have (1→6)anhydro-muramic acid at the termini of glycan strands. Most interestingly, Deinococcus and Thermus are the only microorganisms identified at present with the murein chemotype A3β as defined by Schleifer and Kandler (20). Nevertheless, the differences in muropeptide composition were substantial. Murein from D. radiodurans was poor in d-Ala–d-Ala- and d-Ala–Gly-terminated muropeptides (2.2 and 2.4 mol%, respectively) but abundant in Orn-terminated muropeptides (23.8 mol%) and in muropeptides with a peptide chain reduced to the dipeptide l-Ala–d-Glu (18 mol%). In contrast, neither Orn- nor Glu-terminated muropeptides have been detected in T. thermophilus HB8 murein, which is highly enriched in muropeptides with d-Ala–d-Ala and d-Ala–Gly (19). Furthermore, no traces of phenyl acetate-containing muropeptides, a landmark for T. thermophilus HB8 murein (19), were found in D. radiodurans. Cross-linkage was definitely higher in D. radiodurans than in T. thermophilus HB8 (47.4 and 27%, respectively), largely due to the higher proportion of trimers in the former.

The similarity in murein basic structure suggests that the difference between D. radiodurans and T. thermophilus HB8 with respect to the Gram reaction may simply be a consequence of the difference in the thickness of cell walls (2, 3, 23). Interestingly, D. radiodurans murein turned out to be relatively simple for a gram-positive organism, possibly reflecting the primitive nature of this genus as deduced from phylogenetic trees (11). Our results illustrate the phylogenetic proximity between Deinococcus and Thermus at the cell wall level but also point out the structural divergences originated by the evolutionary history of each genus.
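
The mass-spectrometry arithmetic used in this excerpt (Δm and percent error from the table footnotes, the DR1/DR9 mass difference, and the strand length inferred from the anhydro-muropeptide proportion) can be checked with a few lines of Python. This is a minimal sketch: the function names are hypothetical, the adduct masses are approximate average atomic masses chosen for illustration, and the strand-length formula assumes exactly one (1→6)anhydro terminus per glycan strand.

```python
# Approximate average atomic masses of the adduct ions (assumed values).
H_ADDUCT = 1.008
NA_ADDUCT = 22.990


def delta_m(measured_mz: float, calculated_mz: float) -> float:
    """Mass difference between measured and calculated ion values."""
    return measured_mz - calculated_mz


def percent_error(measured_mz: float, calculated_mz: float) -> float:
    """[(measured - calculated) / calculated] x 100, as in the table footnote."""
    return 100.0 * (measured_mz - calculated_mz) / calculated_mz


def neutral_mass(mz: float, adduct_mass: float) -> float:
    """Mass of the singly charged species with the adduct ion removed."""
    return mz - adduct_mass


def mean_strand_length(anhydro_mol_percent: float) -> float:
    """Disaccharide units per strand, assuming one anhydro end per strand."""
    return 100.0 / anhydro_mol_percent


# DR1, [M+H]+ row of the table.
print(round(delta_m(700.1, 699.69), 2), round(percent_error(700.1, 699.69), 2))

# Calculated DR1 ([M+H]+) versus DR9 ([M+Na]+): difference of the neutral masses.
dr1_dr9_difference = neutral_mass(699.69, H_ADDUCT) - neutral_mass(701.64, NA_ADDUCT)
print(round(dr1_dr9_difference, 2))

# 5 mol% anhydro-muropeptides, as stated in the text.
print(mean_strand_length(5.0))
```

With these inputs the DR1 row reproduces the tabulated Δm and error, and the calculated DR1/DR9 difference lands close to the 20.04 mass units quoted for N-acetylmuramitol versus the anhydro form.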

9.
Lichenysins are surface-active lipopeptides with antibiotic properties produced nonribosomally by several strains of Bacillus licheniformis. Here, we report the cloning and sequencing of an entire 26.6-kb lichenysin biosynthesis operon from B. licheniformis ATCC 10716. Three large open reading frames coding for peptide synthetases, designated licA, licB (three modules each), and licC (one module), could be detected, followed by a gene, licTE, coding for a thioesterase-like protein. The domain structure of the seven identified modules, which resembles that of the surfactin synthetases SrfA-A to -C, showed two epimerization domains attached to the third and sixth modules. The substrate specificity of the first, fifth, and seventh recombinant adenylation domains of LicA to -C (cloned and expressed in Escherichia coli) was determined to be Gln, Asp, and Ile (with minor Val and Leu substitutions), respectively. Therefore, we suppose that the identified biosynthesis operon is responsible for the production of a lichenysin variant with the primary amino acid sequence l-Gln–l-Leu–d-Leu–l-Val–l-Asp–d-Leu–l-Ile, with minor Leu and Val substitutions at the seventh position.

Many strains of Bacillus are known to produce lipopeptides with remarkable surface-active properties (11). The most prominent of these powerful lipopeptides is surfactin from Bacillus subtilis (1). Surfactin is an acylated cyclic heptapeptide that reduces the surface tension of water from 72 to 27 mN m−1 even at a concentration below 0.05% and shows some antibacterial and antifungal activities (1). Some B. subtilis strains are also known to produce other, structurally related lipoheptapeptides (Table 1), like iturin (32, 34) and bacillomycin (3, 27, 30), or the lipodecapeptides fengycin (50) and plipastatin (29).

TABLE 1

Lipoheptapeptide antibiotics of Bacillus spp.

Lipopeptide | Organism | Structure | Alternative residue(s)^e | Reference
Lichenysin A | B. licheniformis | FA^a-L-Glu-L-Leu-D-Leu-L-Val-L-Asn-D-Leu-L-Ile | | 51, 52
Lichenysin B | B. licheniformis | FA^a-L-Glu-L-Leu-D-Leu-L-Val-L-Asp-D-Leu-L-Leu | | 23, 26
Lichenysin C | B. licheniformis | FA^a-L-Glu-L-Leu-D-Leu-L-Val-L-Asp-D-Leu-L-Ile | | 17
Lichenysin D | B. licheniformis | FA^a-L-Gln-L-Leu-D-Leu-L-Val-L-Asp-D-Leu-L-Ile | | This work
Surfactant 86 | B. licheniformis | FA^a-L-Glx^d-L-Leu-D-Leu-L-Val-L-Asx^d-D-Leu-L-Ile^e | L-Val | 14, 15
Surfactin | B. subtilis | FA^a-L-Glu-L-Leu-D-Leu-L-Val-L-Asp-D-Leu-L-Leu | | 1, 7, 49
Esperin | B. subtilis | FA^b-L-Glu-L-Leu-D-Leu-L-Val-L-Asp-D-Leu-L-Leu^e | L-Val | 45
Iturin A | B. subtilis | FA^c-L-Asn-D-Tyr-D-Asn-L-Gln-L-Pro-D-Asn-L-Ser | | 32
Iturin C | B. subtilis | FA^c-L-Asn-D-Tyr-D-Asn-L-Gln-L-Pro-D-Asn^e-L-Asn^e | D-Ser-L-Thr | 34
Bacillomycin L | B. subtilis | FA^c-L-Asp-D-Tyr-D-Asn-L-Ser-L-Gln-D-Pro^e-L-Thr | D-Ser- | 3
Bacillomycin D | B. subtilis | FA^c-L-Asp-D-Tyr-D-Asn-L-Pro-L-Glu-D-Ser-L-Thr | | 30, 31
Bacillomycin F | B. subtilis | FA^c-L-Asn-D-Tyr-D-Asn-L-Gln-L-Pro-D-Asn-L-Thr | | 27
^a FA, β-hydroxy fatty acid. The β-hydroxy group forms an ester bond with the carboxy group of the C-terminal amino acid.
^b FA, β-hydroxy fatty acid. The β-hydroxy group forms an ester bond with the carboxy group of Asp5.
^c FA, β-amino fatty acid. The β-amino group forms a peptide bond with the carboxy group of the C-terminal amino acid.
^d Only the following combinations of amino acid 1 and 5 are allowed: Gln-Asp or Glu-Asn.
^e Where an alternative amino acid may be present in a structure, the alternative is also presented.

In addition to B. subtilis, several strains of Bacillus licheniformis have been described as producing the lipopeptide lichenysin (14, 17, 23, 26, 51). Lichenysins can be grouped under the general sequence l-Glx–l-Leu–d-Leu–l-Val–l-Asx–d-Leu–l-Ile/Leu/Val (Table 1). The first amino acid is connected to a β-hydroxyl fatty acid, and the carboxy-terminal amino acid forms a lactone ring to the β-OH group of the lipophilic part of the molecule. In contrast to the lipopeptide surfactin, lichenysins seem to be synthesized during growth under aerobic and anaerobic conditions (16, 51). The isolation of lichenysins from cells growing in liquid mineral salt medium on a glucose or sucrose basis has been studied intensively. Antimicrobial properties and the ability to reduce the surface tension of water have also been described (14, 17, 26, 51). The structural elucidation of the compounds revealed slight differences, depending on the producer strain. Various distributions of branched and linear fatty acid moieties of diverse lengths and amino acid variations in three defined positions have been identified (Table 1).

In contrast to the well-defined methods for isolation and structural characterization of lichenysins, little is known about the biosynthetic mechanisms of lichenysin production.
The structural similarity of lichenysins and surfactin suggests that the peptide moiety is produced nonribosomally by multifunctional peptide synthetases (7, 13, 25, 49, 53). Peptide synthetases from bacterial and fungal sources represent an alternative route of peptide bond formation in addition to the ubiquitous ribosomal pathway. Here, large multienzyme complexes effect the ordered recognition, activation, and linking of amino acids by utilizing the thiotemplate mechanism (19, 24, 25). According to this model, peptide synthetases activate their substrate amino acids as aminoacyl adenylates by ATP hydrolysis. These unstable intermediates are subsequently transferred to a covalently enzyme-bound 4′-phosphopantetheinyl cofactor as thioesters. The thioesterified amino acids are then integrated into the peptide product through a stepwise elongation by a series of transpeptidations directed from the amino terminus to the carboxy terminus. Peptide synthetases have attracted interest not only because of their mechanistic features; many of the nonribosomally processed peptide products also possess important biological and medical properties.

In this report we describe the identification and characterization of a putative lichenysin biosynthesis operon from B. licheniformis ATCC 10716. Cloning and sequencing of the entire lic operon (26.6 kb) revealed three genes, licA, licB, and licC, with structural patterns common to peptide synthetases and a gene designated licTE, which codes for a putative thioesterase. The modular organization of the sequenced genes matches the requirements for the biosynthesis of the heptapeptide lichenysin. Based on the arrangement of the seven identified modules and the tested substrate specificities, we propose that the identified genes are involved in the nonribosomal synthesis of the portion of the lichenysin peptide with the primary sequence l-Gln–l-Leu–d-Leu–l-Val–l-Asp–d-Leu–l-Ile (with minor Val and Leu substitutions).
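
The colinearity argument above (one module per residue, with an epimerization domain converting the residue to its d-form) can be illustrated with a small data structure. This is a minimal sketch, not the authors' analysis: the class and function names are hypothetical, and the substrates of modules 2, 3, 4, and 6 are taken from the proposed peptide sequence rather than from measured adenylation-domain specificities, which the text reports only for modules 1, 5, and 7.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    gene: str            # synthetase gene carrying the module
    substrate: str       # amino acid activated by the adenylation domain
    epimerization: bool  # True if an epimerization domain is attached
    tested: bool         # True if specificity was determined experimentally


# Seven modules: licA and licB carry three each, licC carries one.
# Epimerization domains sit on the third and sixth modules.
LIC_MODULES = [
    Module("licA", "Gln", False, True),
    Module("licA", "Leu", False, False),
    Module("licA", "Leu", True, False),
    Module("licB", "Val", False, False),
    Module("licB", "Asp", False, True),
    Module("licB", "Leu", True, False),
    Module("licC", "Ile", False, True),
]


def predicted_peptide(modules: list[Module]) -> str:
    """Read the peptide off the module order (colinearity rule)."""
    residues = [
        ("d-" if m.epimerization else "l-") + m.substrate for m in modules
    ]
    return "\u2013".join(residues)  # joined with en dashes, as in the text


print(predicted_peptide(LIC_MODULES))
print([i + 1 for i, m in enumerate(LIC_MODULES) if m.tested])
```

Reading the modules in order gives the heptapeptide proposed in the text, with d-Leu at positions 3 and 6.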

10.
11.
The capacity for phenotypic evolution is dependent upon complex webs of functional interactions that connect genotype and phenotype. Wrinkly spreader (WS) genotypes arise repeatedly during the course of a model Pseudomonas adaptive radiation. Previous work showed that the evolution of WS variation was explained in part by spontaneous mutations in wspF, a component of the Wsp-signaling module, but also drew attention to the existence of unknown mutational causes. Here, we identify two new mutational pathways (Aws and Mws) that allow realization of the WS phenotype: in common with the Wsp module these pathways contain a di-guanylate cyclase-encoding gene subject to negative regulation. Together, mutations in the Wsp, Aws, and Mws regulatory modules account for the spectrum of WS phenotype-generating mutations found among a collection of 26 spontaneously arising WS genotypes obtained from independent adaptive radiations. Despite a large number of potential mutational pathways, the repeated discovery of mutations in a small number of loci (parallel evolution) prompted the construction of an ancestral genotype devoid of known (Wsp, Aws, and Mws) regulatory modules to see whether the types derived from this genotype could converge upon the WS phenotype via a novel route. Such types, with equivalent fitness effects, did emerge, although they took significantly longer to do so. Together our data provide an explanation for why WS evolution follows a limited number of mutational pathways and show how genetic architecture can bias the molecular variation presented to selection.

Understanding, and importantly predicting, phenotypic evolution requires knowledge of the factors that affect the translation of mutation into phenotypic variation, the raw material of adaptive evolution. While much is known about mutation rate (e.g., Drake et al. 1998; Hudson et al.
2002), knowledge of the processes affecting the translation of DNA sequence variation into phenotypic variation is minimal.

Advances in knowledge on at least two fronts suggest that progress in understanding the rules governing the generation of phenotypic variation is possible (Stern and Orgogozo 2009). The first stems from increased awareness of the genetic architecture underlying specific adaptive phenotypes and recognition of the fact that the capacity for evolutionary change is likely to be constrained by this architecture (Schlichting and Murren 2004; Hansen 2006). The second is the growing number of reports of parallel evolution (e.g., Pigeon et al. 1997; ffrench-Constant et al. 1998; Allender et al. 2003; Colosimo et al. 2004; Zhong et al. 2004; Boughman et al. 2005; Shindo et al. 2005; Kronforst et al. 2006; Woods et al. 2006; Zhang 2006; Bantinaki et al. 2007; McGregor et al. 2007; Ostrowski et al. 2008), that is, the independent evolution of similar or identical features in two or more lineages, which suggests the possibility that evolution may follow a limited number of pathways (Schluter 1996). Indeed, giving substance to this idea are studies that show that mutations underlying parallel phenotypic evolution are nonrandomly distributed and typically clustered in homologous genes (Stern and Orgogozo 2008).

While the nonrandom distribution of mutations during parallel genetic evolution may reflect constraints due to genetic architecture, some have argued that the primary cause is strong selection (e.g., Wichman et al. 1999; Woods et al. 2006). A means of disentangling the roles of population processes (selection) from genetic architecture is necessary for progress (Maynard Smith et al. 1985; Brakefield 2006); also necessary is insight into precisely how genetic architecture might bias the production of mutations presented to selection.

Despite their relative simplicity, microbial populations offer opportunities to advance knowledge.
The wrinkly spreader (WS) morphotype is one of many different niche specialist genotypes that emerge when experimental populations of Pseudomonas fluorescens are propagated in spatially structured microcosms (Rainey and Travisano 1998). Previous studies defined, via gene inactivation, the essential phenotypic and genetic traits that define a single WS genotype known as LSWS (Spiers et al. 2002, 2003) (Figure 1). LSWS differs from the ancestral SM genotype by a single nonsynonymous nucleotide change in wspF. Functionally (see Figure 2), WspF is a methyl esterase and negative regulator of the WspR di-guanylate cyclase (DGC) (Goymer et al. 2006) that is responsible for the biosynthesis of c-di-GMP (Malone et al. 2007), the allosteric activator of cellulose synthesis enzymes (Ross et al. 1987). The net effect of the wspF mutation is to promote physiological changes that lead to the formation of a microbial mat at the air–liquid interface of static broth microcosms (Rainey and Rainey 2003).

Figure 1. Outline of experimental strategy for elucidation of WS-generating mutations and their subsequent identity and distribution among a collection of independently evolved, spontaneously arising WS genotypes. The strategy involves, first, the genetic analysis of a specific WS genotype (e.g., LSWS) to identify the causal mutation, and second, a survey of DNA sequence variation at specific loci known to harbor causal mutations among a collection of spontaneously arising WS genotypes. For example, suppressor analysis of LSWS using a transposon to inactivate genes necessary for expression of the wrinkly morphology delivered a large number of candidate genes (top left) (Spiers et al. 2002). Genetic and functional analysis of these candidate genes (e.g., Goymer et al. 2006) led eventually to the identity of the spontaneous mutation (in wspF) responsible for the evolution of LSWS from the ancestral SM genotype (Bantinaki et al. 2007).
Subsequent analysis of the wspF sequence among 26 independent WS genotypes (bottom) showed that 50% harbored spontaneous mutations (of different kinds; see

Figure 2. Network diagram of DGC-encoding pathways underpinning the evolution of the WS phenotype and their regulation. Overproduction of c-di-GMP results in overproduction of cellulose and other adhesive factors that determine the WS phenotype. The ancestral SBW25 genome contains 39 putative DGCs, each in principle capable of synthesizing c-di-GMP, and yet WS genotypes arise most commonly as a consequence of mutations in just three DGC-containing pathways: Wsp, Aws, and Mws. In each instance, the causal mutations are most commonly in the negative regulatory component: wspF, awsX, and the phosphodiesterase domain of mwsR (see text).

To determine whether spontaneous mutations in wspF are a common cause of the WS phenotype, the nucleotide sequence of this gene was obtained from a collection of 26 spontaneously arising WS genotypes (WSA-Z) taken from 26 independent adaptive radiations, each founded by the same ancestral SM genotype (Figure 1): 13 contained mutations in wspF (Bantinaki et al. 2007). The existence of additional mutational pathways to WS provided the initial motivation for this study.

TABLE 1

Mutational causes of WS

WS genotype | Gene | Nucleotide change | Amino acid change | Source/reference
LSWS | wspF | A901C | S301R | Bantinaki et al. (2007)
AWS | awsX | Δ100-138 | ΔPDPADLADQRAQA | This study
MWS | mwsR | G3247A | E1083K | This study
WSA | wspF | T14G | I5S | Bantinaki et al. (2007)
WSB | wspF | Δ620-674 | P206Δ (8)^a | Bantinaki et al. (2007)
WSC | wspF | G823T | G275C | Bantinaki et al. (2007)
WSD | wspE | A1916G | D638G | This study
WSE | wspF | G658T | V220L | Bantinaki et al. (2007)
WSF | wspF | C821T | T274I | Bantinaki et al. (2007)
WSG | wspF | C556T | H186Y | Bantinaki et al. (2007)
WSH | wspE | A2202C | K734N | This study
WSI | wspE | G1915T | D638Y | This study
WSJ | wspF | Δ865-868 | R288Δ (3)^a | Bantinaki et al. (2007)
WSK | awsO | G125T | G41V | This study
WSL | wspF | G482A | G161D | Bantinaki et al. (2007)
WSM | awsR | C164T | S54F | This study
WSN | wspF | A901C | S301R | Bantinaki et al. (2007)
WSO | wspF | Δ235-249 | V79Δ (6)^a | Bantinaki et al. (2007)
WSP | awsR | 222insGCCACCGAA | 74insATE | This study
WSQ | mwsR | 3270insGACGTG | 1089insDV | This study
WSR | mwsR | T2183C | V272A | This study
WSS | awsX | C472T | Q158STOP | This study
WST | awsX | Δ229-261 | ΔYTDDLIKGTTQ | This study
WSU | wspF | Δ823-824 | T274Δ (13)^a | Bantinaki et al. (2007)
WSV | awsX | T74G | L24R | This study
WSW | wspF | Δ149 | L49Δ (1)^a | Bantinaki et al. (2007)
WSX^b | ? | ? | ? | This study
WSY | wspF | Δ166-180 | Δ(L51-I55) | Bantinaki et al. (2007)
WSZ | mwsR | G3055A | A1018T | This study
^a P206Δ(8) indicates a frameshift; the number of new residues before a stop codon is reached is in parentheses.
^b Suppressor analysis implicates the wsp locus (17 transposon insertions were found in this locus). However, repeated sequencing failed to identify a mutation.

Here we define and characterize two new mutational routes (Aws and Mws) that together with the Wsp pathway account for the evolution of 26 spontaneously arising WS genotypes. Each pathway offers approximately equal opportunity for WS evolution; nonetheless, additional, less readily realized genetic routes producing WS genotypes with equivalent fitness effects exist. Together our data show that regulatory pathways with specific functionalities and interactions bias the molecular variation presented to selection.
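
The distribution of causal mutations across loci can be tallied directly from the gene column of the table for the 26 independent genotypes WSA to WSZ. The sketch below is a simple bookkeeping aid; the variable names are hypothetical, the pathway label is assumed to be the three-letter gene prefix, and WSX is counted as unresolved because the table lists its gene as "?".

```python
from collections import Counter
from string import ascii_uppercase

# Gene assignments for WSA..WSZ, keyed by gene, transcribed from the table.
GENE_TO_GENOTYPES = {
    "wspF": "A B C E F G J L N O U W Y",
    "wspE": "D H I",
    "awsX": "S T V",
    "awsR": "M P",
    "awsO": "K",
    "mwsR": "Q R Z",
    "?": "X",  # no mutation identified by sequencing
}

gene_counts = Counter(
    {gene: len(letters.split()) for gene, letters in GENE_TO_GENOTYPES.items()}
)

# Group genes into pathways by their three-letter prefix (Wsp, Aws, Mws).
pathway_counts = Counter()
for gene, count in gene_counts.items():
    pathway = gene[:3].capitalize() if gene != "?" else "unresolved"
    pathway_counts[pathway] += count

total = sum(gene_counts.values())
print(total, dict(gene_counts))
print(dict(pathway_counts))
print(f"wspF share: {gene_counts['wspF'] / total:.0%}")
```

The tally reproduces the statement that 13 of the 26 genotypes (50%) carry mutations in wspF, with the remainder spread over wspE and the Aws and Mws loci.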

12.
13.
The intraflagellar transport machinery is required for the assembly of cilia. It has been investigated by biochemical, genetic, and computational methods that have identified at least 21 proteins that assemble into two subcomplexes. It has been hypothesized that complex A is required for retrograde transport. Temperature-sensitive mutations in FLA15 and FLA17 show defects in retrograde intraflagellar transport (IFT) in Chlamydomonas. We show that IFT144 and IFT139, two complex A proteins, are encoded by FLA15 and FLA17, respectively. The fla15 allele is a missense mutation in a conserved cysteine and the fla17 allele is an in-frame deletion of three exons. The flagellar assembly defect of each mutant is rescued by the respective transgenes. In fla15 and fla17 mutants, bulges form in the distal one-third of the flagella at the permissive temperature and this phenotype is also rescued by the transgenes. These bulges contain the complex B component IFT74/72, but not α-tubulin or p28, a component of an inner dynein arm, which suggests specificity with respect to the proteins that accumulate in these bulges. IFT144 and IFT139 are likely to interact with each other and other proteins on the basis of three distinct genetic tests: (1) double mutants display synthetic flagellar assembly defects at the permissive temperature, (2) heterozygous diploid strains exhibit second-site noncomplementation, and (3) transgenes confer two-copy suppression. Since these tests show different levels of phenotypic sensitivity, we propose they illustrate different gradations of gene interaction between complex A proteins themselves and with a complex B protein (IFT172).

Cilia and flagella are microtubule-based organelles that are found on most mammalian cells. They provide motility to cells and participate in many sensory processes.
Defects in or loss of cilia/flagella cause a variety of human diseases that include polycystic kidney disease, retinal degeneration, infertility, obesity, respiratory defects, left–right axis determination, and polydactyly (Fliegauf et al. 2007). Mouse mutants demonstrate that cilia are essential for viability, neural tube closure, and bone development (Eggenschwiler and Anderson 2007; Fliegauf et al. 2007). Cilia and flagella are also present in protists, algae, moss, and some fungi.

The assembly and maintenance of cilia and flagella require intraflagellar transport (IFT) (Kozminski et al. 1995). IFT involves the movement of 100- to 200-nm-long protein particles from the basal body located in the cell body to the tip of the flagella using the heterotrimeric kinesin-2 (anterograde movement) (Kozminski et al. 1995) and movement back to the cell body (retrograde movement) using the cytoplasmic dynein complex (Pazour et al. 1999; Porter et al. 1999). IFT particles change their direction of movement as well as their size, speed, and frequency at the ends of the flagella as they switch from anterograde to retrograde movement (Iomini et al. 2001). Biochemical isolation of IFT particles reveals that they are composed of at least 16 proteins and that these particles can be dissociated into two complexes in vitro by changing the salt concentration (Cole et al. 1998; Piperno et al. 1998). Recent genetic and bioinformatics analysis adds at least 7 more proteins to the IFT particle (Follit et al. 2009) (Eggenschwiler and Anderson 2007).

TABLE 1

Proteins and gene names for the intraflagellar transport particles in Chlamydomonas, C. elegans, and mouse

| Protein | Motif | Chlamydomonas gene | C. elegans gene | Mouse gene | References to worm and mouse genes |
|---|---|---|---|---|---|
| Complex A | | | | | |
| IFT144 | WD | FLA15 | | | |
| IFT140 | WD | | che-11 | | Qin et al. (2001) |
| IFT139 | TRP | FLA17 | dyf-2 | THM1 | Efimenko et al. (2006); Tran et al. (2008) |
| IFT122 | WD | | IFTA-1 | | Blacque et al. (2006) |
| IFT121 | WD | | daf-10 | | Bell et al. (2006) |
| IFT43 | | | | | |
| Complex B | | | | | |
| IFT172 | WD | FLA11 | osm-1 | Wimple | Huangfu et al. (2003); Pedersen et al. (2005); Bell et al. (2006) |
| IFT88 | TRP | IFT88 | osm-5 | Tg737/Polaris | Pazour et al. (2000); Qin et al. (2001) |
| IFT81 | Coil | | ift-81 | CDV1 | Kobayashi et al. (2007) |
| IFT80 | WD | | che-2 | Wdr56 | Fujiwara et al. (1999) |
| IFT74/72 | Coil | | ift-74 | Cmg1 | Kobayashi et al. (2007) |
| IFT57/55 | Coil | | che-13 | Hippi | Haycraft et al. (2003) |
| IFT54 | Microtubule binding domain MIP-T3 | | dyf-11 | Traf3IP1 | Kunitomo and Iino (2008); Li et al. (2008); Omori et al. (2008); Follit et al. (2009) |
| IFT52 | ABC type | BLD1 | osm-6 | Ngd2 | Brazelton et al. (2001); Bell et al. (2006) |
| IFT46 | | IFT46 | dyf-6 | | Bell et al. (2006); Hou et al. (2007) |
| IFT27 | G protein | | Not present | Rabl4 | |
| IFT25 | Hsp20 | | Not present | HSP16.1 | Follit et al. (2009) |
| IFT22 | G protein | | IFTA-2 | Rabl5 | Schafer et al. (2006) |
| IFT20 | Coil | | | | Follit et al. (2006) |
| FAP22 | Cluamp related protein | | dyf-3 | Cluamp1 | Murayama et al. (2005); Follit et al. (2009) |
| DYF13 | | | dyf-13 | Ttc26 | Blacque et al. (2005) |
—, no mutant found to date in Chlamydomonas.

A collection of temperature-sensitive mutant strains that fail to assemble flagella at the restrictive temperature of 32° was isolated in Chlamydomonas (Huang et al. 1977; Adams et al. 1982; Piperno et al. 1998; Iomini et al. 2001). Analysis of the flagella at 21° permits the measurement of the velocity and frequency of IFT particles in the mutant strains. This analysis suggested that assembly has four phases: recruitment to the basal body, anterograde movement (phases I and II), retrograde movement, and return to the cytoplasm (phases III and IV) (Iomini et al. 2001). Different mutants were classified as defective in these four phases. However, because different alleles of FLA8 were classified as defective in different phases (Iomini et al. 2001; Miller et al. 2005), we combined mutants with IFT defects into just two classes. The first group (phases I and II) includes mutant strains that show decreased anterograde velocities, a decreased ratio of anterograde to retrograde particles, and an accumulation of complex A proteins at the basal body. This group includes mutations in the FLA8 and FLA10 genes, which encode the two motor subunits of kinesin-2 (Walther et al. 1994; Miller et al. 2005), as well as mutations in three unknown genes (FLA18, FLA27, and FLA28). The second group includes mutant strains that show the reciprocal phenotype (phases III and IV); these phenotypes include decreased retrograde velocities, an increased ratio of anterograde to retrograde particles, and an accumulation of complex B proteins in the flagella. With the exception of the FLA11 gene, which encodes IFT172, a component of complex B (Pedersen et al. 2005), the gene products in this class are unknown (FLA2, FLA15, FLA16, FLA17, and FLA24). One might predict that mutations in this group would map to genes that encode complex A or retrograde motor subunits.
Interestingly, IFT particles isolated from fla11, fla15, fla16, and fla17-1 flagella show depletion of complex A polypeptides (Piperno et al. 1998; Iomini et al. 2001). The inclusion of IFT172 in this class is explained by the observations that IFT172 plays a role in remodeling the IFT particles at the flagellar tip to transition from anterograde to retrograde movement (Pedersen et al. 2005). The remaining mutant strains do not show obvious defects in velocities, ratios, or accumulation at 21° and may reflect a less severe phenotype at the permissive temperature or a non-IFT role for these genes.

Direct interactions occur between components of complex B. IFT81 and IFT74/72 interact to form a scaffold required for IFT complex B assembly (Lucker et al. 2005). IFT57 and IFT20 also interact with each other and kinesin-2 (Baker et al. 2003). While physical interactions are being used to define IFT particle architecture, genetic interactions among loci encoding IFT components should be instructive regarding their function as well. To probe retrograde movement and its function, we have identified the gene products encoded by two retrograde defective mutant strains. They are FLA15 and FLA17 and encode IFT144 and IFT139, respectively. The genetic interactions of these loci provide interesting clues about the assembly of the IFT particles and possible physical interactions in the IFT particles.

14.
The correlation coefficient is commonly used as a measure of the divergence of gene expression profiles between different species. Here we point out a potential problem with this statistic: if measurement error is large relative to the differences in expression, the correlation coefficient will tend to show high divergence for genes that have relatively uniform levels of expression across tissues or time points. We show that genes with a conserved uniform pattern of expression have significantly higher levels of expression divergence, when measured using the correlation coefficient, than other genes, in a data set from mouse, rat, and human. We also show that the Euclidean distance yields low estimates of expression divergence for genes with a conserved uniform pattern of expression.

It is now possible to measure the expression levels of thousands of genes in multiple tissues at multiple times. This has led to investigations into the evolution of gene expression and how the pattern of expression changes on a genomic scale. In some analyses, the evolution of expression is considered only within one tissue, but in many studies the evolution across multiple tissues is investigated. In this latter case, the evolution of an expression profile—a vector of expression levels of a gene across several tissues—is considered.

Several different statistics have been proposed to measure the divergence between gene expression profiles. The two most popular measures are the Euclidean distance (Jordan et al. 2005; Kim et al. 2006; Yanai et al. 2006; Urrutia et al. 2008) and Pearson's correlation coefficient (Makova and Li 2003; Huminiecki and Wolfe 2004; Yang et al. 2005; Kim et al. 2006; Liao and Zhang 2006a,b; Xing et al. 2007; Urrutia et al. 2008). The correlation coefficient is often subtracted from one, so that the statistic varies from zero, when there has been no expression divergence, to a maximum of two; we refer to this statistic as the Pearson distance.
Here we describe a significant shortcoming of the Pearson distance that is not shared by the Euclidean distance.

To investigate properties of these two measures of expression divergence, we compiled a data set of 2859 orthologous genes from human, mouse, and rat for which we had microarray expression data from nine homologous tissues: bone marrow, heart, kidney, large intestine, pituitary, skeletal muscle, small intestine, spleen, and thymus. The expression data for rat came from Walker et al. (2004), the mouse data from Su et al. (2004), and the human data from Ge et al. (2005). Each tissue experiment had two replicates in mouse, a varying number of replicates in rat, and one in humans; some genes were also matched by multiple probe sets. To obtain an average across experiments and probe sets we processed the data as follows:
  1. Raw CEL files of gene expression levels were obtained from the NCBI Gene Expression Omnibus database (http://www.ncbi.nlm.nih.gov/projects/geo/).
  2. The results from the mouse, rat, and human arrays were normalized separately using both the MAS5 (Affymetrix 2001) and the RMA algorithms (Irizarry et al. 2003) as implemented in Bioconductor (Gentleman et al. 2004). The results are qualitatively similar for the two normalization procedures, although recent analyses suggest that MAS5 normalization is generally better (Ploner et al. 2005; Lim et al. 2007).
  3. The expression of each gene within a tissue was averaged across experiments and probe sets.
We computed expression distances (ED) between orthologous gene expression profiles, for each of the three species comparisons, rat–mouse, rat–human, and mouse–human, according to the two different distance metrics, the Euclidean distance and the Pearson distance:

$$\mathrm{EucD} = \sqrt{\sum_{j=1}^{k} (x_{1j} - x_{2j})^2}, \qquad \mathrm{PeaD} = 1 - \frac{\sum_{j=1}^{k} (x_{1j} - \bar{x}_1)(x_{2j} - \bar{x}_2)}{\sqrt{\sum_{j=1}^{k} (x_{1j} - \bar{x}_1)^2 \, \sum_{j=1}^{k} (x_{2j} - \bar{x}_2)^2}} \qquad (1)$$

Here $x_{ij}$ is the expression level of the gene under consideration in species i in tissue j, and $\bar{x}_i$ is the average expression level of the gene in species i across tissues. Expression levels are known in a total of k tissues.

Because expression levels are measured on different microarray platforms in the three species, we compute relative abundance (RA) values before calculating the Euclidean distance (Liao and Zhang 2006a). The RA is the expression of a gene in a particular tissue divided by the sum of the expression values of that gene across all tissues. We calculated RA values to remove “probe” effects (the tendency for a gene to bind its probe set on one platform more efficiently than on another platform). Because of probe effects it is not easy to distinguish absolute changes in expression from differences in binding efficiency. Calculating RA values removes this problem from the Euclidean distance. Pearson's distance does not change under such a rescaling, so this step is unnecessary for it.

In some analyses the logarithm of the expression or RA values is used (e.g., Makova and Li 2003; Kim et al. 2006; Xing et al. 2007), and in others the expression values are used without this transformation (e.g., Huminiecki and Wolfe 2004; Jordan et al. 2005; Yang et al. 2005; Liao and Zhang 2006a,b; Yanai et al. 2006; Urrutia et al. 2008). We calculated both the Pearson and the Euclidean distances on log-transformed and untransformed expression values.
The results are qualitatively similar so here we present only the results obtained using the logarithm of the expression or RA values.

It is natural to expect the two measures of expression divergence to be positively correlated with one another; however, the Euclidean and Pearson distances are almost completely uncorrelated (MAS5 normalization, mouse–rat correlation coefficient = 0.06, human–rat r = 0.13, human–mouse r = 0.10; RMA normalization, mouse–rat correlation coefficient = −0.12, human–rat r = −0.00, human–mouse r = −0.08; Figure 1). This could, plausibly, be because the two statistics measure different aspects of divergence. However, irrespective of this, there is a potential problem associated with the Pearson distance. Imagine that we have a gene that is expressed at identical levels in all tissues in two species (i.e., expression levels are uniform between tissues and also between species). We quite reasonably assume that measured expression levels contain noise. Thus each measured expression level (xij) is the sum of the (assumed) uniform expression level and an independent random number representing noise. In this case there is no real divergence in the expression profile between the species. However, the two measures of divergence may differ greatly in this case. The Euclidean distance reflects only the noise present in the data and hence will be small if the noise is small. By contrast, the Pearson distance will have a value close to 1 since the second term in PeaD in Equation 1 will be close to zero, reflecting the fact that the noise components of different expression levels are independent. Thus the Pearson distance will give the impression that expression divergence is great, but all this apparent divergence is noise. This will be a problem with Pearson's distance whenever measurement error is of the same magnitude as the differences in expression between tissues.
This will therefore tend to be a problem for lowly expressed genes, where measurement error can be large relative to the true value.

Figure 1.—The correlation between the Euclidean and Pearson distances for (a) mouse–rat, (b) human–rat, and (c) human–mouse. Only the results from MAS5 normalization are shown; qualitatively similar results were obtained with RMA.

The above example is unrealistic because real gene expression profiles are rarely perfectly uniform. To investigate whether this shortcoming of the Pearson distance is a problem in real data sets, we determined genes with a relatively uniform pattern of expression in all three species considered above. To do this we computed the entropy of a gene's expression, which is a measure of uniformity in expression across tissues (Schug et al. 2005): the higher the value of the entropy, the more uniform is the expression. We calculated the entropy for each gene in each of the three species, averaged these across species, and then took those genes in the upper quartile of mean entropy values as a data set of genes with a relatively conserved pattern of uniform expression.

It is natural to expect those genes with a conserved uniform pattern of expression to have relatively low expression divergence; however, on average these genes have significantly higher Pearson distances than other genes (Figure 2; supporting information, Figure S1 and Figure S2). By contrast, the Euclidean distance shows the pattern one would anticipate; all of the conserved uniform genes have low expression divergence. It therefore seems likely that the Pearson distance is sensitive to measurement error and hence may not be a good measure of expression divergence.

Figure 2.—The distribution of expression divergence values for those genes with a uniform pattern of expression that is conserved across species vs. the distribution for all genes for (a) Pearson and (b) Euclidean distances for mouse–rat.
We present similar values for human–mouse and human–rat in Figure S1 and Figure S2. Only the results from MAS5 normalization are shown; qualitatively similar results were obtained with RMA.
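
The thought experiment above (a gene with identical, uniform expression in two species, measured with independent noise) can be reproduced with a short simulation. The following is a minimal sketch rather than the authors' pipeline: the function names, the noise level, the number of simulated genes, and the use of Shannon entropy on relative abundances as the uniformity measure are all assumptions made for illustration, and the simulated values are treated as already log-transformed.

```python
import math
import random


def relative_abundance(profile):
    """Expression in each tissue divided by the gene's total across tissues."""
    total = sum(profile)
    return [value / total for value in profile]


def euclidean_distance(p, q):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def pearson_distance(p, q):
    """One minus Pearson's correlation coefficient (ranges from 0 to 2)."""
    mean_p = sum(p) / len(p)
    mean_q = sum(q) / len(q)
    cov = sum((a - mean_p) * (b - mean_q) for a, b in zip(p, q))
    ss_p = sum((a - mean_p) ** 2 for a in p)
    ss_q = sum((b - mean_q) ** 2 for b in q)
    return 1.0 - cov / math.sqrt(ss_p * ss_q)


def expression_entropy(profile):
    """Shannon entropy (bits) of relative abundances; an assumed definition."""
    return -sum(r * math.log2(r) for r in relative_abundance(profile) if r > 0)


def simulate_uniform_gene(n_genes=2000, k=9, level=10.0, noise_sd=0.1, seed=1):
    """Mean distances for genes whose true expression is uniform and conserved.

    Every measured value is the same true level plus independent Gaussian
    noise, so any apparent divergence between the two species is noise.
    """
    rng = random.Random(seed)
    euc_total = pea_total = 0.0
    for _ in range(n_genes):
        species_1 = [level + rng.gauss(0.0, noise_sd) for _ in range(k)]
        species_2 = [level + rng.gauss(0.0, noise_sd) for _ in range(k)]
        euc_total += euclidean_distance(species_1, species_2)
        pea_total += pearson_distance(species_1, species_2)
    return euc_total / n_genes, pea_total / n_genes


if __name__ == "__main__":
    mean_euc, mean_pea = simulate_uniform_gene()
    print(f"mean Euclidean distance: {mean_euc:.3f}")
    print(f"mean Pearson distance:   {mean_pea:.3f}")
```

Under these assumptions the mean Pearson distance sits near 1 regardless of how small the noise is, whereas the mean Euclidean distance shrinks in proportion to `noise_sd`, which is the contrast the text describes.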

TABLE 1

The median expression divergence for genes that have a conserved uniform pattern of expression (upper quartile of mean entropy values) vs. all other genes

| Data set | Statistic | Conserved uniform genes | Other genes | Wilcoxon test P-value |
|---|---|---|---|---|
| MAS5 normalization | | | | |
| Mouse–rat | Euclidean | 1.66 | 2.79 | <10^−15 |
| | Pearson | 0.70 | 0.47 | <10^−15 |
| Human–mouse | Euclidean | 1.67 | 3.13 | <10^−15 |
| | Pearson | 0.78 | 0.58 | <10^−15 |
| Human–rat | Euclidean | 1.83 | 3.21 | <10^−15 |
| | Pearson | 0.78 | 0.58 | <10^−15 |
| RMA normalization | | | | |
| Mouse–rat | Euclidean | 0.59 | 1.40 | <10^−15 |
| | Pearson | 0.82 | 0.38 | <10^−15 |
| Human–mouse | Euclidean | 0.59 | 1.58 | <10^−15 |
| | Pearson | 0.81 | 0.48 | <10^−15 |
| Human–rat | Euclidean | 0.58 | 1.55 | <10^−15 |
| | Pearson | 0.73 | 0.50 | <10^−15 |
We note that there are two additional advantages of the Euclidean distance. First, it can take into account differences in the absolute level of expression if those data are available, either because the method of assay allows this, for example, if ESTs, SAGE, sequencing, or RNA-Seq data are used, or because expression in the two species has been assessed on the same platform using probes that are conserved between the two species. Second, the square of the Euclidean distance is expected to increase linearly with time. Khaitovich et al. (2004) have previously shown that the squared difference in log expression level increases linearly with time under a Brownian motion model of gene expression evolution. It is therefore expected that the squared Euclidean distance will increase with time since the squared Euclidean distance is the sum of the squared differences across tissues. We prove this in File S1; we also show that this linearity holds, approximately, when relative abundance values are used (see also Pereira et al. 2009).
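
The linearity claim can be checked numerically with a small Monte Carlo sketch. This is an illustration, not the proof in File S1: it assumes that log expression in each of k tissues evolves as independent Brownian motion with the same rate `sigma` along two lineages that each run for time `t` from a shared ancestor, and all names and parameter values are hypothetical. Under those assumptions the expected squared Euclidean distance is 2·k·sigma²·t.

```python
import random


def mean_squared_euclidean(t, k=9, sigma=0.1, n_genes=5000, seed=7):
    """Average squared Euclidean distance after each lineage evolves for time t."""
    rng = random.Random(seed)
    step_sd = sigma * t ** 0.5  # standard deviation of change along one lineage
    total = 0.0
    for _ in range(n_genes):
        for _ in range(k):
            # The shared ancestral value cancels in the between-species difference.
            diff = rng.gauss(0.0, step_sd) - rng.gauss(0.0, step_sd)
            total += diff * diff
    return total / n_genes


def expected_squared_euclidean(t, k=9, sigma=0.1):
    """Analytical expectation under the same assumptions: 2 * k * sigma^2 * t."""
    return 2.0 * k * sigma ** 2 * t


if __name__ == "__main__":
    for t in (1.0, 2.0, 4.0):
        simulated = mean_squared_euclidean(t, seed=int(t))
        print(t, round(simulated, 4), expected_squared_euclidean(t))
```

Doubling `t` should roughly double the simulated mean, whereas the unsquared distance would grow only with the square root of time.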

15.
Sylvain Glémin. Genetics 2010, 185(3):939-959
GC-biased gene conversion (gBGC) is a recombination-associated process mimicking selection in favor of G and C alleles. It is increasingly recognized as a widespread force in shaping the genomic nucleotide landscape. In recombination hotspots, gBGC can lead to bursts of fixation of GC nucleotides and to accelerated nucleotide substitution rates. It was recently shown that these episodes of strong gBGC could give spurious signatures of adaptation and/or relaxed selection. There is also evidence that gBGC could drive the fixation of deleterious amino acid mutations in some primate genes. This raises the question of the potential fitness effects of gBGC. While gBGC has been metaphorically termed the “Achilles' heel” of our genome, we do not know whether interference between gBGC and selection merely has practical consequences for the analysis of sequence data or whether it has broader fundamental implications for individuals and populations. I developed a population genetics model to predict the consequences of gBGC on the mutation load and inbreeding depression. I also used estimates available for humans to quantitatively evaluate the fitness impact of gBGC. Surprising features emerged from this model: (i) Contrary to classical mutation load models, gBGC generates a fixation load independent of population size and could contribute to a significant part of the load; (ii) gBGC can maintain recessive deleterious mutations for a long time at intermediate frequency, in a similar way to overdominance, and these mutations generate high inbreeding depression, even if they are slightly deleterious; (iii) since mating systems affect both the selection efficacy and gBGC intensity, gBGC challenges classical predictions concerning the interaction between mating systems and deleterious mutations, and gBGC could constitute an additional cost of outcrossing; and (iv) if mutations are biased toward A and T alleles, very low gBGC levels can reduce the load.
A robust prediction is that the gBGC level minimizing the load depends only on the mutational bias and population size. These surprising results suggest that gBGC may have nonnegligible fitness consequences and could play a significant role in the evolution of genetic systems. They also shed light on the evolution of gBGC itself.

GC-biased gene conversion (gBGC) is increasingly recognized as a widespread force in shaping genome evolution. In different species, gene conversion occurring during double-strand break recombination repair is thought to be biased toward G and C alleles. In heterozygotes, GC alleles undergo a kind of molecular meiotic drive that mimics selection (reviewed in Marais 2003). This process can rapidly increase the GC content, especially around recombination hotspots (Spencer et al. 2006), and, more broadly, can affect genome-wide nucleotide landscapes (Duret and Galtier 2009a). For instance, it is thought to play a role in shaping isochore structure evolution in mammals (Galtier et al. 2001; Meunier and Duret 2004; Duret et al. 2006) and birds (Webster et al. 2006). Direct experimental evidence of gBGC mainly comes from studies in yeast (Birdsell 2002; Mancera et al. 2008; but see Marsolier-Kergoat and Yeramian 2009) and humans (Brown and Jiricny 1987). However, associations between recombination and the nucleotide landscape and frequency spectra biased toward GC alleles provide indirect evidence in very diverse organisms (see the table below).

| Organisms | Direct evidence | Indirect evidence | Achilles' heel evidence | References |
|---|---|---|---|---|
| Yeast | Meiotic segregation bias | | | Mancera et al. (2008) |
| | Mitotic and mitotic heteromismatch correction bias | Correlation between GC and recombination | | Birdsell (2002) |
| Mammals | Mitotic heteromismatch correction bias | | | Brown and Jiricny (1987) |
| | | Correlation between GC*/GC and recombination | | Duret and Arndt (2008); Meunier and Duret (2004) |
| | | Biased frequency spectrum toward GC alleles | | Galtier et al. (2001); Spencer et al. (2006) |
| | | | GC bias associated with high dN/dS near recombination hotspot | Berglund et al. (2009); Galtier et al. (2009) |
| Birds | | Correlation between GC and recombination | | International Chicken Genome Sequencing Consortium (2004) |
| Turtles | | Correlation between GC and chromosome size | | Kuraku et al. (2006) |
| Drosophila | | Correlation between GC and recombination | | Marais et al. (2003) |
| | | Biased frequency spectrum toward GC alleles | | Galtier et al. (2006) |
| Nematodes | | Correlation between GC and recombination | | Marais et al. (2001) |
| Grasses | | Correlation between GC and outcrossing/selfing | | Glémin et al. (2006) |
| | | Correlation between GC* and recombination and outcrossing/selfing | Outcrossing increases dN/dS for genes with high GC* | Haudry et al. (2008) |
| Green algae | | Correlation between GC and recombination | | Jancek et al. (2008) |
| Paramecium | | Correlation between GC and chromosome size | | Duret et al. (2008) |

The impact of gBGC on noncoding sequences and synonymous sites has been studied in depth, especially because of confounding effects with selection on codon usage (Marais et al. 2001). More recently, Galtier and Duret (2007) pointed out that gBGC may also interfere with selection when affecting functional sequences. They argued that gBGC could leave spurious signatures of adaptive selection and proposed to extend the null hypothesis of molecular evolution. Indeed, gBGC can lead to a ratio of nonsynonymous (dN) over synonymous (dS) substitutions above one (Berglund et al. 2009; Galtier et al. 2009), i.e., a typical signature of positive selection (Nielsen 2005). This hypothesis has been widely debated for human-accelerated regions (HARs). These regions are extremely conserved across mammals but show evidence of accelerated evolution along the human lineage, which has been interpreted as evidence of positive selection (Pollard et al. 2006a,b; Prabhakar et al. 2006, 2008). On the contrary, other authors argued that patterns observed in HARs, such as the AT → GC substitution bias, the absence of a selective sweep signature, or the propensity to occur within or close to recombination hotspots, are more likely explained by gBGC rather than positive selection (Galtier and Duret 2007; Berglund et al. 2009; Duret and Galtier 2009b; but see also Pollard et al. 2006a who also suggested that gBGC might play a role in HARs evolution). It is thus crucial to take gBGC into account when interpreting genomic data.

Moreover, Galtier and Duret (2007) initially suggested that gBGC hotspots could contribute to the fixation of slightly deleterious AT → GC mutations and could represent the Achilles' heel of our genome. This hypothesis was reinforced later in primates, with evidence of gBGC-driven fixation of deleterious mutations in proteins (Galtier et al. 2009).
A similar result was also found in some grass species, whose genomes are also supposed to be affected by gBGC (Glémin et al. 2006). Haudry et al. (2008) compared two outcrossing and two selfing grass species and showed that GC-biased genes exhibit higher dN/dS ratio in outcrossing than in selfing lineages. The reverse pattern would be expected under pure selective models because of the reduced selection efficacy in selfers (Charlesworth 1992; Glémin 2007). This pattern is in agreement with a genomic Achilles' heel associated with outcrossing, while gBGC is inefficient in selfing species because they are mainly homozygous.

Twenty years ago, Bengtsson (1990) already pointed out that biased conversion can generally affect the mutation load. The mutation load is the reduction in the mean fitness of a population due to mutation accumulation, which could lead to population extinction if it is too high (Lynch et al. 1995). At that time, Bengtsson concluded that “it is impossible to know if biased conversion plays a major role in determining the magnitude of the mutation load in organisms such as ourselves, but the possibility must be considered and further investigated” (Bengtsson 1990, p. 186). Now, one can propose that gBGC could be such a widespread biased conversion process. It thus appears timely to thoroughly investigate the fitness consequences of gBGC through its potential effects on the dynamics of deleterious mutations. The fitness consequences of gBGC were also pointed out as a major future issue to be addressed by Duret and Galtier (2009a). In addition to the load, deleterious mutations have many other evolutionary consequences (for review see Charlesworth and Charlesworth 1998). They are thought to be the main determinant of inbreeding depression, i.e., the reduction in fitness of inbred individuals compared to outbred ones.
They also play a key role in the evolution of genetic systems (sexual reproduction and recombination, inbreeding avoidance mechanisms, ploidy cycles), of senescence, or in the degeneration of nonrecombining regions, such as Y chromosomes. So far, we know little, if anything, about how gBGC might affect these processes.

In his seminal work, Bengtsson (1990) did not address several important points. First, he did not include genetic drift in his model. Nearly neutral mutations, for which drift and selection are of similar intensities, are the most damaging ones because they can drift to fixation, unlike strongly deleterious mutations that are maintained at low frequency (Crow 1993; Lande 1994, 1998). While gBGC intensities are rather weak (Birdsell 2002; Spencer et al. 2006), they could markedly affect the fate of nearly neutral mutations (see also Galtier et al. 2009). Second, Bengtsson did not study the effect of gene conversion on inbreeding depression, while he showed that recessive mutations, mostly involved in inbreeding depression, are the most affected by gene conversion. Third, he did not envisage systematic GC bias with its opposite effects on A/T and G/C deleterious alleles. Fourth, while he noted that selfing affects both the efficacy of selection and that of conversion, he did not fully investigate the effect of mating systems. On one hand, selfing is efficient in purging strongly deleterious mutations causing inbreeding depression. However, since selfing is expected to increase drift, weakly deleterious mutations can fix in selfing species, contributing to the so-called “drift load” (Charlesworth 1992; Glémin 2007). Self-fertilizing populations are thus expected to exhibit low inbreeding depression and high drift load. On the other hand, gBGC, and thus its cost, vanishes as the selfing rate and homozygosity increase (Marais et al. 2004).
gBGC could thus challenge classical views on mating systems and it was even speculated that gBGC could affect their evolution (Haudry et al. 2008).

Here I present a population genetics model that includes mutation, selection, drift, and gBGC, which extends previous studies (Gutz and Leslie 1976; Lamb and Helmi 1982; Nagylaki 1983a,b; Bengtsson 1990). I specifically examine how gBGC can affect inbreeding depression and the mutation load. I also focus on the effect of mating system, which is especially interesting with regard to the interaction between biased conversion and selection. Finally, I discuss how these results could give insight into how gBGC evolved.

Impacts of gBGC on inbreeding depression:

Inbreeding depression is defined as the reduction in fitness of selfed (and more generally inbred) individuals compared to outcrossed individuals,(15)where and are the mean fitness of outcrosses and selfcrosses, respectively (Charlesworth and Charlesworth 1987; Charlesworth and Willis 2009). The approximation is very good in most conditions, because under weak (s ≪ 1) and strong selection (x ≪ 1) (see Glémin et al. 2003). Similar to the load, considering both sites for which either S or W alleles are deleterious, in proportion q and 1 – q, respectively, we get(16)
gBGC and the genetic basis of inbreeding depression in panmictic populations:
In infinite panmictic populations without gBGC, inbreeding depression depends only on mutation rates and dominance levels. Partially recessive mutations () contribute only to inbreeding depression, and the more recessive they are, the higher the inbreeding depression (Charlesworth and Charlesworth 1987). In finite populations, deterministic results hold for strongly deleterious mutations (s ≫ 1/Ne), which contribute mostly to inbreeding depression. Contrary to the load, weakly deleterious mutations (∼s ≤ 1/Ne) contribute little to inbreeding depression (Figure 4, a and c, and see Bataillon and Kirkpatrick 2000).

Figure 4.—Inbreeding depression (×10^6) as a function of s without (a and c) or with (b and d) gBGC (b = 0.0002). (a and b) h = 0.2: thick lines, N = 5000; thin lines, N = 10,000; dashed lines, N = 50,000; dotted lines, N = 100,000. (c and d) N = 10,000: thick lines, h = 0.4; thin lines, h = 0.2; dashed lines, h = 0.1; dotted lines, h = 0.05. u = 10^−6, λ = 2.

Like the load, gBGC affects both the magnitude and the structure of inbreeding depression. In infinite populations, and more generally for strongly deleterious alleles (Nes ≫ 1), replacing x by xeq given by Equations 4 in Equations 15 and 16 leads to (17a) (17b) (17c). The effect of gBGC on inbreeding depression is not monotonic. Like the load, gBGC increases inbreeding depression if b > hs(1 − 2q/(q + λ − qλ)). However, contrary to the load, a strong gBGC decreases inbreeding depression, which tends to 0 as b increases, while the load tends to qs (Equation 10c). An analysis of Equation 17b shows that mutations that maximize inbreeding depression are those that also maximize the load, i.e., S deleterious mutations with s ≈ 2b.

In finite populations, inbreeding depression must be integrated over the Φ distribution, which leads to (18) (see also Glémin et al. 2003).
While it is not possible to get an analytical expression of (18), numerical computations (see appendix b) show that S deleterious mutations with s ≈ 2b also maximize inbreeding depression in finite populations (Figure 4). More broadly, inbreeding depression is maximal under the overdominant-like selection regime (gray area in Figure 2). Once again, even low to moderate gBGC markedly affects the genetic structure of inbreeding depression. First, mutations of intermediate effects contribute the most to inbreeding depression, i.e., up to one order of magnitude higher than strongly deleterious mutations (compare Figure 4a with 4b). Second, even nearly additive mutations can have a substantial effect (compare Figure 4c with 4d).Since little is known about the distribution of dominance coefficients, especially the dominance of mildly deleterious mutations (of the order of b), it is difficult to quantitatively predict the full impact of gBGC on inbreeding depression. We can conclude that, on average, gBGC should increase inbreeding depression. However, further insight into mutational parameters is crucial to assess the quantitative impact of gBGC.
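
The overdominant-like regime can be illustrated with a deliberately simplified deterministic sketch. It is not the model of this article: it ignores mutation (u, λ) and drift, and it assumes that gBGC acts purely as transmission distortion, with heterozygotes passing on the deleterious S allele with probability (1 + b)/2. Genotype fitnesses are taken as 1, 1 − hs, and 1 − s, and inbreeding depression is computed with the standard single-locus definition, one minus the ratio of selfed to outcrossed mean fitness. All function names are hypothetical.

```python
def next_frequency(x, s, h, b):
    """Frequency of the deleterious S allele after one generation.

    Random mating, viability selection, then biased transmission from
    heterozygotes (S passed on with probability (1 + b) / 2).
    """
    p_ss, p_sw, p_ww = x * x, 2.0 * x * (1.0 - x), (1.0 - x) ** 2
    w_ss, w_sw = 1.0 - s, 1.0 - h * s  # fitness of WW homozygotes is 1
    mean_w = p_ss * w_ss + p_sw * w_sw + p_ww
    return (p_ss * w_ss + p_sw * w_sw * (1.0 + b) / 2.0) / mean_w


def equilibrium_frequency(s, h, b, x0=0.01, generations=200_000):
    """Iterate the recursion from a low starting frequency."""
    x = x0
    for _ in range(generations):
        x = next_frequency(x, s, h, b)
    return x


def inbreeding_depression(x, s, h):
    """1 - W_selfed / W_outcrossed at one locus with S at frequency x."""
    het = 2.0 * x * (1.0 - x)
    w_out = x * x * (1.0 - s) + het * (1.0 - h * s) + (1.0 - x) ** 2
    # One generation of selfing halves heterozygosity.
    w_self = ((x * x + het / 4.0) * (1.0 - s)
              + (het / 2.0) * (1.0 - h * s)
              + ((1.0 - x) ** 2 + het / 4.0))
    return 1.0 - w_self / w_out


if __name__ == "__main__":
    s, h = 0.001, 0.2
    for b in (0.0, 0.0005):  # the second value satisfies s = 2b
        x_eq = equilibrium_frequency(s, h, b)
        print(b, x_eq, inbreeding_depression(x_eq, s, h))
```

In this sketch the S allele is lost when b = 0 but is held at an intermediate frequency when hs < b < s(1 − h), where it contributes far more to inbreeding depression than it would at a low selection-only frequency. The invasion threshold here is simply b > hs because mutation is omitted.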

Joint effect of gBGC and mating system on the load and inbreeding depression:

Selfing, or more generally inbreeding, slightly reduces the segregating load through the purging of recessive mutations (Ohta and Cockerham 1974), but can substantially increase the fixation load because of the effective population size reduction under inbreeding: (see above and Pollak 1987; Nordborg 1997; Glémin 2007). In numerical examples, I assumed that α decreases with F according to the background selection model (Charlesworth et al. 1993; Nordborg et al. 1996), as in Glémin (2007). With gBGC, selfing thus has two opposite effects on the fixation load. Selfing increases the drift load sensu stricto but decreases the fixation load due to gBGC. A surprising consequence is that the load can be higher in outcrossing than in selfing populations (Figure 5). Quantitatively this is also expected, even with a gBGC hotspot affecting just 3% of the genome (Figure 5).

Figure 5. Effective population size (a and b) and the load (×10^6) (c–f) as a function of F for different gBGC intensities (thick lines, b = 0; thin lines, b = 0.0001; dashed lines, b = 0.0002; dotted lines, b = 0.0005). The effective population size depends on F under the background selection (BS) model (Charlesworth et al. 1993), using Equations 16 and 17 in Glémin (2007): , where U is the genomic deleterious mutation rate, R is the genomic recombination rate, sd is the mean selection coefficient against strongly deleterious mutations, and hd is their dominance coefficient. N = 10,000, U = 0.2, hd = 0.1, and sd = 0.05. (a, c, and e) R = 5, “weak” BS; (b, d, and f) R = 0.5, “strong” BS. (c and d) Load averaged over half GC and half AT deleterious alleles, with a bias in favor of AT alleles. (e and f) Load averaged over 10% of GC deleterious alleles and 90% of AT deleterious alleles with a bias in favor of AT alleles; see Figure 3. h = 0.5, u = 10^−6, and λ = 2.

Generally, the effect of selfing is simpler for inbreeding depression.
Purging, Ne reduction, and suppression of gBGC contribute to decreasing inbreeding depression in selfing populations (Figure 6a). However, there are special cases in which maximum inbreeding depression is reached for intermediate selfing rates (Figure 6b). In such cases, in outcrossing populations, gBGC is strong enough to sweep polymorphism out and reduce inbreeding depression (b > s, regime 1 in Figure 2). As the selfing rate increases, gBGC declines, and the selection dynamics become overdominant-like (regime 2, Figure 2), thus maximizing inbreeding depression. For high selfing rates, gBGC vanishes (regime 3 in Figure 2) and deleterious alleles are either purged or fixed if there is substantial drift. This is similar to the effect of selfing on inbreeding depression caused by asymmetrical overdominance, where inbreeding depression also peaks for intermediate selfing rates (Ziehe and Roberds 1989; Charlesworth and Charlesworth 1990). In the present case, the range of parameters leading to this peculiar behavior is narrow because the overdominant-like region depends on the selfing rates and can vanish either for low or for high selfing rates (Figure 2).

Figure 6. Inbreeding depression (×10^6) as a function of F for different gBGC intensities (thick lines, b = 0; thin lines, b = 0.0001; dashed lines, b = 0.0002; dotted lines, b = 0.0005). Inbreeding depression is averaged over half GC and half AT deleterious alleles. The effective population size depends on F as in Figure 5 (same parameters). (a) s = 0.002; (b) s = 0.0005; (c) s = 0.0002. h = 0.2, u = 10^−6, and λ = 2.

Minimum load and the evolution of gBGC and recombination landscapes:

Given that gBGC may have deleterious fitness consequences, it is surprising that it evolved in many taxa (Duret and Galtier 2009a). Birdsell (2002) initially suggested that gBGC may have evolved as a response to mutational bias toward AT (λ > 1, here). Indeed, I show that a minimum load is reached for weak gBGC (b ≈ ln(λ)/4N, Equation 14). This result is very general whatever the distribution of fitness effects of mutations (appendix d). However, the range of optimal gBGC is narrow, and gBGC increases the load as soon as b > ln(λ)/2N (appendix c). In humans, using N = 10,000 and λ = 2, gBGC levels that minimize the load are ∼1.17 × 10^−5, i.e., one order of magnitude lower than the average bias observed in recombination hotspots (Myers et al. 2005). However, selection on conversion modifiers will not necessarily minimize the load because of gametic disequilibrium generated between modifiers and fitness loci (Bengtsson and Uyenoyama 1990). Selection for limitation of somatic AT-biased mutations could also have selected for GC-biased mismatch repair machinery (Brown and Jiricny 1987). If the bias level that would be selected for somatic reasons is >ln(λ)/2N, a side effect would be the generation of a substantial load at the population level. Finally, it is interesting to note that when synonymous codon positions are under selection for translation accuracy, optimal gBGC levels can be higher than gBGC levels that minimize the protein load, especially when most optimal codons end in G or C ().

Conversely, gBGC could also affect the evolution of recombination landscapes, which could evolve to reduce the gBGC load.
Surprisingly, for a given recombination/conversion level, the hotspot distribution does not appear to be optimal (Nishant and Rao 2005); one can speculate that the localization of hotspots outside genes could be a response that avoids the deleterious effects of gBGC.

Up to now, these verbal arguments have not been assessed theoretically (but see Bengtsson and Uyenoyama 1990 for a different kind of conversion bias). Population genetics models are necessary to test these hypotheses concerning the evolution of gBGC and recombination landscapes and to pinpoint the key parameters that might govern their evolution.

gBGC and the evolution of mating systems:

Deleterious mutations also play a crucial role in the evolution of mating systems. They are the main source of inbreeding depression, which balances the automatic advantage of selfing. The drift load is also thought to contribute to the extinction of selfing species. Since they are mainly homozygous, selfing species are mostly free from gBGC and its deleterious impacts. I discuss below how this might affect the evolution of mating systems.
Inbreeding depression and the shift in mating systems:
Inbreeding depression plays a key role in the evolution of mating systems (Charlesworth and Charlesworth 1987; Charlesworth 2006b). Since it balances the automatic advantage of selfing, high inbreeding depression favors outcrossing, while selfing can evolve when it is low. Moreover, selfing helps to purge strongly deleterious mutations, thus decreasing inbreeding depression. This positive feedback reinforces the disruptive selection on the selfing rate and prevents the transition from selfing to outcrossing (Lande and Schemske 1985).

Theoretical results suggest that, in most conditions, gBGC would reinforce inbreeding depression in outcrossing populations (Figure 6), which would prevent the evolution of selfing. Conversely, if selfing is initially selected for, recurrent selfing would reduce the load through both purging and avoidance of gBGC. Under this scenario, gBGC would reinforce disruptive selection on mating systems. However, under some conditions (see Figure 6), inbreeding depression peaks at intermediate selfing rates, as observed for asymmetrical overdominance (Ziehe and Roberds 1989; Charlesworth and Charlesworth 1990). In theory, this could prevent the shift toward complete selfing and maintain stable mixed mating systems (Charlesworth and Charlesworth 1990; Uyenoyama and Waller 1991). However, this pattern is observed only under restrictive conditions and is very unlikely on the whole-genome scale. Dominance patterns are crucial for predicting inbreeding depression, especially with gBGC. Contrary to the load, it is thus difficult to evaluate the quantitative impact of gBGC on inbreeding depression. However, increased inbreeding depression in outcrossing species subject to gBGC seems to be the most likely scenario.
gBGC and the long-term evolution of mating systems:
In the long term, the gBGC-induced load also challenges the “dead-end hypothesis,” which posits that, because of the reduction of selection efficacy, self-fertilizing species would accumulate weakly deleterious mutations in the long term, eventually leading to extinction (Takebayashi and Morrell 2001). Because of gBGC, not drift, outcrossing species could also accumulate a load of weakly deleterious mutations (Figure 7), and they could suffer from a higher load than highly self-fertilizing species. Haudry et al. (2008) found that in two outcrossing grass species, but not in two self-fertilizing ones, the dN/dS ratio is significantly higher for genes exhibiting GC enrichment. They speculated that substitutions in these genes might contribute to increasing the load in these two outcrossing grass species. Such results are still very sparse. In plants, evidence of strong gBGC is mainly restricted to grasses (but see Wright et al. 2007). It will be necessary to conduct more in-depth studies to assess the phylogenetic distribution of gBGC in plants and other hermaphrodite organisms and to further test the genomic Achilles' heel hypothesis in relation to mating systems. While theoretically possible, the quantitative effect of gBGC on the evolution of mating systems remains a new, open, and challenging question.

Conclusion:

I showed that the interaction between gBGC and selection might have surprising qualitative consequences on load and inbreeding depression patterns. Given the few quantitative data available on gBGC levels and selection intensities (mainly in humans), it turns out that even weak genome-wide gBGC can have significant fitness impacts. gBGC should be taken into account not only for sequence analyses (Berglund et al. 2009; Galtier et al. 2009), but also for its potential fitness consequences, for instance concerning genetic diseases. Interferences between gBGC and selection also give rise to new questions on the evolution of mating systems. However, most of the challenging conclusions given here have yet to be quantitatively evaluated. Quantification of gBGC and its interaction with selection in various organisms will be crucial in the future.

16.
Variation in Genomic Recombination Rates Among Heterogeneous Stock Mice
Beth L. Dumont  Karl W. Broman  Bret A. Payseur 《Genetics》2009,182(4):1345-1349
We used a large panel of pedigreed, genetically admixed house mice to study patterns of recombination rate variation in a leading mammalian model system. We found considerable inter-individual differences in genomic recombination rates and documented a significant heritable component to this variation. These findings point to clear variation in recombination rate among common laboratory strains, a result that carries important implications for genetic analysis in the house mouse.

The rate of recombination (the amount of crossing over per unit DNA) is a key parameter governing the fidelity of meiosis. Recombination rates that are too high or too low frequently give rise to aneuploid gametes or prematurely arrest the meiotic cell cycle (Hassold and Hunt 2001). As a consequence, recombination rates should experience strong selective pressures to lie within the range defined by the demands of meiosis (Coop and Przeworski 2007). Nonetheless, classical genetic studies in Drosophila (Chinnici 1971; Kidwell 1972; Brooks and Marks 1986), crickets (Shaw 1972), flour beetles (Dewees 1975), and lima beans (Allard 1963) have shown that considerable inter-individual variation for recombination rate is present within populations. Recent studies examining the transmission of haplotypes in human pedigrees have corroborated these findings (Broman et al. 1998; Kong et al. 2002; Coop et al. 2008).

Here, we use a large panel of heterogeneous stock (HS) mice to study variation in genomic recombination rates in a genetic model system. These mice are genetically admixed, derived from an initial generation of pseudorandom mating among eight common inbred laboratory strains (DBA/2J, C3H/HeJ, AKR/J, A/J, BALB/cJ, CBA/J, C57BL/6J, and LP/J), followed by >50 generations of pseudorandom mating in subsequent hybrid cohorts (Mott et al. 2000; Demarest et al. 2001). The familial relationships among animals in recent generations were tracked to organize the mice into pedigrees.
In total, this HS panel includes ∼2300 animals comprising 85 families, 8 of which span multiple generations. The remainder consists of nuclear families (sibships) that range from 1 to 34 sibs, with an average of 9.6 sibs (Valdar et al. 2006) (Mott et al. 2000; Demarest et al. 2001; Shifman et al. 2006).

TABLE 1

Heterogeneous stock mouse pedigrees
Pedigree | Pedigree class | No. of nonoverlapping sibships in the pedigree | No. of retained sibships | No. of meioses
1 | Multigenerational | 17 | 17 | 464
2 | Multigenerational | 27 | 20 | 728
3 | Multigenerational | 23 | 19 | 602
4 | Multigenerational | 14 | 9 | 254
5 | Multigenerational | 11 | 9 | 242
6 | Multigenerational | 5 | 3 | 68
7 | Multigenerational | 4 | 3 | 100
8 | Multigenerational | 2 | 1 | 16
9 | Sibship^a | 2 | 1 | 20
32–85 | Sibship | 511146
Total | | 180 | 132 | 3640
^a This family was composed of two sibships sharing a common mother but with different fathers.

With the exception of several founding individuals, most of these HS mice have been genotyped at 13,367 single nucleotide polymorphisms (SNPs) across the genome (available at http://gscan.well.ox.ac.uk/). Although the publicly available HS genotypes have passed data quality filters (Shifman et al. 2006), we took several additional measures to ensure the highest possible accuracy of base calls. First, data were cleansed of all non-Mendelian inheritances, and genotypes with quality scores <0.4 were removed. Genotypes that resulted in tight (<10 cM in sex-specific distance) double recombinants were also omitted because strong positive crossover interference in the mouse renders such closely spaced crossovers biologically very unlikely (Broman et al. 2002). A total of 10,195 SNPs (including 298 on the X chromosome) passed these additional quality control criteria; the results presented below consider only this subset of highly accurate (>99.98%) and complete (<0.01% missing) genotypes. The cleaned data are publicly available (at http://cgd.jax.org/mousemapconverter/).

We used the chrompic program within CRI-MAP (Lander and Green 1987; Green et al. 1990) to estimate the number of recombination events in parental meioses. The algorithm implemented in chrompic first phases parent and offspring genotypes using a maximum-likelihood approach. Next, recombination events occurring in the parental germline are identified by comparing parent and offspring haplotypes across the genome (Green et al. 1990). For example, a haplotype that first copies from one maternal chromosome and then switches to copying from the other maternal chromosome signals a recombination event in the maternal germline.

chrompic is very memory intensive and cannot handle the multigenerational pedigrees and the large sibships included in the HS panel.
To circumvent these computational limitations, several modifications to the data were implemented. First, the eight multigenerational pedigrees were split into 102 nonoverlapping sibships, retaining grandparental information when available (Cox et al. 2009). Finally, large sibships were subdivided: sibships with >13 progeny were split into two groups, those with >26 progeny into three groups, and those with >39 progeny into four groups. Partitioning large sibships by units of 10, 11, or 12, rather than 13, had no effect on the estimation of crossover counts, suggesting that the estimates were robust to the unit of subdivision. These subdivided families were used only for haplotype inference; all other analyses treated whole sibships as focal units. In total, we analyzed 132 nonoverlapping sibships, ranging in size from 2 to 48 sibs (mean = 13.9). This data set encompassed 3640 meioses, 300–2000% more than previously studied human pedigrees (Broman et al. 1998; Kong et al. 2002; Coop et al. 2008), providing excellent power to detect recombination rate variation among individuals.

The recombination rate for the maternal (or paternal) parent of a given sibship was estimated as the average number of recombination events in the haploid maternal (or paternal) genomes transmitted to her (or his) offspring. Our analyses treat males and females separately, as previous observations in mice (Murray and Snell 1945; Mallyon 1951; Reeves et al. 1990; Dietrich et al. 1996; Shifman et al. 2006; Paigen et al. 2008), along with findings from this study, point to systematically higher recombination rates in female than in male mice (this study: P < 2.2 × 10^−16, Mann–Whitney U-test comparing autosomal crossover counts in the 131 HS females to those in the 131 HS males).

There is considerable recombination rate heterogeneity among the 131 mothers and 131 fathers in the HS pedigrees (Figure 1).
The female with the highest recombination rate had, on average, nearly twice as many crossovers per meiosis as the female with the lowest (female range: 9.0–17.3; mean = 13.3; SD = 3.28). Similarly, the least actively recombining male had only 55% of the amount of recombination of the male with the highest recombination rate (male range: 7.7–14.7; mean = 11.7; SD = 2.76). These average values are similar to previously reported recombination counts in house mice, determined using both cytological (Dumas and Britton-Davidian 2002; Koehler et al. 2002) and genetic (Dietrich et al. 1996) approaches. Note that the recombination rates that we report reflect the number of exchange events visible in genetic data. Under the assumption of no chromatid interference, the expected number of crossovers that occur at meiosis is equal to twice these values.

Figure 1. Variation in recombination frequency in HS mice. The mean number of recombination events per transmitted gamete in each mother (A; n = 131) and father (B; n = 131) was inferred by comparing parent and offspring genotypes at >10,000 autosomal and X-linked markers using the CRI-MAP chrompic computer program. Error bars span ±2 SEs.

To test for variation in recombination within the HS females and within the HS males, we performed a one-way ANOVA using parental identity as the factor and the recombination count for a single haploid genome transmission on the pedigree as the response variable. Significance of the resultant F-statistic was empirically assessed by randomizing parental identity with respect to individual recombination counts, recomputing the F-statistic on the permuted data set, and determining the quantile position of the observed F-statistic along the distribution of 10^6 F-statistics derived from randomization.
There is highly significant variation for genomic recombination rate among HS females (F = 1.7842, P < 10^−6; Figure 1A) and males (F = 2.3103, P < 10^−6; Figure 1B).

We next examined patterns of recombination rate inheritance using the eight complex families to test for heritability of this trait. We fit a polygenic model of inheritance using the polygenic command within SOLAR v.4, accounting for the uneven relatedness among individuals through a matrix of pairwise coefficients of relatedness (Almasy and Blangero 1998). Sex was included as a covariate in the model to account for the well-established differences between male and female recombination rates in mice (Murray and Snell 1945; Mallyon 1951; Reeves et al. 1990; Dietrich et al. 1996; Shifman et al. 2006; Paigen et al. 2008). Recombination rates show significant narrow-sense heritability (h^2 = 0.46; SE = 0.20; P = 0.008), indicating that variation for recombination rate among HS mice is partly attributable to additive genetic variation. This result agrees with previous evidence for genetic effects on recombination rate variation in the house mouse (Reeves et al. 1990; Shiroishi et al. 1991; Koehler et al. 2002).

In summary, we have shown that HS mice differ significantly in their genomic recombination rates and have demonstrated that this variation is heritable. These findings indicate that interstrain variation for genomic average recombination rate exists among at least two of the eight progenitor strains of the HS stock, mirroring observations of significant variation among inbred laboratory strains for many other quantitative characters (Grubb et al. 2009). Indeed, cytological analyses have already revealed significant differences in recombination frequencies between A/J and C57BL/6J males (Koehler et al. 2002), two of the HS founding strains.

This interstrain variation in genomic recombination rate carries important practical implications for genetic analysis in the house mouse.
Most notably, crosses using inbred mouse strains with high recombination rates will provide higher mapping resolution than crosses using strains with reduced recombination rates. However, the strategic use of high-recombination-rate strains will not necessarily expedite the fine mapping of loci. The distribution of recombination events in mice is not uniform across chromosomes and appears to be strain specific (Paigen et al. 2008; Grey et al. 2009; Parvanov et al. 2009).

The history of the classical inbred mouse strains as inferred from pedigrees (Beck et al. 2000), sequence comparisons to wild mice (Salcedo et al. 2007), and genomewide phylogenetic analyses (Frazer et al. 2007; Yang et al. 2007) suggests that much of the interstrain variation for recombination rate arises from genetic polymorphism among Mus domesticus individuals in nature. However, many other factors have likely shaped recombination rate variation among the classical strains, including inbreeding, artificial selection, and hybridization with closely related species (Wade and Daly 2005). These aspects of the laboratory mouse's history challenge comparisons between recombination rate variation in the HS panel and human populations and provide strong motivation for studies of recombination rate variation in natural populations of house mice.

Although we find a strong genetic component to inter-individual variation in recombination rate, a large fraction (∼54%) of the phenotypic variation for recombination is not explained by additive genetic variation alone. Sampling error and other forms of genetic variation (e.g., dominance and epistasis) likely combine to account for some of the residual variation. In addition, micro-environmental differences within the laboratory setting (Koren et al. 2002) and life history differences among families, including parental age (Koehler et al. 2002; Kong et al.
2004), might contribute to variation in recombination rates among the HS mice.

Identifying the genetic loci that underlie recombination rate differences among the HS mice (and hence in the eight founding inbred strains) presents a logical next step in the research program initiated here. The complicated pedigree structure, relatively small number of animals with recombination rate estimates (n = 262), and potentially sex-specific genetic architecture of this trait (Kong et al. 2008; Paigen et al. 2008) will pose challenges to this analysis. Nonetheless, dissecting the genetic basis of recombination rate variation is a pursuit motivated by its potential to lend key insights into several enduring questions. Why do males and females differ in the rate and distribution of crossover events? What are the evolutionary mechanisms that give rise to intraspecific polymorphism and interspecific divergence for recombination rate? What are the functional consequences of recombination rate variation? Alternative experimental approaches, including those that combine the power of QTL mapping with immunocytological assays for measuring recombination rates in situ (Anderson et al. 1999), promise to offer additional clues into the genetic mechanisms that give rise to variation in this important trait.
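The per-parent rate estimate and the permutation-based ANOVA described in this article can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' code: the crossover counts are invented toy values, the function names (`parent_rates`, `f_statistic`, `permutation_p_value`) are hypothetical, and the default number of permutations is far smaller than the 10^6 randomizations reported in the study.

```python
import random
from statistics import mean


def parent_rates(counts_by_parent):
    """Recombination rate per parent: mean crossover count per transmitted gamete."""
    return {parent: mean(counts) for parent, counts in counts_by_parent.items()}


def f_statistic(groups):
    """One-way ANOVA F-statistic; each group holds one parent's per-gamete counts."""
    values = [x for g in groups for x in g]
    grand_mean = mean(values)
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def permutation_p_value(counts_by_parent, n_perm=2000, seed=1):
    """Empirical P-value: shuffle parental identity relative to the counts,
    recompute F each time, and locate the observed F in that distribution."""
    rng = random.Random(seed)
    groups = list(counts_by_parent.values())
    sizes = [len(g) for g in groups]
    observed = f_statistic(groups)
    pooled = [x for g in groups for x in g]
    n_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:  # keep the original family sizes
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            n_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (n_extreme + 1) / (n_perm + 1)


# Toy data (invented): crossover counts in gametes transmitted by three mothers.
toy_counts = {
    "mother_1": [12, 14, 13, 15],
    "mother_2": [9, 10, 8, 11],
    "mother_3": [16, 17, 15, 18],
}
rates = parent_rates(toy_counts)
f_obs, p_value = permutation_p_value(toy_counts)
print(rates, f_obs, p_value)
```

Shuffling the pooled counts while keeping family sizes fixed is one simple way to randomize parental identity; the add-one correction in the returned P-value is a common convention and an assumption here, not something stated in the text.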

17.
l-Valine Production during Growth of Pyruvate Dehydrogenase Complex-Deficient Corynebacterium glutamicum in the Presence of Ethanol or by Inactivation of the Transcriptional Regulator SugR
Bastian Blombach  Annette Arndt  Marc Auchter  Bernhard J. Eikmanns 《Applied and environmental microbiology》2009,75(4):1197-1200

18.
Reciprocal Silencing, Transcriptional Bias and Functional Divergence of Homeologs in Polyploid Cotton (Gossypium)
Bhupendra Chaudhary  Lex Flagel  Robert M. Stupar  Joshua A. Udall  Neetu Verma  Nathan M. Springer  Jonathan F. Wendel 《Genetics》2009,182(2):503-517

19.
Ectopic Overproduction of a Sporulation-Specific Transcription Factor Induces Assembly of Prespore-Like Membranous Compartments in Vegetative Cells of Fission Yeast
Yukiko Nakase  Aiko Hirata  Chikashi Shimoda  Taro Nakamura 《Genetics》2009,183(3):1195-1199

20.
Sequence Analysis of the GntII (Subsidiary) System for Gluconate Metabolism Reveals a Novel Pathway for l-Idonic Acid Catabolism in Escherichia coli     
Christoph Bausch  Norbert Peekhaus  Cristina Utz  Tessa Blais  Elizabeth Murray  Todd Lowary  Tyrrell Conway 《Journal of bacteriology》1998,180(14):3704-3710
The presence of two systems in Escherichia coli for gluconate transport and phosphorylation is puzzling. The main system, GntI, is well characterized, while the subsidiary system, GntII, is poorly understood. Genomic sequence analysis of the region known to contain genes of the GntII system led to a hypothesis which was tested biochemically and confirmed: the GntII system encodes a pathway for catabolism of l-idonic acid in which d-gluconate is an intermediate. The genes have been named accordingly: the idnK gene, encoding a thermosensitive gluconate kinase, is monocistronic and transcribed divergently from the idnD-idnO-idnT-idnR operon, which encodes l-idonate 5-dehydrogenase, 5-keto-d-gluconate 5-reductase, an l-idonate transporter, and an l-idonate regulatory protein, respectively. The metabolic sequence is as follows: IdnT allows uptake of l-idonate; IdnD catalyzes a reversible oxidation of l-idonate to form 5-ketogluconate; IdnO catalyzes a reversible reduction of 5-ketogluconate to form d-gluconate; IdnK catalyzes an ATP-dependent phosphorylation of d-gluconate to form 6-phosphogluconate, which is metabolized further via the Entner-Doudoroff pathway; and IdnR appears to act as a positive regulator of the IdnR regulon, with l-idonate or 5-ketogluconate serving as the true inducer of the pathway. The l-idonate 5-dehydrogenase and 5-keto-d-gluconate 5-reductase reactions were characterized both chemically and biochemically by using crude cell extracts, and it was firmly established that these two enzymes allow for the redox-coupled interconversion of l-idonate and d-gluconate via the intermediate 5-ketogluconate. E. coli K-12 strains are able to utilize l-idonate as the sole carbon and energy source, and as predicted, the ability of idnD, idnK, idnR, and edd mutants to grow on l-idonate is altered.

In Escherichia coli, the Entner-Doudoroff (ED) pathway serves as a metabolic “funnel” receiving intermediates formed by catabolism of several sugar acids (17).
Hexuronic acids undergo rearrangement in the inducible Ashwell pathways (1) to form 2-keto-3-deoxygluconate, which is then phosphorylated to produce 2-keto-3-deoxy-6-phosphogluconate (KDPG). KDPG is cleaved by KDPG aldolase, encoded by eda, providing for entry of carbon into glycolysis. The other enzyme of the ED pathway is 6-phosphogluconate dehydratase, encoded by edd, which is induced only for catabolism of gluconate and also forms KDPG, the key intermediate of the ED pathway (7). Long considered to be of more significance than is readily obvious (9), the finding that eda and edd eda double mutants are unable to colonize the mouse large intestine underscores the possible ecological importance of ED metabolism (32). The implication from these colonization studies is that colonic mucus, which contains several sugar acids, may serve as an important source of nutrients for E. coli in the gut.

Also participating in gluconate catabolism are several gluconate transporters and two gluconate kinases which appear, based upon their regulation, to comprise two distinct systems (2, 13). The GntI (main) system consists of gntT, gntU, and gntK, which code for high- and low-affinity gluconate transporters and a thermoresistant gluconate kinase, respectively (2325, 33). Expression of the GntR regulon, that is, GntI together with the edd-eda operon, is negatively controlled by the gntR gene product. The GntII (subsidiary) system is comprised of a thermosensitive gluconate kinase and a gluconate transporter which function for gluconate catabolism in the absence of the GntI system (2, 11, 13, 22). It appears that the subsidiary gluconate transporter, which has an apparent Km for gluconate of 60 μM (23), is encoded by a gene (idnT) which is adjacent to the gene encoding the thermosensitive gluconokinase (idnK) at 96.8 min.

The DNA sequence of the GntII system genes, located at 4492 kb on the genome, was revealed by the E. coli Genome Project (5, 6).
If the GntII system had evolved as a subsidiary pathway for gluconate catabolism, one would expect it to contain only a gluconate transporter and gluconate kinase. However, in addition to the divergent idnK and idnT genes, this region also encodes two “dehydrogenase-like” enzymes. The similarity of idnO to gno of Gluconobacter oxydans, which encodes d-gluconate:NADP 5-oxidoreductase (GNO) (15), led to the testing of ketogluconates as enzyme substrates for the two newly identified dehydrogenases. A process of deductive reasoning and biochemical experiments led to the conclusion that the GntII system in fact comprises a novel metabolic pathway for catabolism of l-idonic acid, in which gluconate is a key intermediate. Accordingly, the genes involved in l-idonate metabolism have been given the designation idn (see Table 1 for gene nomenclature).

TABLE 1

Gene and enzyme nomenclature^a
Previous gene designation | New gene designation (accession no.) | Gene product | % Identity of protein^b
gntV | idnK (P39208) | d-Gluconate kinase | 45 (GntK^c)
yjgV | idnD (P39346) | l-Idonate 5-dehydrogenase | 30.6 (sheep DHSO^d)
yjgU | idnO (P39345) | 5-Keto-d-gluconate 5-reductase | 56 (GNO^e)
gntW | idnT (P39344) | l-Idonate transporter | 61 (GntT^f)
yjgS | idnR (P39343) | l-Idonate regulator | 46 (GntR^g)
^a All accession numbers are Swiss-Prot database accession numbers.
^b Percent identity of the amino acid sequence of the Idn protein to that of the protein shown in parentheses.
^c E. coli gluconate kinase encoded by gntK (P46859).
^d Sheep sorbitol dehydrogenase encoded by sorD (P07846).
^e G. oxydans gluconate:NADP 5-oxidoreductase encoded by gno (P50199).
^f E. coli gluconate transporter encoded by gntT (P39835).
^g E. coli gluconate regulator encoded by gntR (P46860).

(Part of this work has been presented previously [3].)
