Similar articles
1.
2.
β-Carotene biochemistry is a fundamental process in mammalian biology. Aberrations, whether through malnutrition or potentially through genetic variation, may lead to vitamin A deficiency, which is a substantial public health burden. In addition, understanding the genetic regulation of this process may enable bovine improvement. While many bovine QTL have been reported, few of the causative genes and mutations have been identified. We discovered a QTL for milk β-carotene and subsequently identified a premature stop codon in bovine β-carotene oxygenase 2 (BCO2), which also affects serum β-carotene content. The BCO2 enzyme is thereby identified as a key regulator of β-carotene metabolism.

The metabolism of β-carotene to form vitamin A is nutritionally important, and vitamin A deficiency remains a significant public health burden. Genetic variation may underlie individual differences in β-carotene metabolism and contribute to the etiology of vitamin A deficiency. Within an agricultural species, genetic variation provides opportunity for production improvements, disease resistance, and product specialization options. We have previously shown that natural genetic variation can be successfully used to inform bovine breeding decisions (Grisart et al. 2002; Blott et al. 2003). Despite numerous reports of quantitative trait loci (QTL), few causative mutations have been identified. We discovered a QTL for milk β-carotene content and report here the identification of a mutation in the bovine β-carotene oxygenase 2 (BCO2) gene responsible for this QTL. The mutation, which results in a premature stop codon, supports a key role for BCO2 in β-carotene metabolism.

The QTL trial consisted of a Holstein-Friesian × Jersey cross in an F2 design and a half-sibling family structure (Spelman et al. 2001). Six F1 sires and 850 F2 female progeny formed the trial herd.
To construct the genetic map, the pedigree (including the F1 sires, F1 dams, F2 daughters, and selected F0 grandsires: n = 1679) was genotyped, initially with 237 microsatellite markers and subsequently with 6634 SNP markers (Affymetrix Bovine 10K SNP GeneChip). A wide range of phenotypic measures relating to growth and development, health and disease, milk composition, fertility, and metabolism were scored on the F2 animals from birth to 6 years of age.

To facilitate the discovery of QTL and genes regulating β-carotene metabolism, milk concentration of β-carotene was measured during week 6 of the animals' second lactation (n = 651). Using regression methodology in a half-sib model (Haley et al. 1994; Baret et al. 1998), a QTL on bovine chromosome 15 (P < 0.0001; Figure 1A) was discovered. The β-carotene QTL effect on chromosome 15 was also significant (P < 0.0001) at two additional time points, in months 4 and 7 of lactation. Three of the six F1 sire families segregated for the QTL, suggesting that these three F1 sires would be heterozygous for the QTL allele (“Q”). To further define the most likely region within the QTL that would harbor the causative mutation, we undertook association mapping, using the 225 SNP markers that formed the chromosome 15 genetic map (Figure 1A). One SNP (“PAR351319”) was more closely associated with the β-carotene phenotype than any other marker (P = 2.522E−18). This SNP was located beneath the QTL peak. Further, the SNP was heterozygous in the three F1 sires that segregated for the QTL, and homozygous in the remaining three sires. On this basis, we hypothesized that the milk β-carotene phenotype would differ between animals on the basis of the genotype of SNP PAR351319.

Figure 1. Discovery of BCO2 mutation affecting milk β-carotene concentration. (A) The β-carotene QTL on bovine chromosome 15 (P < 0.0001) is shown by the red line. The maximum F-value at 21 cM was 7.15.
The 95% confidence interval is shown by the shaded box. The association of each marker with milk β-carotene is shown by the blue dots, and the association of the BCO2 genotype is shown by the green diamond. A total of 233 informative markers (8 microsatellite markers and 225 single nucleotide polymorphisms) were included on the genetic map for BTA15. QTL detection was conducted using regression methodology in a line of descent model (Haley et al. 1994) and a half-sib model (Baret et al. 1998). Threshold levels were determined at the chromosomewide level using permutation testing (Churchill and Doerge 1998) and confidence intervals estimated using bootstrapping (Visscher et al. 1996). (B) The haplotypes of 10 representative animals for “QQ” and “qq” are shown for the SNP markers encompassing the SNP (“PAR351319”) most closely associated with the milk β-carotene phenotype. Light and dark gray boxes represent homozygous SNPs, while white boxes represent heterozygous SNPs. The genes present within the defined region are also shown. (C) The mutation in the bovine BCO2 gene is shown. The structure of the BCO2 gene is indicated by the horizontal bar, with vertical bars representing exons 1–12. The A > G mutation in exon 3 (red) causes a premature termination codon at amino acid position 80. (D) The mean concentration of β-carotene in the milk fat of “QQ,” “Qq,” and “qq” cows is shown. β-Carotene was measured by absorbance at 450 nm as previously described (Winkelman et al. 1999). Data are means ± SEM. The statistical significance was determined using ANOVA (***P < 0.0001; n = 651).

We then made the following assumptions: that the effect of the QTL was additive, that the Q allele was present in the dam population, allowing the occurrence of homozygous (“QQ”) offspring, and that the QTL was caused by a single mutation, acting with a dominant effect on the milk β-carotene phenotype. Haplotypes encompassing the PAR351319 SNP were determined in the F2 offspring.
A comparison of the phenotypic effect of homozygous Q, heterozygous, and homozygous q individuals revealed that animals with the “QQ” genotype did indeed have a higher concentration of milk β-carotene than animals with the “qq” genotype (Figure 1D). We predicted that the region of homozygosity was likely to contain the causative gene and mutation. The extent of this region and the candidate genes contained within it are shown in Figure 1B. A total of 10 genes with known function, including BCO2, were located within the region. This information, combined with knowledge of the role BCO2 plays in β-carotene metabolism in other species (Kiefer et al. 2001), made BCO2 a good positional candidate for the QTL. We therefore sequenced the entire coding region (12 exons, NC_007313.3) of the BCO2 gene in each of the six F1 sires. An A > G mutation, which was heterozygous in the three F1 sires that segregated for the QTL, was discovered in exon 3, 240 bp from the translation initiation site (Figure 1C). The three remaining sires were homozygous for the G allele, which encodes the 530-amino-acid BCO2 protein (NP_001101987). The A allele creates a premature stop codon resulting in a truncated protein of 79 amino acids. To determine whether this mutation was associated with the QTL, the remainder of the pedigree was genotyped. The BCO2 genotype was significantly associated with the milk β-carotene phenotype (P = 8.195E−29). The AA genotype (referred to as BCO2−/−) was present in 3.4% (n = 28) of the F2 population. The AG and GG genotypes (subsequently referred to as BCO2−/+ and BCO2+/+, respectively) were present in 32.8% (n = 269) and 63.8% (n = 523), respectively, of the F2 population.

The effect of the premature stop codon on milk β-carotene content was striking. BCO2−/− cows produced milk with 78 and 55% more β-carotene than homozygous (GG) and heterozygous (AG) wild-type animals, respectively (P < 0.0001; Figure 2A).
Consequently, the yellow color of the milk fat varied greatly (Figure 2B). The genotype effect on milk β-carotene content was similar at the other two time points measured during lactation (78 and 68% more β-carotene in milk from BCO2−/− cows compared to BCO2+/+ cows; data not shown).

Figure 2. Effect of BCO2 genotype on milk β-carotene content. (A) The mean concentration of β-carotene in the milk fat of BCO2−/−, BCO2−/+, and BCO2+/+ cows is shown. β-Carotene was measured by absorbance at 450 nm as previously described (Winkelman et al. 1999). Data are means ± SEM. The statistical significance was determined using ANOVA (***P < 0.0001; n = 651). (B) The effect of the BCO2 genotype on milk fat color is illustrated.

No adverse developmental or health effects as a result of the A allele were observed at any stage throughout the lifespan of the animals. The BCO2−/− cows were fertile and milk yield was normal throughout lactation. Interestingly, quantitative real-time PCR showed fourfold lower levels of the BCO2 mRNA in liver tissue from BCO2−/− cows (data not shown).

β-Carotene and vitamin A (retinol) concentrations were also measured in serum, liver, and adipose tissue samples, and vitamin A concentration was measured in milk samples from 14 F2 cows of each genotype. Serum β-carotene concentration was higher in BCO2−/− cows compared to the heterozygous and homozygous wild-type cows (P = 0.003; Figure 3A). Thus, the effect of the mutation on β-carotene concentration was similar for both milk and serum, showing that this effect was not confined to the mammary gland. Vitamin A concentration was higher in serum from BCO2−/− cows (P = 0.001; Figure 3B); however, the concentration did not differ in milk (13.1 μg/g fat vs. 14.1 μg/g fat for BCO2−/− and BCO2+/+ cows, respectively; P > 0.1).
Liver β-carotene concentration did not differ between genotype groups (Figure 3C), but liver vitamin A was lower in BCO2−/− cows compared to BCO2+/+ cows (P < 0.03; Figure 3D). β-Carotene and vitamin A concentration did not differ between the genotype groups in adipose tissue (data not shown), suggesting tissue-specific effects of the BCO2 enzyme.

Figure 3. Effect of the BCO2 genotypes on concentration of β-carotene (A and C), and retinol (B and D), in serum (A and B), and liver (C and D). Subcutaneous adipose tissue biopsies (∼500 mg tissue), liver biopsies (∼100 mg tissue), and serum samples (10 ml) were taken from a subset of 42 cows (14 animals each of the BCO2−/−, BCO2−/+, and BCO2+/+ genotypes). β-Carotene and retinol measurements were determined using HPLC with commercial standards, on the basis of a published method (Hulshof et al. 2006). Data shown are means ± SEM. Significant differences are indicated by asterisks (*P < 0.05; **P < 0.01; ANOVA, n = 14 per genotype).

While previous studies have shown a key role for β-carotene 15,15′-monooxygenase (BCMO1) in catalyzing the symmetrical cleavage of β-carotene to vitamin A (von Lintig and Vogt 2000; von Lintig et al. 2001; Hessel et al. 2007), similar evidence for the role of the BCO2 enzyme in β-carotene metabolism is lacking. The physiological relevance of BCO2 has therefore been a topic of debate (Wolf 1995; Lakshman 2004; Wyss 2004). BCO2 mRNA and protein have been detected in several human tissues (Lindqvist et al. 2005), and the in vitro cleavage of β-carotene to vitamin A has been demonstrated (Kiefer et al. 2001; Hu et al. 2006). Our results provide in vivo evidence for BCO2-mediated conversion of β-carotene to vitamin A. BCO2−/− cows had more β-carotene in serum and milk and less vitamin A in liver, the main storage site for this vitamin.

Our results show that a simple genetic test will allow the selection of cows for milk β-carotene content.
Thus, milk fat color may be increased or decreased for specific industrial applications. Market preference for milk fat color varies across the world. Further, β-carotene enriched dairy foods may assuage vitamin A deficiency. Milk may be an ideal food for delivery of β-carotene, which is fat soluble and most efficiently absorbed in the presence of a fat component (Ribaya-Mercado 2002).

In conclusion, we have discovered a naturally occurring premature stop codon in the bovine BCO2 gene strongly suggesting a key role of BCO2 in β-carotene metabolism. This discovery has industrial applications in the selection of cows producing milks with β-carotene content optimized for specific dairy products or to address a widespread dietary deficiency. More speculatively, it would be interesting to investigate possible effects of BCO2 variation in humans on the etiology of vitamin A deficiency.
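The genotype percentages reported for the F2 population follow directly from the animal counts. The short Python sketch below reproduces that arithmetic and, as an illustrative extra that is not reported in the source, derives the frequency of the A (stop-codon) allele. The function and variable names are hypothetical.

```python
# Genotype counts for the BCO2 exon 3 variant in the F2 population,
# as reported in the text (AA = BCO2-/-, AG = BCO2-/+, GG = BCO2+/+).
GENOTYPE_COUNTS = {"AA": 28, "AG": 269, "GG": 523}


def genotype_percentages(counts):
    """Return each genotype's share of the population, in percent."""
    total = sum(counts.values())
    return {g: 100.0 * n / total for g, n in counts.items()}


def allele_frequency(counts, allele):
    """Frequency of one allele, counting two alleles per diploid animal."""
    total_alleles = 2 * sum(counts.values())
    copies = sum(g.count(allele) * n for g, n in counts.items())
    return copies / total_alleles


if __name__ == "__main__":
    for genotype, pct in genotype_percentages(GENOTYPE_COUNTS).items():
        print(f"{genotype}: {pct:.1f}%")
    # The A-allele frequency is derived here; it is not stated in the source.
    print(f"A allele frequency: {allele_frequency(GENOTYPE_COUNTS, 'A'):.3f}")
```

Rounded to one decimal place, the computed shares agree with the 3.4%, 32.8%, and 63.8% quoted in the text.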

3.
4.
We consider neutral evolution of a large population subject to changes in its population size. For a population with a time-variable carrying capacity we study the distribution of the total branch lengths of its sample genealogies. Within the coalescent approximation we have obtained a general expression (Equation 20) for the moments of this distribution with a given arbitrary dependence of the population size on time. We investigate how the frequency of population-size variations alters the total branch length.

Models for gene genealogies of biological populations often assume a constant, time-independent population size N. This is the case for the Wright–Fisher model (Fisher 1930; Wright 1931), for the Moran model (Moran 1958), and for their representation in terms of the coalescent (Kingman 1982). In real biological populations, by contrast, the population size changes over time. Such fluctuations may be due to catastrophic events (bottlenecks) and subsequent population expansions or just reflect the randomness in the factors determining the population dynamics. Many authors have argued that genetic variation in a population subject to size fluctuations may nevertheless be described by the Wright–Fisher model, if one replaces the constant population size in this model by an effective population size of the form

$$N_{\mathrm{eff}} = \left(\frac{1}{L}\sum_{l=1}^{L}\frac{1}{N_l}\right)^{-1}, \tag{1}$$

where Nl stands for the population size in generation l and the average runs over L generations. The harmonic average in Equation 1 is argued to capture the significant effect of catastrophic events on patterns of genetic variation in a population: if, for example, a population went through a recent bottleneck, a large fraction of individuals in a given sample would originate from few parents. This in turn would lead to significantly reduced genetic variation, parameterized by a small value of Neff. (See, e.g., Ewens 1982 for a review of different measures of the effective population size and Sjödin et al.
2005 and Wakeley and Sargsyan 2009 for recent developments of this concept.)

The concept of an effective population size has been frequently used in the literature, implicitly assuming that the distribution of neutral mutations in a large population of fluctuating size is identical to the distribution in a Wright–Fisher model with the corresponding constant effective population size given by Equation 1. However, recently it was shown that this is true only under certain circumstances (Kaj and Krone 2003; Nordborg and Krone 2003; Jagers and Sagitov 2004). It is argued by Sjödin et al. (2005) that the concept of an effective population size is appropriate when the timescale of fluctuations of Nl is either much smaller or much larger than the typical time between coalescent events in the sample genealogy. In these limits it can be proved that the distribution of the sample genealogies is exactly given by that of the coalescent with a constant, effective population size.

More importantly, it follows from these results that, in populations with variable size, the coalescent with a constant effective population size is not always a valid approximation for the sample genealogies. Deviations between the predictions of the standard coalescent model and empirical data are frequently observed, and there are a number of different statistical tests quantifying the corresponding discrepancies (see, for example, Tajima 1989, Fu and Li 1993, and Zeng et al. 2006). The analysis of such deviations is of crucial importance in understanding, for example, human genetic history (Garrigan and Hammer 2006).
But while there is a substantial amount of work numerically quantifying deviations, often in terms of a single number, little is known about their qualitative origins and their effect upon summary statistics in the population in question.

The question is thus to understand the effect of population-size fluctuations on the patterns of genetic variation, in particular for the case where the scale of the population-size fluctuations is comparable to the time between coalescent events in the ancestral tree. As is well known, many empirical measures of genetic variation can be computed from the total branch length of the sample genealogy (the expected number of single-nucleotide polymorphisms, for example, is proportional to the average total branch length).

The aim of this article is to analyze the distribution of the scaled total branch length Tn for a sample genealogy in a population of fluctuating size, as illustrated in Figure 1. For the genealogy of n ≥ 2 lineages sampled at the present time, the expression ⌊NTn⌋ gives the total branch length in terms of generations. Here ⌊Nt⌋ is the largest integer ≤ Nt, and the scaling factor N is a suitable measure of the number of genes in the population and serves as a counterpart of the constant generation size of the standard Wright–Fisher model.

Figure 1. The effect of population-size oscillations on the genealogy of a sample of size n = 17 (schematic). Left, genealogy described by Kingman's coalescent for a large population of constant size, illustrated by the light blue rectangle; right, sinusoidally varying population size. Coalescence is accelerated in regions of small population sizes and vice versa.
This significantly alters the tree and gives rise to changes in the distribution of the number of mutations and of the population homozygosity.

A motivating example is given in Figure 2, which shows numerically computed distributions ρ(Tn) of the total branch lengths Tn for a particular population model with a time-dependent carrying capacity. The model is described briefly in the Figure 2 legend and in detail in the section “A model for a population with time-dependent carrying capacity.” As Figure 2 shows, the distributions depend in a complex manner on the form of the size changes. We observe that when the frequency of the population-size fluctuations is very small (Figure 2a), the distribution is well described by the standard coalescent result

$$\rho(T_n = t) = \frac{n-1}{2}\, e^{-t/2}\left(1 - e^{-t/2}\right)^{n-2} \tag{2}$$

(Hein et al. 2005). When the frequency is very large (Figure 2e), Equation 2 also applies, but with a different time scaling reflecting an effective population size: t on the right-hand side (rhs) in Equation 2 is replaced by t/c with c = N/Neff. Apart from these special limits, however, the form of the distributions appears to depend in a complicated manner upon the frequency of the population-size variation. The observed behavior is caused by the fact that coalescence proceeds faster for smaller population sizes and more slowly for larger population sizes, as illustrated in Figure 1. But the question is how to quantitatively account for the changes shown in Figure 2.

Figure 2. Numerically computed distributions of the scaled total branch lengths Tn in genealogies of samples of size n = 10. The model employed in the simulations is outlined in the section “A model for a population with time-dependent carrying capacity.” It describes a population subject to a time-varying carrying capacity, Kl = K0(1 + ɛ sin(2πνl)). The frequency of the time changes is determined by ν, and l = 1, 2, 3, … labels discrete generations forward in time.
The parameter N = K0 describes the typical population size, which is taken here to be equal to the time-averaged carrying capacity. Panels a–e show the distributions for populations with increasingly rapidly oscillating carrying capacity. The dashed red line in a shows that in the limit of low frequencies the standard coalescent result, Equation 2, is obtained. The dashed red line in e shows that also in the limit of large frequencies the standard coalescent result is obtained, but now with an effective population size. The dashed red line in d is a two-parameter distribution, Equation 41, derived in the section “Comparison between numerical simulations and coalescent predictions.” Further numerical and analytical results on the frequency dependence of the moments of these distributions are shown in Figure 4. Parameter values used: K0 = 10,000, ɛ = 0.9, and r = 1 (see the section “A model for a population with time-dependent carrying capacity” for the exact meaning of the intrinsic growth rate r) and (a) νN = 0.001, (b) νN = 0.1, (c) νN = 0.316, (d) νN = 1, and (e) νN = 100.

We show in this article that the results of the simulations displayed in Figure 2 are explained by a general expression (Equation 20) for the moments of the distributions shown in Figure 2. Our general result is obtained within the coalescent approximation valid in the limit of large population size. But we find that in most cases, the coalescent approximation works very well down to small population sizes (a few hundred individuals). Our result enables us to understand and quantitatively describe how the distributions shown in Figure 2 depend upon the frequency of the population-size oscillations. It makes it possible to determine, for example, how the variance, skewness, and kurtosis of these distributions depend upon the frequency of demographic fluctuations.
This in turn allows us to compute the population homozygosity and to characterize genetic variation in populations with size fluctuations.

The remainder of this article is organized as follows. The next section summarizes our analytical results for the moments of the total branch length. Following that, we describe the model employed in the computer simulations. Then, corresponding numerical results are compared to the analytical predictions. And finally, we summarize how population-size fluctuations influence the distribution of total branch lengths and conclude with an outlook.
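The harmonic average behind the effective population size can be made concrete with a short numerical sketch. The Python below assumes, purely for illustration, that the population size in each generation equals the sinusoidal carrying capacity Kl = K0(1 + ɛ sin(2πνl)) from the Figure 2 legend (the simulated model also involves a growth rate r, which is ignored here) and that the average is taken over a whole number of oscillation periods. The oscillation frequency and all function names are hypothetical choices.

```python
import math


def carrying_capacity(l, k0, eps, nu):
    """Sinusoidal carrying capacity K_l = K0 * (1 + eps * sin(2*pi*nu*l))."""
    return k0 * (1.0 + eps * math.sin(2.0 * math.pi * nu * l))


def effective_population_size(sizes):
    """Harmonic mean of the per-generation population sizes."""
    sizes = list(sizes)
    return len(sizes) / sum(1.0 / n for n in sizes)


if __name__ == "__main__":
    k0, eps = 10_000, 0.9          # values quoted in the Figure 2 legend
    nu = 0.01                      # assumed: one oscillation per 100 generations
    generations = range(1, 1001)   # ten full periods

    # Assumption: the population size tracks the carrying capacity exactly.
    sizes = [carrying_capacity(l, k0, eps, nu) for l in generations]
    n_eff = effective_population_size(sizes)

    # For a sinusoid averaged over full periods the harmonic mean has the
    # closed form K0 * sqrt(1 - eps**2), which serves as a cross-check.
    print(f"N_eff (numerical) = {n_eff:.1f}")
    print(f"N_eff (analytic)  = {k0 * math.sqrt(1 - eps**2):.1f}")
    print(f"c = N / N_eff     = {k0 / n_eff:.3f}")
```

Under these assumptions the effective size comes out well below half of K0, because generations with small populations dominate a harmonic mean.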

5.
The DNA-packaging specificities of phages λ and 21 depend on the specific DNA interactions of the small terminase subunits, which have support helix-turn-recognition helix-wing DNA-binding motifs. λ-Terminase with the recognition helix of 21 preferentially packages 21 DNA. This chimeric terminase's ability to package λDNA is reduced ∼20-fold. Phage λ with the chimeric terminase is unable to form plaques, but pseudorevertants are readily obtained. Some pseudorevertants have trans-acting suppressors that change codons of the recognition helix. Some of these codons appear to remove an unfavorable base-pair contact; others appear to create a novel nonspecific DNA contact. Helper-packaging experiments show that these mutant terminases have lost the ability to discriminate between λ and 21 during DNA packaging. Two cis-acting suppressors affect cosB, the small subunit's DNA-binding site. Each changes a cosBλ-specific base pair to a cosB21-specific base pair. These cosB suppressors cause enhanced DNA packaging by 21-specific terminase and reduce packaging by λ-terminase. Both the cognate support helix and turn are required for strong packaging discrimination. The wing does not contribute to cosB specificity. Evolution of packaging specificity is discussed, including a model in which λ- and 21-packaging specificities diverged from a common ancestor phage with broad packaging specificity.

Viruses must package viral chromosomes from nucleic acid pools that include host-cell nucleic acids, so specific recognition of the viral nucleic acid is essential during virion assembly. For large DNA viruses, including the tailed double-strand DNA (dsDNA) bacteriophages, the herpesviruses, and the adenoviruses, DNA-packaging proteins recognize specific sequences on the viral chromosomes (reviewed in Baines and Weller 2005 and Ostapchuk and Hearing 2005, respectively).
For the dsDNA viruses that produce virion chromosomes by processing concatemeric DNA, a viral terminase enzyme functions in the recognition and cutting of concatemeric DNA and subsequently sponsors DNA translocation. λ-Terminase is a heterooligomer of large and small subunits, gpA and gpNu1, respectively. Cutting of concatemeric DNA is carried out by gpA's endonuclease activity (Becker and Gold 1978; Davidson and Gold 1992; Hwang and Feiss 1996). Three DNA subsites, cosQ, cosN, and cosB, are contained in the ∼200-bp-long cos site and orchestrate DNA packaging through interactions with terminase (Figure 1A; reviewed in Feiss and Catalano 2005). gpA introduces staggered nicks in cosN to generate the 12-bp cohesive ends of mature λDNA molecules. Efficient and accurate nicking of cosN requires anchoring of gpA by gpNu1, which binds to the adjacent cosB subsite (Higgins and Becker 1994b; Hang et al. 2001).

Figure 1. The cos and terminase region of the λ-chromosome. (A) (Top) Map of cos and the terminase-encoding Nu1 and A genes. The black bar indicates the location of the winged helix-turn-helix DNA-binding motifs in the N-terminal domain of gpNu1. (Bottom) cos subsites: cosQ is required for termination of DNA packaging; cosN is the site where the large terminase subunit, gpA, introduces staggered nicks to generate the cohesive ends of virion DNA molecules; and cosB contains the gpNu1-binding sites R1, R2, and R3 along with the IHF-binding site I1. (B) (Top) Schematic of gpNu1 residues 1–42, including the support (blue) and recognition (red) α-helixes and the wing loop (magenta). β1 and β2 are short β-strands flanking the DNA-binding elements. (Bottom) Sequences are a comparison of residues of λ's gpNu1 and phage 21's gp1, with conserved residues indicated by vertical lines. Note that the recognition helixes of gpNu1 and gp1 differ by four residues, all likely solvent-exposed (Becker and Murialdo 1990; de Beer et al. 2002).
(C) Three-dimensional structure of the winged helix-turn-helix-containing, N-terminal domain of gpNu1 (residues 1–68) (de Beer et al. 2002). Side groups of solvent-exposed residues of the recognition helix are displayed. Color coded as in B.

λ's cosB (cosBλ) is a complex subsite containing three copies of a gpNu1-binding sequence, the R sequence, plus a site, I1, for the integration host factor (IHF), the Escherichia coli DNA-bending protein. The order of sites is cosN–R3–I1–R2–R1. The amino-terminal half of gpNu1 contains a winged helix-turn-helix DNA-binding motif (Figure 1, B and C; Gajiwala and Burley 2000) that interacts with the R sequences. Further, the amino-terminal domain of gpNu1 is a tight dimer (Figure 1C, de Beer et al. 2002). The IHF-induced bend at I1 creates a DNA hairpin in cosB that positions the major grooves of R3 and R2 to face inward, so that the helix-turn-helix motifs of dimeric gpNu1 can be docked into them. The wing loops are positioned to make minor groove contacts with R3 and R2. Thus it is proposed that gpA is positioned to nick cosN by assembly of a bent structure with dimeric gpNu1 bound to R3 and R2 (Becker and Murialdo 1990; de Beer et al. 2002). A variety of studies indicate that the positioning of gpNu1 at R3 is crucial and that the other interactions function to create and/or stabilize the R3–gpNu1 interaction (Cue and Feiss 1993a; Higgins and Becker 1994a; Hang et al. 2001).

DNA packaging initiates when terminase binds and nicks a cos. Following cosN nicking and separation of the cohesive ends, terminase remains bound to the cosB-containing chromosome end (Becker et al. 1977; Yang et al. 1997). The DNA-bound terminase docks on the portal vertex of a prohead, the empty, immature virion head shell. Assembly of the ternary prohead–terminase–DNA complex activates gpA's potent translocation ATPase, and the viral DNA is translocated into the prohead (Yang and Catalano 2003; Dhar and Feiss 2005).
Translocation brings the next cos along the concatemer to the portal-docked terminase (Feiss and Widner 1982). The downstream cos is cleaved by terminase, completing packaging of the chromosome. Recognition of the downstream cos requires cosQ and cosN (Cue and Feiss 2001). Following DNA packaging, terminase undocks from the filled head. Attachment of a tail to the DNA-filled head completes virion assembly. The undocked terminase remains bound to and sponsors the packaging of the next chromosome along the concatemer.

The interactions between the recognition helix of gpNu1 and an R sequence are typical for helix-turn-helix proteins, as shown by genetic studies of chimeras between λ and its relative, phage 21, as follows: λ and 21 have similarly organized cos sites; the cosB of 21 also has the R3–I1–R2–R1 structure. Nevertheless, the two phages have distinct packaging specificities. Base-pair differences in the R sequences account for packaging specificity (Becker and Murialdo 1990; Smith and Feiss 1993). cosN and cosQ are interchangeable between λ and 21 (Feiss et al. 1981). The consensus R sequences are 5′-CGTTTCCtTTCT-3′ for cosBλ and 5′-CaTGTCGGncCT-3′ for cosB21, where capitalized residues are conserved in all three R sequences of both phages; underlined and capitalized are two residues conserved in all three R sequences of both phages, but which differ between cosBλ and cosB21 (Becker and Murialdo 1990). These two conserved but phage-specific base pairs are likely to be of major importance for specificity. Similarly, the recognition helixes of the helix-turn-helix motifs of the small subunits of λ (gpNu1) and 21 (gp1) terminases differ in four amino acid residues that account for packaging specificity (Figure 1; Becker and Murialdo 1990).

In earlier work (de Beer et al.
2002), we showed that modifying λ-terminase by replacing the gpNu1 recognition helix with that of 21's gp1 created a terminase (gpNu1hy1 terminase) that was specific for the cosB of phage 21 (designated cosB21). That is, λ cosB21 Nu1hy1 was viable, but λ cosBλ Nu1hy1 was inviable due to the specificity mismatch between cosBλ and the cosB21-specific recognition helix of the chimeric small terminase subunit, gpNu1hy1. The Nu1hy1 terminase packages cosB21 chromosomes ∼10-fold more efficiently than it does cosBλ chromosomes. This 10-fold discrimination between cosB21 and cosBλ chromosomes is much weaker than the >10⁴-fold discrimination shown by wild-type λ and 21 terminases (de Beer et al. 2002). Because of the modest discrimination of Nu1hy1 terminase, the yield of λ cosBλ Nu1hy1 is only slightly below the yield required for plaque formation. Lysates of λ cosBλ Nu1hy1 contain plaque-forming pseudorevertants at a level expected for single mutations. A number of these pseudorevertants were sequenced and found to contain mutations in cosBλ or in the Nu1hy1 gene. Here we report on in vivo packaging studies on the effects of these Nu1hy1 and cosBλ suppressor mutations on packaging specificity.
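The two consensus R sequences quoted in the introduction can be compared position by position. Because the underlining that marked the two phage-specific residues did not survive in this copy, the Python sketch below simply lists positions where both consensus sequences carry a capitalized (conserved) base and the bases differ. Treating those positions as the phage-specific ones is an inference, not something stated in the source, and all names are hypothetical.

```python
# Consensus R sequences (5'->3') as quoted in the text; uppercase marks
# conserved residues, lowercase marks non-conserved ones ("n" = any base).
R_CONSENSUS_LAMBDA = "CGTTTCCtTTCT"
R_CONSENSUS_21 = "CaTGTCGGncCT"


def conserved_differences(seq_a, seq_b):
    """Return (position, base_a, base_b) where both bases are uppercase and differ.

    Positions are 1-based, counted from the 5' end.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("consensus sequences must have equal length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a.isupper() and b.isupper() and a != b
    ]


if __name__ == "__main__":
    for pos, lam, p21 in conserved_differences(R_CONSENSUS_LAMBDA, R_CONSENSUS_21):
        print(f"position {pos}: lambda {lam} vs 21 {p21}")
```

Exactly two positions meet this criterion, which is consistent with the two conserved but phage-specific base pairs described in the text.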

6.
A new β-glucosidase from a novel strain of Terrabacter ginsenosidimutans (Gsoil 3082T) obtained from the soil of a ginseng farm was characterized, and the gene, bgpA (1,947 bp), was cloned in Escherichia coli. The enzyme catalyzed the conversion of ginsenoside Rb1 {3-O-[β-d-glucopyranosyl-(1-2)-β-d-glucopyranosyl]-20-O-[β-d-glucopyranosyl-(1-6)-β-d-glucopyranosyl]-20(S)-protopanaxadiol} to the more pharmacologically active rare ginsenosides gypenoside XVII {3-O-β-d-glucopyranosyl-20-O-[β-d-glucopyranosyl-(1-6)-β-d-glucopyranosyl]-20(S)-protopanaxadiol}, gypenoside LXXV {20-O-[β-d-glucopyranosyl-(1-6)-β-d-glucopyranosyl]-20(S)-protopanaxadiol}, and C-K [20-O-(β-d-glucopyranosyl)-20(S)-protopanaxadiol]. A BLAST search of the bgpA sequence revealed significant homology to family 3 glycoside hydrolases. Expressed in E. coli, β-glucosidase had apparent Km values of 4.2 ± 0.8 and 0.14 ± 0.05 mM and Vmax values of 100.6 ± 17.1 and 329 ± 31 μmol·min⁻¹·mg of protein⁻¹ against p-nitrophenyl-β-d-glucopyranoside and Rb1, respectively. The enzyme catalyzed the hydrolysis of the two glucose moieties attached to the C-3 position of ginsenoside Rb1, and the outer glucose attached to the C-20 position at pH 7.0 and 37°C. These cleavages occurred in a defined order, with the outer glucose of C-3 cleaved first, followed by the inner glucose of C-3, and finally the outer glucose of C-20. These results indicated that BgpA selectively and sequentially converts ginsenoside Rb1 to the rare ginsenosides gypenoside XVII, gypenoside LXXV, and then C-K. Herein is the first report of the cloning and characterization of a novel ginsenoside-transforming β-glucosidase of the glycoside hydrolase family 3.

Ginseng refers to the roots of members of the plant genus Panax, which have been used as a traditional medicine in Asian countries for over 2,000 years due to their observed beneficial effects on human health.
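The apparent kinetic constants reported in the abstract can be compared through the Michaelis-Menten rate law, v = Vmax[S]/(Km + [S]). The Python sketch below assumes the enzyme follows this simple rate law for both substrates (implied by, but not spelled out in, the reported apparent Km and Vmax values) and uses the reported means without their error ranges. The 0.5 mM substrate concentration and all function names are hypothetical.

```python
# Reported apparent constants: Km in mM, Vmax in umol/min/mg of protein.
KINETICS = {
    "pNPG": {"km": 4.2, "vmax": 100.6},   # p-nitrophenyl-beta-D-glucopyranoside
    "Rb1": {"km": 0.14, "vmax": 329.0},   # ginsenoside Rb1
}


def michaelis_menten_rate(substrate_mm, km, vmax):
    """Initial rate v = Vmax * [S] / (Km + [S])."""
    return vmax * substrate_mm / (km + substrate_mm)


def specificity_ratio(vmax, km):
    """Vmax / Km, a rough measure of efficiency at low substrate levels."""
    return vmax / km


if __name__ == "__main__":
    for name, p in KINETICS.items():
        # 0.5 mM is an arbitrary example concentration, not a value from the source.
        v = michaelis_menten_rate(0.5, p["km"], p["vmax"])
        ratio = specificity_ratio(p["vmax"], p["km"])
        print(f"{name}: v(0.5 mM) = {v:.1f}, Vmax/Km = {ratio:.1f}")
```

On these figures Vmax/Km is roughly 100-fold higher for Rb1 than for the synthetic substrate, in line with the much lower apparent Km for Rb1.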
Ginseng saponins, also referred to as ginsenosides, are the major active components of ginseng (27). Various biological activities have been ascribed to ginseng saponins, including anti-inflammatory activity (43), antitumor effects (23, 39), and neuroprotective and immunoprotective (15, 31) effects.
Ginsenosides can be categorized as protopanaxadiol (PPD), protopanaxatriol, and oleanane saponins, based on the structure of the aglycon, with a dammarane skeleton (29). The PPD-type ginsenosides are further classified into subgroups based on the position and number of sugar moieties attached to the aglycon at positions C-3 and C-20. For example, one of the largest PPD-type ginsenosides, Rb1 {3-O-[β-d-glucopyranosyl-(1-2)-β-d-glucopyranosyl]-20-O-[β-d-glucopyranosyl-(1-6)-β-d-glucopyranosyl]-20(S)-protopanaxadiol}, contains 4 glucose moieties, two each attached via glycosidic linkages to the C-3 and C-20 positions of the aglycon (Fig. 1).
FIG. 1. Chemical structures of protopanaxadiol and protopanaxatriol ginsenosides (5). The ginsenosides represented here are all (S)-type ginsenosides. glc, β-d-glucopyranosyl; arap, α-l-arabinopyranosyl; araf, α-l-arabinofuranosyl; rha, α-l-rhamnopyranosyl; Gyp, gypenoside; C, compound.
Because of their size, low solubility, and poor permeability across the cell membrane, it is difficult for the human body to directly absorb large ginsenosides (44), although these components constitute the major portion of the total ginsenosides in raw ginseng (30). Moreover, the limited availability of the rare ginsenosides restricts research on their biological and medicinal properties. 
Therefore, transformation of these major ginsenosides into smaller deglycosylated ginsenosides, which are more effective in in vivo physiological action, is required (1, 37).
The production of large amounts of rare ginsenosides from the major ginsenosides can be accomplished through a number of physicochemical methods such as heating (17), acid treatment (2), and alkali treatment (48). However, these approaches produce nonspecific racemic mixtures of rare ginsenosides. As an alternative, enzymatic methods have been explored as a way to convert the major ginsenosides into more pharmacologically active rare ginsenosides in a more specific manner (14, 20).
To date, three types of glycoside hydrolases, β-d-glucosidase, α-l-arabinopyranosidase, and α-l-arabinofuranosidase, have been found to be involved in the biotransformation of PPD-type ginsenosides. For example, a β-glucosidase isolated from a fungus converts Rb1 to C-K [20-O-(β-d-glucopyranosyl)-20(S)-protopanaxadiol] (45), and an α-l-arabinopyranosidase and an α-l-arabinofuranosidase isolated from an intestinal bacterium hydrolyze, respectively, Rb2 {3-O-[β-d-glucopyranosyl-(1-2)-β-d-glucopyranosyl]-20-O-[α-l-arabinopyranosyl-(1-6)-β-d-glucopyranosyl]-20(S)-protopanaxadiol} to Rd {3-O-[β-d-glucopyranosyl-(1-2)-β-d-glucopyranosyl]-20-O-β-d-glucopyranosyl-20(S)-protopanaxadiol} and Rc {3-O-[β-d-glucopyranosyl-(1-2)-β-d-glucopyranosyl]-20-O-[α-l-arabinofuranosyl-(1-6)-β-d-glucopyranosyl]-20(S)-protopanaxadiol} to Rd (34). Two recombinant enzymes that convert major ginsenosides into rare ginsenosides have been cloned and expressed in Escherichia coli: Sulfolobus solfataricus β-glycosidase, which transforms Rb1 or Rc to C-K (28), and β-glucosidase from a soil metagenome, which transforms Rb1 to Rd (16). Both of these glycoside hydrolases are family 1 glycoside hydrolases.
Here, we report the cloning and expression in E. 
coli of a gene (bgpA) encoding a new ginsenoside-hydrolyzing β-glucosidase from a novel bacterial strain, Terrabacter ginsenosidimutans sp. nov. Gsoil 3082, isolated from a ginseng farm in Korea. BgpA is a family 3 glycoside hydrolase, and the recombinant enzyme employs a different enzymatic pathway from ginsenoside-hydrolyzing family 1 glycoside hydrolases. BgpA preferentially and sequentially hydrolyzed the terminal and inner glucoses at the C-3 position of ginsenoside Rb1 and then the outer glucose at the C-20 position. Thus, BgpA could be effective in the biotransformation of ginsenoside Rb1 to gypenoside (Gyp) XVII {3-O-β-d-glucopyranosyl-20-O-[β-d-glucopyranosyl-(1-6)-β-d-glucopyranosyl]-20(S)-protopanaxadiol}, Gyp LXXV {20-O-[β-d-glucopyranosyl-(1-6)-β-d-glucopyranosyl]-20(S)-protopanaxadiol}, and C-K.
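As a rough illustration of what the apparent kinetic constants in this abstract imply, the sketch below applies the standard Michaelis–Menten rate law to the reported Km and Vmax values. It assumes simple Michaelis–Menten behavior and ignores the quoted uncertainties; the function and variable names are ours, not the authors'.

```python
def mm_rate(vmax, km, s):
    """Michaelis-Menten initial rate: v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


def efficiency(params):
    """Vmax/Km, a simple proxy for catalytic efficiency at low substrate."""
    return params["vmax"] / params["km"]


# Apparent constants quoted in the abstract
# (Vmax in umol/min/mg protein, Km in mM); error terms are ignored here.
PNPG = {"vmax": 100.6, "km": 4.2}   # p-nitrophenyl-beta-D-glucopyranoside
RB1 = {"vmax": 329.0, "km": 0.14}   # ginsenoside Rb1

# How much more efficiently Rb1 is turned over than the synthetic substrate.
ratio = efficiency(RB1) / efficiency(PNPG)

if __name__ == "__main__":
    print(f"Vmax/Km ratio (Rb1 vs pNPG): {ratio:.1f}")
    # Rate at a hypothetical 0.5 mM substrate concentration for each substrate.
    print(mm_rate(RB1["vmax"], RB1["km"], 0.5))
    print(mm_rate(PNPG["vmax"], PNPG["km"], 0.5))
```

On these point estimates, the Vmax/Km ratio favors the natural substrate Rb1 by roughly two orders of magnitude, consistent with the abstract's description of BgpA as a ginsenoside-transforming enzyme.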

7.
Epigenetically inherited aggregates of the yeast prion [PSI+] cause genomewide readthrough translation that sometimes increases evolvability in certain harsh environments. The effects of natural selection on modifiers of [PSI+] appearance have been the subject of much debate. It seems likely that [PSI+] would be at least mildly deleterious in most environments, but this may be counteracted by its evolvability properties on rare occasions. Indirect selection on modifiers of [PSI+] is predicted to depend primarily on the spontaneous [PSI+] appearance rate, but this critical parameter has not previously been adequately measured. Here we measure this epimutation rate accurately and precisely as 5.8 × 10⁻⁷ per generation, using a fluctuation test. We also determine that genetic “mimics” of [PSI+] account for up to 80% of all phenotypes involving general nonsense suppression. Using previously developed mathematical models, we can now infer that even in the absence of opportunities for adaptation, modifiers of [PSI+] are only weakly deleterious relative to genetic drift. If we assume that the spontaneous [PSI+] appearance rate is at its evolutionary optimum, then opportunities for adaptation are inferred to be rare, such that the [PSI+] system is favored only very weakly overall. But when we account for the observed increase in the [PSI+] appearance rate in response to stress, we infer much higher overall selection in favor of [PSI+] modifiers, suggesting that [PSI+]-forming ability may be a consequence of selection for evolvability.
The yeast phenotype [PSI+] is characterized by prion aggregates of the protein Sup35. Cells are in either a [psi−] (normal) or [PSI+] state, depending on the absence or presence of the prion aggregates (Figure 1, a and b). Sup35 prion aggregates replicate in a similar fashion to mammalian prions but are cytoplasmic and, as such, the prion state is cytoplasmically inherited (Wickner et al. 1995).
Figure 1. Comparison between the three possible modes ([PSI+], genetic mimic, point mutation revertant) of the expression of 3′-UTR sequences in yeast. (a) The normal [psi−] phenotypic state; (b) the [PSI+] prion causes readthrough and low-level expression of 3′-UTRs across multiple genes, appearing at rate m_PSI; (c) a genetic mimic of [PSI+] such as the sal3-4 mutant of Sup35 (Eaglestone et al. 1999) appearing at rate m_mimic, not reversible by the application of guanidine hydrochloride; (d) a point mutation in a single stop codon at rate μ_point, leading to incorporation of formerly 3′-UTR into a single coding sequence. (e) [PSI+] can act as a “stop-gap” mechanism, buying a lineage more time to acquire one or more adaptive stop codon readthrough point mutations. When this genetic assimilation is complete, [PSI+] can revert to [psi−] (Masel and Bergman 2003; Griswold and Masel 2009).
When not part of an aggregate, Sup35 helps mediate translation termination in yeast (Stansfield et al. 1995b; Zhouravleva et al. 1995). Sup35 molecules that are incorporated into nonfunctional prion aggregates are presumably not available for translation termination, which can lead to the translation of stop codons by near-cognate tRNAs (Figure 1b) (Tuite and Mclaughlin 1982; Pure et al. 1985; Lin et al. 1986). This partial loss of Sup35 function leads to an increased frequency of readthrough translation of 3′-untranslated regions (3′-UTR) across all genes (Figure 1b). This increase is modest in wild-type yeast, from an average readthrough rate of 0.3% in [psi−] cells up to 1% in [PSI+] cells (Firoozan et al. 1991). Some [PSI+] yeast strains grow faster than [psi−] controls in certain harsh environments, suggesting that readthrough translation of some 3′-UTRs may be adaptive in certain conditions (True and Lindquist 2000; Joseph and Kirkpatrick 2008). This directly shows that [PSI+]-mediated capacitance may increase evolvability in the laboratory. 
[PSI+]-mediated phenotypes have a complex genetic basis, involving multiple loci (True et al. 2004).As an epigenetically inherited protein aggregate, [PSI+] can easily be lost after some generations (Cox et al. 1980). This returns the lineage to its normal [psi−] state and restores translation fidelity. If a subset of revealed phenotypic variation is adaptive, it may have lost its dependence on [PSI+] by this time (True et al. 2004). This process of genetic assimilation may, for example, involve one or more point mutations in stop codons, increasing readthrough up to 100% (Figure 1e) (Griswold and Masel 2009). This leaves the yeast with a new adaptive trait and with no permanent load of other, deleterious variation.In general, stop codons can be lost either directly through point mutations or indirectly through upstream indels. This leads to novel coding sequence coming from in-frame and out-of-frame 3′-UTRs, respectively. [PSI+] is expected to facilitate only the former, while mutation bias favors the latter. Yeasts show a much higher ratio of in-frame to out-of-frame 3′-UTR incorporation events than mammals do (Giacomelli et al. 2007), confirming a role for [PSI+] in capacitance-mediated evolvability in natural populations.The adaptive evolution both of evolvability in general (Sniegowski and Murphy 2006; Lynch 2007; Pigliucci 2008) and of capacitance in particular (Dickinson and Seger 1999; Wagner et al. 1999; Partridge and Barton 2000; Brookfield 2001; Pal 2001; Meiklejohn and Hartl 2002; Ruden et al. 2003) is highly controversial. In general, any costs of evolvability are borne in the present, while the benefits lie in the future, making it difficult for natural selection to favor an evolvability allele. For example, mutation rates seem to be set according to a trade-off between metabolic cost (favoring higher mutation rates) and the avoidance of deleterious effects (favoring lower mutation rates) (Sniegowski et al. 2000). 
The fact that mutation creates variation, the ultimate source of evolvability, is merely a fortuitous consequence of the metabolic cost of fidelity.Previous theoretical population genetic studies have, however, suggested that modifier alleles promoting the formation of [PSI+] might, unlike mutator alleles, be favored for their evolvability properties (King and Masel 2007; Masel et al. 2007; Griswold and Masel 2009; Masel and Griswold 2009). These models depend, however, on a number of parameter estimates. In particular, a number of predictions depend on the spontaneous rate of [PSI+] formation (Masel and Griswold 2009).

[PSI+] appearance rates and the fluctuation test:

The most widely cited spontaneous appearance rate for [PSI+] is m_PSI ∼ 10⁻⁷–10⁻⁵, on the basis of experiments by Lund and Cox (1981). This estimate was calculated as the proportion of colonies scored as [PSI+] after growth over multiple generations from a single founding [psi−] clone. If [PSI+] happens to appear in the first generation of growth, this leads to a “jackpot” event with only one switching event, but many [PSI+] colonies. The proportion of colonies scored as [PSI+] therefore yields a systematic overestimation of the [PSI+] appearance rate.
Various implementations of the fluctuation test (Luria and Delbrück 1943) can address such effects. The mutation rate experiment is replicated many times using independent populations, and a Luria–Delbrück distribution is fitted to the results across all replicates. In a simulation study, Stewart (1994) examined a number of estimators of the underlying Luria–Delbrück distribution and found that the maximum-likelihood estimator performed the best.
Originally developed to study mutation rates, the fluctuation test can also be used for estimating epimutation rates. Fluctuation tests have been used to estimate the rate of gene silencing in Chinese hamster ovary cells (Holliday and Ho 1998) and in the yeast Schizosaccharomyces pombe (Singh and Klar 2002). However, fluctuation tests do not appear to be used routinely for epimutation rate estimates. For example, although the rates of spontaneous appearance and disappearance of [ISP+], a prion-like element in yeast, have been measured using the fluctuation test (Volkov et al. 2002), to the best of our knowledge there are no published estimates of the spontaneous rate of [PSI+] appearance as measured using a fluctuation test. Although results from the fluctuation test can be confounded by reverse epimutation, or back-switching, this is an issue only if the rate of back-switching is very high, e.g., 10⁻¹–10⁻² per generation (Saunders et al. 2003). 
This is not the case for [PSI+], for which the reverse epimutation rate (loss of [PSI+]) is <2 × 10⁻⁴ (Tank et al. 2007).
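To make the fluctuation-test idea concrete, the following is a minimal sketch of maximum-likelihood estimation under a Luria–Delbrück model. It is not the authors' analysis: it uses the Ma–Sandri–Sarkar recursion for the distribution, a simple ternary search that assumes the log-likelihood has a single peak, and the common convention that the per-generation rate is approximately m divided by the final culture size. The colony counts and culture size are hypothetical.

```python
import math


def ld_pmf(m, n_max):
    """Luria-Delbruck probabilities P(0), ..., P(n_max) of observing n
    switched colonies, given m expected switching events per culture
    (Ma-Sandri-Sarkar recursion)."""
    p = [math.exp(-m)]
    for n in range(1, n_max + 1):
        s = sum(p[i] / (n - i + 1) for i in range(n))
        p.append(m / n * s)
    return p


def log_likelihood(m, counts):
    """Log-likelihood of the observed per-culture counts for a given m."""
    p = ld_pmf(m, max(counts))
    return sum(math.log(p[c]) for c in counts)


def mle_m(counts, lo=1e-6, hi=50.0, iters=100):
    """Ternary search for the m that maximizes the log-likelihood
    (assumes a single peak between lo and hi)."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if log_likelihood(a, counts) < log_likelihood(b, counts):
            lo = a
        else:
            hi = b
    return (lo + hi) / 2.0


# Hypothetical data: switched colonies in 16 independent cultures.
# The single large value plays the role of a "jackpot" culture.
counts = [0, 0, 1, 0, 2, 0, 0, 5, 0, 1, 0, 0, 37, 0, 1, 0]
n_final = 2e6  # hypothetical number of cells per culture at plating

m_hat = mle_m(counts)
rate = m_hat / n_final  # approximate switching rate per cell per generation

if __name__ == "__main__":
    print(f"m = {m_hat:.3f} events per culture, rate = {rate:.2e}")
```

Because the likelihood is computed over the whole distribution of counts, a single jackpot culture shifts the estimate far less than it would shift a simple proportion of switched colonies.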

Other [PSI+]-like phenotypes, including genetic mimics:

[PSI+] causes partial loss of Sup35 function, leading to elevated rates of translational readthrough at all stop codons (Figure 1b). There are many other spontaneous changes, presumably mutations, that also lead to elevated translational readthrough (Lund and Cox 1981). Mutations that affect readthrough at all stop codons (Figure 1c) (sometimes called “[PSI+]-like”) can be considered as genetic “mimics” because they produce the same phenotype as the Sup35 aggregate, but are generally not epigenetically inherited. A specific example of such a genetic mimic was characterized by Eaglestone et al. (1999), who identified the sal3-4 point mutation in the SUP35 gene. This leads to a defect in the Sup35 protein structure rendering the termination process less efficient (Eaglestone et al. 1999). The sal3-4 mutant can therefore be considered a partial loss-of-function genetic mimic of [PSI+], since it generates the same readthrough phenotype. Translation termination could also potentially be impaired through other point mutations or deletions, for example, in either the SUP35 or the SUP45 gene (Stansfield et al. 1995a) or in a tRNA that mutates to recognize stop codons at a higher rate. The presence of genetic mimics, whose effects are less reversible than those of [PSI+], can affect the evolution of the evolvability properties of the [PSI+] system such as its epimutation rate (Lancaster and Masel 2009). Note that genetic mimics are quite different from much rarer point mutations that convert stop codons into coding sequence (Figure 1d), resulting in readthrough at a single gene rather than multiple genes.Here we performed experiments to obtain accurate and precise estimates of the baseline appearance rates of both [PSI+] and [PSI+]-like phenotypes in permissive laboratory conditions, excluding stop codon point mutations that affect only a single gene. Our estimates are superior to previous estimates, since we use the fluctuation test. 
We consider the consequences of these estimates for the evolution of the [PSI+] system.

8.
Glycoside hydrolase family 1 (GH1) β-glucosidases play roles in many processes in plants, such as chemical defense, alkaloid metabolism, hydrolysis of cell wall-derived oligosaccharides, phytohormone regulation, and lignification. However, the functions of most of the 34 GH1 gene products in rice (Oryza sativa) are unknown. Os3BGlu6, a rice β-glucosidase representing a previously uncharacterized phylogenetic cluster of GH1, was produced in recombinant Escherichia coli. Os3BGlu6 hydrolyzed p-nitrophenyl (pNP)-β-d-fucoside (kcat/Km = 67 mM⁻¹ s⁻¹), pNP-β-d-glucoside (kcat/Km = 6.2 mM⁻¹ s⁻¹), and pNP-β-d-galactoside (kcat/Km = 1.6 mM⁻¹ s⁻¹) efficiently but had little activity toward other pNP glycosides. It also had high activity toward n-octyl-β-d-glucoside and β-(1→3)- and β-(1→2)-linked disaccharides and was able to hydrolyze apigenin β-glucoside and several other natural glycosides. Crystal structures of Os3BGlu6 and its complexes with a covalent intermediate, 2-deoxy-2-fluoroglucoside, and a nonhydrolyzable substrate analog, n-octyl-β-d-thioglucopyranoside, were solved at 1.83, 1.81, and 1.80 Å resolution, respectively. The position of the covalently trapped 2-F-glucosyl residue in the enzyme was similar to that in a 2-F-glucosyl intermediate complex of Os3BGlu7 (rice BGlu1). The side chain of methionine-251 in the mouth of the active site appeared to block the binding of extended β-(1→4)-linked oligosaccharides and interact with the hydrophobic aglycone of n-octyl-β-d-thioglucopyranoside. This correlates with the preference of Os3BGlu6 for short oligosaccharides and hydrophobic glycosides.
β-Glucosidases (EC 3.2.1.21) have a wide range of functions in plants, including acting in cell wall remodeling, lignification, chemical defense, plant-microbe interactions, phytohormone activation, activation of metabolic intermediates, and release of volatiles from their glycosides (Esen, 1993). 
They fulfill these roles by hydrolyzing the glycosidic bond at the nonreducing terminal glucosyl residue of a glycoside or an oligosaccharide, thereby releasing Glc and an aglycone or a shortened carbohydrate. The aglycone released from the glycoside may be a monolignol, a toxic compound, or a compound that further reacts to release a toxic component, an active phytohormone, a reactive metabolic intermediate, or a volatile scent compound (Brzobohatý et al., 1993; Dharmawardhama et al., 1995; Reuveni et al., 1999; Lee et al., 2006; Barleben et al., 2007; Morant et al., 2008). Indeed, the wide range of glucosides of undocumented functions found in plants suggests that many β-glucosidase functions may remain to be discovered.Plant β-glucosidases fall into related families that have been classified as glycosyl hydrolase (GH) families GH1, GH3, and GH5 (Henrissat, 1991; Coutinho and Henrissat, 1998, 1999). Of these, GH1 has been most thoroughly documented and shown to comprise a gene family encoding 40 putative functional GHs in Arabidopsis (Arabidopsis thaliana) and 34 in rice (Oryza sativa) in addition to a few pseudogenes (Xu et al., 2004; Opassiri et al., 2006). In addition to β-glucosidases, plant GH1 members include β-mannosidases (Mo and Bewley, 2002), β-thioglucosidases (Burmeister et al., 1997), and disaccharidases such as primeverosidase (Mizutani et al., 2002) as well as hydroxyisourate hydrolase, which hydrolyzes the internal bond in a purine ring rather than a glycosidic bond (Raychaudhuri and Tipton, 2002). The specificity for the glycone in GH1 enzymes varies. Some enzymes are quite specific for β-d-glucosides or β-d-mannosides, while many accept either β-d-glucosides or β-d-fucosides, and some also hydrolyze β-d-galactosides, β-d-xylosides, and α-l-arabinoside (Esen, 1993). 
However, most GH1 enzymes are thought to hydrolyze glucosides in the plant, and it is the aglycone specificity that determines the functions of most GH1 enzymes.Aglycone specificity of GH1 β-glucosidases ranges from rather broad to absolutely specific for one substrate and is not obvious from sequence similarity. For instance, maize (Zea mays) ZmGlu1 β-glucosidase hydrolyzes a range of glycosides, including its natural substrate, 2-O-β-d-glucopyranosyl-4-dihydroxy-1,4-benzoxazin-3-one (DIMBOAGlc), but not dhurrin, whereas sorghum (Sorghum bicolor) Dhr1, which is 72% identical to ZmGlu1, only hydrolyzes its natural cyanogenic substrate dhurrin (Verdoucq et al., 2003). Similarly, despite sharing over 80% amino acid sequence identity, the legume isoflavonoid β-glucosidases dalcochinase from Dalbergia cochinchinensis and Dnbglu2 from Dalbergia nigrescens hydrolyze each other''s natural substrate very poorly (Chuankhayan et al., 2007). Thus, small differences in the amino acid sequence surrounding the active site may be expected to account for significant differences in substrate specificity.GH1 is classified in GH clan A, which consists of GH families whose members have a (β/α)8-barrel structure with the catalytic acid/base on strand 4 of the β-barrel and the catalytic nucleophile on strand 7 (Henrissat et al., 1995; Jenkins et al., 1995). As such, all GH1 enzymes have similar overall structures, but it has been noted that four variable loops at the C-terminal end of the β-barrel strands, designated A, B, C, and D, account for much of the differences in the active site architecture (Sanz-Aparicio et al., 1998). The similar structures with great diversity in substrate specificity make plant GH1 enzymes an ideal model system to investigate the structural basis of substrate specificity. 
To date, seven plant β-glucosidase structures have been reported, including three closely related chloroplastic enzymes from maize (Czjzek et al., 2000, 2001), sorghum (Verdoucq et al., 2004), and wheat (Triticum aestivum; Sue et al., 2006), the cytoplasmic strictosidine β-glucosidase from Rauvolfia serpentina (Barleben et al., 2007), and the secreted enzymes white clover (Trifolium repens) cyanogenic β-glucosidase (Barrett et al., 1995), white mustard (Sinapis alba) myrosinase (thioglucosidase; Burmeister et al., 1997), and rice Os3BGlu7 (BGlu1; Chuenchor et al., 2008). These enzymes hydrolyze substrates with a range of structures, but they cannot account for the full range of β-glucosidase substrates available in plants, and determining the structural differences that bring about substrate specificity differences in even closely related GH1 enzymes has proven tricky (Verdoucq et al., 2003, 2004; Sue et al., 2006; Chuenchor et al., 2008).
Amino acid sequence-based phylogenetic analysis of GH1 enzymes encoded by the rice genome showed that there are eight clusters containing both rice and Arabidopsis proteins that are more closely related to each other than they are to enzymes from the same plants outside the clusters (Fig. 1; Opassiri et al., 2006). In addition, there are a cluster of sixteen putative β-glucosidases and a cluster of myrosinases in Arabidopsis without any closely related rice counterparts. Comparison with characterized GH1 enzymes from other plants reveals other clusters of related enzymes not found in rice or Arabidopsis, including the chloroplastic enzymes, from which the maize, sorghum, and wheat structures are derived, and the cytoplasmic metabolic enzymes, from which the strictosidine hydrolase structure is derived (Fig. 1). 
Therefore, although the known structures provide good tools for molecular modeling of plant enzymes, most rice and Arabidopsis GH1 enzymes lack a close correspondence in sequence and functional evolution to these structures, suggesting that the variable loops that determine the active site may be different. It would be useful, therefore, to know the structures and substrate specificities of representative members of each of the eight clusters seen in rice and Arabidopsis. To begin to acquire this information, we have expressed Os3BGlu6, a member of cluster At/Os 1 in Figure 1, characterized its substrate specificity, and determined its structure alone and in complex with a glycosyl intermediate and a nonhydrolyzable substrate analog.
Figure 1. Simplified phylogenetic tree of the amino acid sequences of eukaryotic GH1 proteins with known structures and those of rice and Arabidopsis GH1 gene products. The protein sequences of the eukaryotic proteins with known structures are marked with four-character PDB codes for one of their structures, including Trifolium repens cyanogenic β-glucosidase (1CBG; Barrett et al., 1995), Sinapis alba myrosinase (1MYR; Burmeister et al., 1997), Zea mays ZmGlu1 β-glucosidase (1E1F; Czjzek et al., 2000), Sorghum bicolor Dhr1 dhurrinase (1V02; Verdoucq et al., 2004), Triticum aestivum β-glucosidase (2DGA; Sue et al., 2006), Rauvolfia serpentina strictosidine β-glucosidase (2JF6; Barleben et al., 2007), and Oryza sativa Os3BGlu7 (BGlu1) β-glucosidase (2RGL; Chuenchor et al., 2008) from plants, along with Brevicoryne brassicae myrosinase (1WCG; Husebye et al., 2005), Homo sapiens cytoplasmic (Klotho) β-glucosidase (2E9M; Hayashi et al., 2007), and Phanerochaete chrysosporium (2E3Z; Nijikken et al., 2007), while those encoded in the Arabidopsis and rice genomes are labeled with the systematic names given by Xu et al. (2004) and Opassiri et al. (2006), respectively. 
One or two example proteins from each plant are given for each of the eight clusters of genes shared by Arabidopsis (At) and rice (Os) and the Arabidopsis-specific clusters At I (β-glucosidases) and At II (myrosinases), with the number of Arabidopsis or rice enzymes in each cluster given in parentheses. These sequences were aligned with all of the Arabidopsis and rice sequences in ClustalX (Thompson et al., 1997), the alignment was manually edited, all but representative sequences were removed, and the tree was calculated by the neighbor-joining method, bootstrapped with 1,000 trials, and then drawn with TreeView (Page, 1996). The grass plastid β-glucosidases, which are not represented in Arabidopsis and rice, are shown in the group marked “Plastid.” Percentage bootstrap reproducibility values are shown on internal branches where they are greater than 60%. Except those marked by asterisks, all external branches represent groups with 100% bootstrap reproducibility. To avoid excess complexity, those groups of sequences marked with asterisks are not monophyletic and represent more branches within the designated cluster than are shown. For a complete phylogenetic analysis of Arabidopsis and rice GH1 proteins, see Opassiri et al. (2006).

9.
α-l-Arabinofuranosidases I and II were purified from the culture filtrate of Aspergillus awamori IFO 4033 and had molecular weights of 81,000 and 62,000 and pIs of 3.3 and 3.6, respectively. Both enzymes had an optimum pH of 4.0 and an optimum temperature of 60°C and exhibited stability at pH values from 3 to 7 and at temperatures up to 60°C. The enzymes released arabinose from p-nitrophenyl-α-l-arabinofuranoside, O-α-l-arabinofuranosyl-(1→3)-O-β-d-xylopyranosyl-(1→4)-d-xylopyranose, and arabinose-containing polysaccharides but not from O-β-d-xylopyranosyl-(1→2)-O-α-l-arabinofuranosyl-(1→3)-O-β-d-xylopyranosyl-(1→4)-O-β-d-xylopyranosyl-(1→4)-d-xylopyranose. α-l-Arabinofuranosidase I also released arabinose from O-β-d-xylopyranosyl-(1→4)-[O-α-l-arabinofuranosyl-(1→3)]-O-β-d-xylopyranosyl-(1→4)-d-xylopyranose. However, α-l-arabinofuranosidase II did not readily catalyze this hydrolysis reaction. α-l-Arabinofuranosidase I hydrolyzed all linkages that can occur between two α-l-arabinofuranosyl residues in the following order: (1→5) linkage > (1→3) linkage > (1→2) linkage. α-l-Arabinofuranosidase II hydrolyzed the linkages in the following order: (1→5) linkage > (1→2) linkage > (1→3) linkage. α-l-Arabinofuranosidase I preferentially hydrolyzed the (1→5) linkage of branched arabinotrisaccharide. On the other hand, α-l-arabinofuranosidase II preferentially hydrolyzed the (1→3) linkage in the same substrate. α-l-Arabinofuranosidase I released arabinose from the nonreducing terminus of arabinan, whereas α-l-arabinofuranosidase II preferentially hydrolyzed the arabinosyl side chain linkage of arabinan.
Recently, it has been proven that l-arabinose selectively inhibits intestinal sucrase in a noncompetitive manner and reduces the glycemic response after sucrose ingestion in animals (33). Based on this observation, l-arabinose can be used as a physiologically functional sugar that inhibits sucrose digestion. 
Effective l-arabinose production is therefore important in the food industry. l-Arabinosyl residues are widely distributed in hemicelluloses, such as arabinan, arabinoxylan, gum arabic, and arabinogalactan, and the α-l-arabinofuranosidases (α-l-AFases) (EC 3.2.1.55) have proven to be essential tools for enzymatic degradation of hemicelluloses and structural studies of these compounds.
α-l-AFases have been classified into two families of glycanases (families 51 and 54) on the basis of amino acid sequence similarities (11). The two families of α-l-AFases also differ in substrate specificity for arabinose-containing polysaccharides. Beldman et al. summarized the α-l-AFase classification based on substrate specificities (3). One group contains the Arafur A (family 51) enzymes, which exhibit very little or no activity with arabinose-containing polysaccharides. The other group contains the Arafur B (family 54) enzymes, which cleave arabinosyl side chains from polymers. However, this classification is too broad to define the substrate specificities of α-l-AFases. There have been many studies of the α-l-AFases (3, 12), especially the α-l-AFases of Aspergillus species (2–8, 12–15, 17, 22, 23, 28–32, 36–39, 41–43, 46). However, there have been only a few studies of the precise specificities of these α-l-AFases. 
In previous work, we elucidated the substrate specificities of α-l-AFases from Aspergillus niger 5-16 (17) and Bacillus subtilis 3-6 (16, 18), which should be classified in the Arafur A group and exhibit activity with arabinoxylooligosaccharides, synthetic methyl 2-O-, 3-O-, and 5-O-arabinofuranosyl-α-l-arabinofuranosides (arabinofuranobiosides) (20), and methyl 3,5-di-O-α-l-arabinofuranosyl-α-l-arabinofuranoside (arabinofuranotrioside) (19).
In the present work, we purified two α-l-AFases from a culture filtrate of Aspergillus awamori IFO 4033 and determined the substrate specificities of these α-l-AFases by using arabinose-containing polysaccharides and the core oligosaccharides of arabinoxylan and arabinan.

10.
Transmembrane proteins are synthesized and folded in the endoplasmic reticulum (ER), an interconnected network of flattened sacs or tubes. Up to now, this organelle has eluded a detailed analysis of the dynamics of its constituents, mainly due to the complex three-dimensional morphology within the cellular cytosol, which precluded high-resolution, single-molecule microscopy approaches. Recent evidence, however, indicates that there are multiple interaction sites between the ER and the plasma membrane, rendering total internal reflection microscopy of plasma membrane-proximal ER regions feasible. Here we used single-molecule fluorescence microscopy to study the diffusion of the human serotonin transporter at the ER and the plasma membrane. We exploited the single-molecule trajectories to map out the structure of the ER close to the plasma membrane at subdiffractive resolution. Furthermore, our study provides a comparative picture of the diffusional behavior in both environments. Under unperturbed conditions, the majority of proteins showed similar mobility in the two compartments; at the ER, however, we found an additional 15% fraction of molecules moving with 25-fold faster mobility. Upon degradation of the actin skeleton, the diffusional behavior in the plasma membrane was strongly influenced, whereas it remained unchanged in the ER.
Live-cell microscopy and three-dimensional electron tomography have boosted our understanding of endoplasmic reticulum (ER) dynamics and morphology. Proteins have been identified which regulate the formation of cisternae versus tubelike membranes, and the contacts between ER and the various cellular organelles have been studied in detail (1). Little information, however, is available when it comes to protein dynamics and organization within the ER membrane. 
Its complex three-dimensional topology hampers standard diffraction-limited fluorescence microscopy approaches: in fluorescence recovery after photobleaching, for example, the obtained diffusion coefficients can be off by severalfold if the ER morphology is not correctly taken into account (2). A method is therefore needed which allows for resolving molecular movements on length scales below the typical dimensions of the ER structures.
In principle, single-molecule tracking would provide the required spatial resolution due to the high precision in localizing the moving point emitters: localization errors of <40 nm can be easily achieved (3). This technique has given rise to multiple studies, in which the paths of the diffusing objects were used to draw conclusions about the properties of the environment; particularly, the plasma membrane has become a favorite target for such investigations, yielding precise determinations of the diffusion coefficients of a variety of membrane proteins or lipids (4).
Here, we report what is, to our knowledge, the first application of single-molecule tracking for a comparative study of the diffusion dynamics of a membrane protein at the ER versus the plasma membrane. As the protein of interest, we chose the human serotonin transporter (SERT): it is a polytopic membrane protein containing 12 transmembrane domains, with both C- and N-termini residing in the cytoplasm. Stable SERT oligomers of various degrees were observed to coexist in the plasma membrane (5). Functionally, SERT (6) is a pivotal element in shaping serotonergic neurotransmission: SERT-mediated high-affinity uptake of released serotonin clears the synaptic cleft and supports refilling of vesicular stores (7). Wild-type SERT (SERT-wt) is efficiently targeted to the presynaptic plasma membrane, whereas the truncation of its C-terminus (SERT-ΔC30) retains the mutant protein in the ER (8). 
The N-terminal mGFP- and eYFP-fusion constructs of the two versions of SERT thus allowed us to specifically address SERT located at the ER (eYFP-SERT-ΔC30) or at the plasma membrane (mGFP-SERT-wt (7)).
Our experiments were performed at 37°C on proteins heterologously expressed in CHO cells. Total internal reflection (TIR) illumination afforded a reduction in background fluorescence and allowed for selective imaging of single mGFP-SERT-wt molecules at the cells’ plasma membrane or single eYFP-SERT-ΔC30 molecules at plasma membrane-proximal ER (Fig. 1 and see the Supporting Material). TIR was particularly crucial for single-molecule imaging of the ER-retained mutant, where out-of-focus background would surpass the weak single-molecule signals in epi-illumination.
Figure 1. Schematics of the plasma membrane (PM) and a part of the ER containing mGFP-SERT-wt or the ER-retained eYFP-SERT-ΔC30 mutant, respectively. Both can be excited by total internal reflection fluorescence (TIRF) excitation. Experiments were carried out on cells expressing either mGFP-SERT-wt or eYFP-SERT-ΔC30.
For both constructs, the majority of molecules were mobile: in fluorescence-recovery-after-photobleaching experiments we observed a mobile fraction of 82 ± 8% for mGFP-SERT-wt and 91 ± 4% for eYFP-SERT-ΔC30. For single-molecule tracking, the high surface density of signals was reduced by completely photobleaching a rectangular part of the cell in epi-illumination; after a brief recovery period, a few single-molecule signals had entered the bleached area and could be monitored and tracked at high signal/noise using TIR excitation. Samples were illuminated stroboscopically for t_ill = 2 ms, and movies of 500 frames were recorded with a delay of t_del = 6 ms; the short delay times ensured that even rapidly diffusing molecules hardly reached the borders of the ER tubes between two consecutive frames.
This illumination protocol was run 20 times per cell, yielding ∼2500 trajectories per cell.
The single-molecule localizations were first used to map those areas that are accessible to the diffusing proteins. eYFP-SERT-ΔC30 showed distinct hotspots, representing plasma membrane-proximal ER, excitable by the evanescent field (Fig. 2 A). These hotspots hardly moved within the timescale of an experiment (tens of minutes, see Fig. S1 in the Supporting Material); indeed, remarkable ER stability was previously observed using superresolution microscopy (9). In contrast, a rather homogeneous distribution was observed for mGFP-SERT-wt in the plasma membrane (Fig. 2 B).
Figure 2. Superresolution and tracking data at the ER and the plasma membrane. Superresolution images are shown for the ER-retained SERT mutant eYFP-SERT-ΔC30 (A) and for mGFP-SERT-wt in the plasma membrane (B). (C and D) Diffusion coefficients of eYFP-SERT-ΔC30 (C) and mGFP-SERT-wt (D) are shown as normalized histograms before (blue) and after (red) Cytochalasin D treatment. Data were fitted by Gaussian mobility distributions (see Table S1 in the Supporting Material for the fit results).
Next, we compared the mobility of the observed proteins. Single-molecule localizations were linked to trajectories as described in Gao and Kilfoil (10), and the apparent diffusion coefficient, D, of each molecule was estimated from the first two points of the mean-square displacement. For mGFP-SERT-wt at the plasma membrane, the distribution of log10 D showed a pronounced single peak (Fig. 2 D). It could be well fitted by a linear combination of two Gaussian functions, with the major fraction (85%) characterized by D_wt = 0.30 μm²/s; a broad shoulder to the left indicates the presence of proteins that are immobilized during the observation period. In contrast, the mobility of the ER-retained mutant showed a substantially different distribution, containing two clearly visible peaks (Fig. 2 C).
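The estimate of D from the first two points of the mean-square displacement (MSD) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes two-dimensional diffusion (MSD = 4Dt plus a constant localization offset), takes D from the slope between the MSD at lags of one and two frames, and uses hypothetical function names. The frame interval is an input; treating it as t_ill + t_del = 8 ms would be an assumption, since the text does not state it explicitly.

```python
def mean_square_displacement(track, lag):
    """Time-averaged MSD of a 2D track (list of (x, y) in micrometres) at an integer frame lag."""
    n = len(track) - lag
    if n < 1:
        raise ValueError("track too short for this lag")
    total = 0.0
    for i in range(n):
        dx = track[i + lag][0] - track[i][0]
        dy = track[i + lag][1] - track[i][1]
        total += dx * dx + dy * dy
    return total / n


def diffusion_from_first_two_points(track, frame_interval_s):
    """Apparent D (um^2/s) from the slope between MSD(1 frame) and MSD(2 frames).

    Assumes 2D diffusion, MSD(t) = 4*D*t + offset; the constant offset
    (localization error) cancels when the slope is taken.
    """
    msd1 = mean_square_displacement(track, 1)
    msd2 = mean_square_displacement(track, 2)
    return (msd2 - msd1) / (4.0 * frame_interval_s)
```

Taking the slope rather than a single MSD value is a design choice that removes the constant contribution of the localization error; very short trajectories still give noisy per-molecule estimates, which is one reason such values are usually inspected as a distribution of log10 D.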
We fitted the data with a three-component Gaussian model: the main fraction (82%) behaved similarly to SERT at the plasma membrane, with D_ΔC30 = 0.32 μm²/s. In addition, a large fraction (15%) with high mobility of D_ΔC30 = 7.8 μm²/s and a minor fraction (3%) with low mobility were observed. The proteins responded as expected to degradation of the actin membrane skeleton (red bars in Fig. 2, C and D): at the plasma membrane, the mobility of mGFP-SERT-wt increased 4.6-fold (mean values), whereas at the ER membrane there was only a minor change for eYFP-SERT-ΔC30 mobility (1.06-fold increase; note that the ER is not connected to actin filaments (11)).
The observation of a high mobility subfraction at the ER membrane is surprising. In general, the presence of obstacles (irrespective of whether randomly distributed or clustered, mobile or immobile) reduces the diffusivity of mobile tracers in a membrane (12). It is generally assumed that the high protein density in cell membranes is responsible for the rather low fluidity when compared to synthetic membranes (compare, e.g., Saxton and Jacobson (13) with Weiss et al. (14)). Interestingly, the observed diffusion constant of 7.8 μm²/s is of similar order as the mobility determined for various proteins in synthetic lipid membranes (14). It is thus tempting to hypothesize the presence of extended protein-depleted regions of higher fluidity within the ER membrane; such membrane domains were indeed observed already at the plasma membrane (15). We were also concerned, however, that protein degradation fragments could have contributed to our data: the three-dimensional mobility of an 85-kDa protein is ∼10 μm²/s (16), similar to the high mobility diffusion constant of eYFP-SERT-ΔC30.
We tested the two explanations by analyzing the spatial distribution of fast (D_ΔC30 > 1 μm²/s) versus slow trajectories (D_ΔC30 < 1 μm²/s) of eYFP-SERT-ΔC30 (Fig. 3).
Both types of trajectories clustered in the same regions, and no segregation into ER subdomains was observable at the resolved length scales. This finding, on the one hand, disfavors freely diffusing protein fragments as the origin of the high mobility fraction. On the other hand, it calls for further experiments to identify the origin of the fast and the slow mobility subfraction. Interestingly, when analyzing all eYFP-SERT-ΔC30 trajectories we found that 80% of the molecules showed diffusion confined to domains of 230-nm radius (see Fig. S2). This size is clearly smaller than the lateral extensions of the visible ER regions observed in Fig. 3. The finding indicates domain formation at the ER membrane; domains are averaged out in Fig. 3 due to the long recording times. Note that free diffusion was observed for mGFP-SERT-wt at the plasma membrane (5).
Figure 3. Ripley’s K function analysis of the different mobility fractions in the ER. For the cell presented in Fig. 2, the first position of every slow (D < 1 μm²/s; red) and fast (D > 1 μm²/s; blue) trajectory was plotted in panel A. Contour lines indicate regions of ER attachment to the plasma membrane. In panel B, the point-correlation function L(r)−r is plotted for the slow (red) and fast (blue) fraction. Furthermore, the correlation between fast versus slow is plotted (green). All three curves show a peak at ∼450 nm, which agrees with the extensions of the ER attachment zones.
In conclusion, we have shown that single-molecule tracking is feasible for constituents of the ER membrane. We found a surprising diffusion behavior of SERT resulting in the following:
  1. A slow fraction showing mobility reminiscent of protein diffusion in the plasma membrane, likely reflecting SERT diffusing in protein-crowded regions of the ER membrane; and
  2. A fast fraction showing 25-fold faster diffusion kinetics.
This likely represents diffusion in altered ER membrane environments, possibly of different lipid or protein composition. Given the fact that synthesis of virtually all membrane proteins and most lipids proceeds at the ER membrane, ER heterogeneity at the nanoscale due to the continuous synthesis activity and selection for correct folding appears highly plausible.
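The point-correlation function L(r)−r referred to in the Figure 3 caption is derived from Ripley's K function. The sketch below is a simplified illustration under stated assumptions: it applies no edge correction, takes the observation area as a user-supplied number, and uses a hypothetical function name; the text does not say how the authors handled edges or the cross-correlation between fast and slow positions.

```python
import math


def ripley_l_minus_r(points, radius, area):
    """Ripley's L(r) - r for 2D points, without edge correction.

    K(r) = area / (n * (n - 1)) * number of ordered pairs closer than r,
    L(r) = sqrt(K(r) / pi). For a completely random pattern L(r) - r is
    close to 0; positive values indicate clustering at that length scale.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    pairs = 0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= radius:
                pairs += 1
    k = area * pairs / (n * (n - 1))
    return math.sqrt(k / math.pi) - radius
```

Evaluating this function over a range of radii for the first positions of the slow and of the fast trajectories would give curves of the kind described for panel B; a peak marks the length scale at which the points cluster most strongly.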

11.
12.
A novel lachrymatory factor synthase (LFS) was isolated and purified from the roots of the Amazonian medicinal plant Petiveria alliacea. The enzyme is a heterotetrameric glycoprotein comprised of two α-subunits (68.8 kD each), one γ-subunit (22.5 kD), and one δ-subunit (11.9 kD). The two α-subunits are glycosylated and connected by a disulfide bridge. The LFS has an isoelectric point of 5.2. It catalyzes the formation of a sulfine lachrymator, (Z)-phenylmethanethial S-oxide, only in the presence of P. alliacea alliinase and its natural substrate, S-benzyl-l-cysteine sulfoxide (petiveriin). Depending on its concentration relative to that of P. alliacea alliinase, the LFS sequesters, to varying degrees, the sulfenic acid intermediate formed by alliinase-mediated breakdown of petiveriin. At LFS:alliinase of 5:1, LFS sequesters all of the sulfenic acid formed by alliinase action on petiveriin, and converts it entirely to (Z)-phenylmethanethial S-oxide. However, starting at LFS:alliinase of 5:2, the LFS is unable to sequester all of the sulfenic acid produced by the alliinase, with the result that sulfenic acid that escapes the action of the LFS condenses with loss of water to form S-benzyl phenylmethanethiosulfinate (petivericin). The results show that the LFS and alliinase function in tandem, with the alliinase furnishing the sulfenic acid substrate on which the LFS acts. The results also show that the LFS modulates the formation of biologically active thiosulfinates that are downstream of the alliinase in a manner dependent upon the relative concentrations of the LFS and the alliinase. 
These observations suggest that manipulation of LFS-to-alliinase ratios in plants displaying this system may provide a means by which to rationally modify organosulfur small molecule profiles to obtain desired flavor and/or odor signatures, or increase the presence of desirable biologically active small molecules.Lachrymatory factor synthase (LFS) is the term coined to refer to the recently discovered enzyme shown to catalyze the formation of the sulfine responsible for the lachrymatory effect of onion (Allium cepa), (Z)-propanethial S-oxide (PTSO; Imai et al., 2002). Until the discovery of the onion LFS, the formation of the onion lachrymatory factor (LF) was thought to be mediated by only a single enzyme, onion alliinase. Alliinases, which are pyridoxal 5′-P (PLP)-dependent Cys sulfoxide lyases most often found in members of the Allium genus, catalyze the breakdown of Cys sulfoxide derivatives to yield fleeting sulfenic acid intermediates and α-aminoacrylic acid (Scheme 1; Block, 1992; Shimon et al., 2007). Once formed, the sulfenic acids are most often observed to spontaneously condense with loss of water to form thiosulfinates, whereas the α-aminoacrylic acid is further hydrolyzed with loss of ammonia to form pyruvate. The S-substituted Cys sulfoxides that are acted upon by alliinases differ from one another by the identity of the sulfur-bound R group. In Allium plants, the R groups are alk(en)yl, with R = methyl and 2-propenyl appearing in large quantities in garlic (Allium sativum) and R = methyl and (E)-1-propenyl preponderating in onion (Scheme 1). The Cys sulfoxide that serves as the precursor of the onion lachrymator is (E)-S-(1-propenyl)-l-Cys sulfoxide (isoalliin). It is structurally distinct from other naturally occurring S-substituted Cys sulfoxides so far reported in that it is α,β-unsaturated. 
This structural feature affords its corresponding 1-propenylsulfenic acid (PSA) the possibility of undergoing a [1,4]-sigmatropic rearrangement that, in principle, would furnish the onion lachrymator, PTSO. Indeed, the formation of the onion lachrymator was proposed to occur by such a mechanism (Scheme 2; Block, 1992). Thus, it was surmised that were the α,β-unsaturation to be absent in the precursor S-substituted Cys sulfoxide, the [1,4]-sigmatropic rearrangement that would lead to sulfine formation could not occur. Consequently, it was not surprising that other S-substituted Cys sulfoxides constitutively present in garlic, onion, and other alliinase-containing plants, but devoid of this α,β-unsaturation in the sulfur-bound R group, did not themselves yield lachrymators on plant tissue wounding. It has since been discovered, however, that formation of the onion lachrymator is not catalyzed by onion alliinase, but instead by a novel class of enzyme—LFS. Imai et al. (2002) observed that although a crude preparation of onion alliinase yielded both the LF and the corresponding thiosulfinate, the protein fraction with lachrymator-forming ability could be completely separated from that with alliinase activity by passing the crude onion protein preparation through a hydroxyapatite column. The LFS was subsequently purified and shown to be highly substrate specific, producing the LF from only (E)-S-(1-propenyl)-l-Cys sulfoxide (isoalliin), which occurs constitutively in onion. Interestingly, the LF was detected only when three components, namely, the purified onion alliinase, isoalliin, and the onion LFS, were present in the reaction mixture simultaneously (Imai et al., 2002). Omission of the LFS from the reaction mixture resulted in an increased yield of thiosulfinates, but no LF. 
Although the complete cDNA sequence of the onion LFS has been determined (Imai et al., 2002), to our knowledge, full biochemical characterization of the enzyme has yet to be reported.
Scheme 1. Alliinase-mediated formation of thiosulfinates from Cys sulfoxide precursors (Block, 1992; Shimon et al., 2007). Alliin is S-allyl-l-Cys sulfoxide, isoalliin is (E)-S-(1-propenyl)-l-Cys sulfoxide, methiin is S-methyl-l-Cys sulfoxide, and propiin is S-propyl-l-Cys sulfoxide.
Scheme 2. Mechanism advanced by Block (1992) to account for formation of the onion lachrymator, PTSO. Alliinase-bound PLP forms a Schiff base with bound isoalliin. General base catalysis at the active site yields an α,β-unsaturated sulfenic acid that can undergo a [1,4]-sigmatropic rearrangement to furnish the sulfine.
In the course of our studies on the organosulfur chemistry of non-Allium plants, we isolated and characterized the S-benzyl-l-Cys sulfoxides (petiveriins) and S-(2-hydroxyethyl)-l-Cys sulfoxides (2-hydroxyethiins) from the Amazonian medicinal plant Petiveria alliacea (Fig. 1; Kubec and Musah, 2001; Kubec et al., 2002). These compounds are S-substituted Cys sulfoxide derivatives with R = benzyl and 2-hydroxyethyl, respectively, that, to our knowledge, had never before been isolated from plants. We showed that, as has been observed in garlic and onion, symmetrical and mixed thiosulfinate derivatives of the corresponding petiveriin and 2-hydroxyethiin precursors could be extracted with ether solvent (Fig. 1; Kubec et al., 2002) upon root tissue disruption. We have also shown that an alliinase that mediates the transformation of the petiveriins and 2-hydroxyethiins to their corresponding thiosulfinates is present in P. alliacea (Musah et al., 2009). Interestingly, while working with P. alliacea root extracts, we noted the presence of a potent lachrymator that we subsequently determined to be a sulfine, (Z)-phenylmethanethial S-oxide (PMTSO; Fig.
2; Kubec et al., 2003). However, the biochemical precursor of PMTSO and the pathway(s) leading to its formation upon disruption of P. alliacea tissue remain to be determined. Given that the onion LF (PTSO), whose formation is mediated by an LFS, is also a sulfine, we were prompted to investigate the possibility of the presence of an LFS in P. alliacea. In this report, we describe our confirmation of the existence of an LFS in P. alliacea, and detail biochemical characterization of this novel class of enzymes.
Figure 1. Cys sulfoxides and their corresponding thiosulfinate derivatives isolated from the Amazonian medicinal plant P. alliacea. The breakdown of the Cys sulfoxides is mediated by P. alliacea alliinase.
Figure 2. Lachrymatory sulfine isolated from P. alliacea.

13.
The BH3-only protein Bim is a potent direct activator of the proapoptotic effector protein Bax, but the structural basis for its activity has remained poorly defined. Here we describe the crystal structure of the BimBH3 peptide bound to BaxΔC26 and structure-based mutagenesis studies. Similar to BidBH3, the BimBH3 peptide binds into the cognate surface groove of Bax using the conserved hydrophobic BH3 residues h1–h4. However, the structure and mutagenesis data show that Bim is less reliant than Bid on its ‘h0’ residues for activating Bax and that a single amino-acid difference between Bim and Bid encodes a fivefold difference in Bax-binding potency. Similar to the structures of BidBH3 and BaxBH3 bound to BaxΔC21, the structure of the BimBH3 complex with BaxΔC displays a cavity surrounded by Bax α1, α2, α5 and α8. Our results are consistent with a model in which binding of an activator BH3 domain to the Bax groove initiates separation of its core (α2–α5) and latch (α6–α8) domains, enabling its subsequent dimerisation and the permeabilisation of the mitochondrial outer membrane.
The intrinsic pathway to apoptosis is regulated by interactions between members of three factions of the Bcl-2 protein family: the BH3-only proteins such as Bim and Bid, which initiate the process, the essential effectors Bax and Bak, and the prosurvival members, which oppose the action of both other factions.1 The interactions between prosurvival Bcl-2 family members and BH3 peptides have been well characterised since the earliest studies with Bcl-xL and a BakBH3 peptide.2 Such complexes are readily formed in solution by incubating the C-terminally (ΔC) truncated prosurvival Bcl-2 protein with a BH3 peptide. The absence of the C-terminal segment that can anchor the Bcl-2 protein in a membrane apparently has little effect on the ensuing complex.
That complex is believed to be responsible for the antiapoptotic function of Bcl-2, by sequestration of the BH3 motif either of the so-called BH3-only proteins such as Bim (‘mode 1’) or of Bax or Bak (‘mode 2’).3
Although proapoptotic Bax and Bak have very similar three-dimensional structures to their prosurvival relatives,4, 5, 6 until recently7, 8 no structure of a complex of either Bax or Bak with a BH3 peptide had been captured, despite an accumulation of evidence that Bax and Bak could be activated directly by interaction with the BH3-only proteins Bid, Bim and possibly others.9, 10, 11, 12, 13
Unlike Bak, which is constitutively anchored in the mitochondrial outer membrane (MOM) via its C-terminal segment, Bax is largely cytosolic in healthy cells and accumulates at the MOM only upon a death signal.14, 15 There it is believed to display at least two different conformers,16, 17 one loosely associated with the MOM and another in which its membrane anchor (helix α9) is inserted into the MOM. In striking contrast to the antiapoptotic relatives of Bcl-2, a construct of Bax lacking its C-terminal membrane anchor, BaxΔC21, has no measurable interaction with BH3 peptides. However, in the presence of the detergent octylglucoside binding is detected by surface plasmon resonance (SPR) for the BH3 peptides of Bim, Bid, Bak and Bax itself with IC50s in the range of 0.1–1 μM,7, 18 some 100-fold weaker compared with those measured similarly with (for example) Bcl-xLΔC, where no detergent is required.
Weaker interactions between BidBH3 or BimBH3 and BaxΔC as compared with Bcl-xLΔC are not inconsistent with various models for the function of the Bcl-2 protein family whereby the prosurvival molecules sequester BH3 motifs with high affinity and long half-lives, but proapoptotic Bax and Bak are activated by transient (‘hit-and-run’) interactions with BH3 motifs.19, 20, 21
Complexes of BaxΔC21 bound to BH3 peptides from Bid and Bax have been prepared by coincubation of the protein with CHAPS and an excess of the peptides.7 Under these conditions, the protein undergoes a conformational change and dimerises via domain swapping of helical segments α2–α5 and α6–α8, dubbed ‘core’ and ‘latch’ domains, respectively. Although this ‘core/latch dimer’ is thought to be an in vitro artefact, its formation is diagnostic for the core and latch separation, which is required for membrane-associated Bax to dimerise via its core domains and then to permeabilise the MOM.7 If the latch domain is absent, as in a recombinant construct of GFP fused to Bax α2–α5, the core domain forms BH3:groove symmetric dimers,7 which, consistent with a wide body of evidence,21, 22, 23, 24, 25 are present in apoptotic pores.
Previous work7 highlighted the importance of two hydrophobic ‘h0’ residues (Figure 1) in the peptide (I82/I83 in BidBH3) in governing Bid's ability to activate Bax. Similar to Bid, Bim is also a potent direct activator of Bax, and the ‘h0’ amino acids in Bim are proline and glutamic acid. In the absence of a structure of BimBH3:BaxΔC, it remained unclear how these ‘h0’ residues were accommodated. Here we describe the crystal structures of BimBH3 26- and 20-mer peptides bound to BaxΔC26. Comparison with the structure of BidBH3:BaxΔC21 allows a dissection of the critical contacts between these two peptides and BaxΔC. The binding profiles of mutant BH3 peptides illustrate that BimBH3 binding to Bax is less dependent on the ‘h0’ residues than is the case for BidBH3.
The BimBH3 complex displays a similar cavity adjacent to Bax α1, α2, α5 and α8 as seen in the BidBH3 complex. We also describe a structure of BidBH3 bound to a BaxΔC21 mutant, I66A, which is more typical of the BH3 signature of antiapoptotic Bcl-2 family proteins.7, 26
Figure 1. BimBH3 binds BaxΔC. (a) BH3 peptide sequences used in this study, indicating the 5 hydrophobic amino-acid positions ‘h0’–‘h4’. (b) The core/latch dimer of BaxΔC26 bound to BimBH3. The two Bax polypeptides, shown here as cartoons, are coloured yellow and grey, and the two Bim peptides cyan and orange. A crystallographic dyad symmetry axis passes through the centre of this particle. (c) Structure of BimBH3:BaxΔC26 complex. The globular unit depicted comprises Bax residues 1–128 from one polypeptide and 129–166 from the other, together with the associated Bim peptide. Bax is represented by its surface and colour coded according to surface charge (blue, positive potential (4kT/e); red, negative potential (−2kT/e)), calculated using the Adaptive Poisson–Boltzmann Solver.41 The trace of the Bim peptide (cyan) is shown with ‘h0’ (P144, E145), ‘h1’ (I148), ‘h2’ (L152), ‘h3’ (I155) and ‘h4’ (F159) represented as sticks. (d) Overlay of BimBH3:BaxΔC26 with BidBH3:BaxΔC21 (PDB:4BD2). Structures represented as cartoon ribbons, yellow for Bax in the Bim complex and magenta for Bax in the Bid complex. The peptides (Bim cyan and Bid blue) stand vertically in the foreground in this view (similar to Figure 1c), with their N termini at the bottom of the figure.

14.
Two nonoverlapping autosomal inversions defined unusual neo-sex chromosomes in the Hessian fly (Mayetiola destructor). Like other neo-sex chromosomes, these were normally heterozygous, present only in one sex, and suppressed recombination around a sex-determining master switch. Their unusual properties originated from the anomalous Hessian fly sex determination system in which postzygotic chromosome elimination is used to establish the sex-determining karyotypes. This system permitted the evolution of a master switch (Chromosome maintenance, Cm) that acts maternally. All of the offspring of females that carry Cm-associated neo-sex chromosomes attain a female-determining somatic karyotype and develop as females. Thus, the chromosomes act as maternal effect neo-W's, or W-prime (W′) chromosomes, where ZW′ females mate with ZZ males to engender female-producing (ZW′) and male-producing (ZZ) females in equal numbers. Genetic mapping and physical mapping identified the inversions. Their distribution was determined in nine populations. Experimental matings established the association of the inversions with Cm and measured their recombination suppression. The inversions are the functional equivalent of the sciarid X-prime chromosomes. We speculate that W′ chromosomes exist in a variety of species that produce unisexual broods.
Sex chromosomes are usually classified as X, Y, Z, or W on the basis of their pattern of segregation and the gender of the heterogametic sex (Ohno 1967). However, when chromosome-based sex determination occurs postzygotically, the same nomenclature confounds important distinctions and may hide interesting evolutionary phenomena. The Hessian fly (Mayetiola destructor), a gall midge (Diptera: Cecidomyiidae) and an important insect pest of wheat, presents an excellent example (Stuart and Hatchett 1988, 1991).
In this insect, all of the female gametes and all of the male gametes have the same number of X chromosomes (Figure 1A); no heterogametic sex exists. Nevertheless, Hessian fly sex determination is chromosome based; postzygotic chromosome elimination produces different X chromosome to autosome ratios in somatic cells (male A1A2X1X2/A1A2OO and female A1A2X1X2/A1A2X1X2, where A1 and A2 are the autosomes, X1 and X2 are the X chromosomes, and the paternally derived chromosomes follow the slash) (Stuart and Hatchett 1991; Marin and Baker 1998). Thus, Hessian fly “X” chromosomes are defined by their haploid condition in males, rather than by their segregation in the gametes.
Figure 1. Chromosome behavior and sex determination in the Hessian fly. (A) Syngamy (1) establishes the germ-line chromosome constitution: ∼32 maternally derived E chromosomes (represented as a single white chromosome) and both maternally derived (black) and paternally derived (gray) autosomes and X chromosomes. During embryogenesis, while the E chromosomes are eliminated, the paternally derived X chromosomes are either retained (2) or excluded (3) from the presumptive somatic cells. When the paternally derived X chromosomes are retained (2), a female-determining karyotype is established. When they are eliminated (3), a male-determining karyotype is established. Thelygenic mothers carry Cm (white arrow), which conditions all of their offspring to retain the X chromosomes. Recombination occurs during oogenesis (4). All ova contain a full complement of E chromosomes and a haploid complement of autosomes and X chromosomes. Chromosome elimination occurs during spermatogenesis (5). Sperm contain only the maternally derived autosomes and X chromosomes. (B) The segregation of Cm (white dot) on a Hessian fly autosome among monogenic families. Thelygenic females produce broods composed of equal numbers of thelygenic (Cm/−) and arrhenogenic (−/−) females (box 1).
Arrhenogenic females produce males (box 2). (C) Matings between monogenic and amphigenic families. Cm (white dot) is dominant to the amphigenic-derived chromosomes (gray dot) and generates all-female offspring (box 3). Amphigenic-derived chromosomes are dominant to the arrhenogenic-derived chromosomes (no dot) and generate offspring of both sexes (box 4).
An autosomal, dominant, genetic factor called Chromosome maintenance (Cm) complicates Hessian fly sex determination further (Stuart and Hatchett 1991). Cm has a maternal effect that acts upstream of X chromosome elimination during embryogenesis (Figure 1A). It prevents X chromosome elimination so that all of the offspring of Cm-bearing mothers obtain a female-determining karyotype. Cm-bearing females produce only female offspring and are therefore thelygenic. The absence of Cm usually has the opposite effect; all of the offspring of most Cm-lacking females obtain a male-determining karyotype. These Cm-lacking females produce only male offspring and are therefore arrhenogenic. Like a sex-determining master switch, Cm is usually heterozygous and present in only one sex (Figure 1B). Thus, thelygenic females (Cm/−) are “heterogametic,” as their Cm-containing gametes and Cm-lacking gametes produce thelygenic (Cm/−) and arrhenogenic (−/−) females in a 1:1 ratio. Collectively, thelygenic and arrhenogenic females are called monogenic because they produce unisexual families. However, some Hessian fly females produce broods of both sexes and are called amphigenic. No mating barrier between monogenic and amphigenic families exists (Figure 1C), but amphigenic females have always been found in lower abundance (Painter 1930; Gallun et al. 1961; Stuart and Hatchett 1991).
In experimental matings, the inheritance of maternal phenotype was consistent with the segregation of three Cm alleles (Figure 1C): a dominant thelygenic allele, a hypomorphic amphigenic allele, and a null arrhenogenic allele (Stuart and Hatchett 1991).
Here we report the genetic and physical mapping of Cm on Hessian fly autosome 1 (A1). Two nonoverlapping inversions were identified that segregated perfectly with Cm. The most distal inversion was present in all thelygenic females examined. The more proximal inversion extended recombination suppression. These observations suggested that successive inversions evolved to suppress recombination around Cm after it arose. The inversions therefore appear to have evolved in response to the forces that shaped vertebrate Y and W chromosomes (Charlesworth 1996; Graves and Shetty 2001; Rice and Chippindale 2001; Carvalho and Clark 2005). We therefore believe the inversion-bearing chromosomes may be classified as maternal effect neo-W's.
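The three-allele model described above can be made concrete with a small enumeration. The following is a sketch only: the allele labels (`Cm`, `amph`, `null`) and function names are hypothetical, the dominance order (thelygenic over amphigenic over arrhenogenic) is taken from the Figure 1C description, and the father is modelled as transmitting a single allele because sperm carry only the maternally derived autosomes.

```python
from collections import Counter

# Dominance order from the text: thelygenic (Cm) > amphigenic > null (arrhenogenic).
# The labels themselves are illustrative, not the authors' notation.
DOMINANCE = ["Cm", "amph", "null"]
PHENOTYPE = {"Cm": "thelygenic", "amph": "amphigenic", "null": "arrhenogenic"}


def maternal_phenotype(genotype):
    """Brood type a female of this two-allele genotype would produce."""
    top = min(genotype, key=DOMINANCE.index)
    return PHENOTYPE[top]


def offspring_classes(mother, paternal_allele):
    """Expected fractions of offspring genotypes, classified by the brood type
    they would confer, for a mother (two alleles) and the single allele
    transmitted by the father."""
    counts = Counter(maternal_phenotype((m, paternal_allele)) for m in mother)
    total = sum(counts.values())
    return {phenotype: count / total for phenotype, count in counts.items()}
```

For a thelygenic mother (`("Cm", "null")`) mated to a male transmitting the null allele, the enumeration gives thelygenic and arrhenogenic classes in equal proportions, matching the 1:1 ratio stated in the text.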

15.
Fran Supek, Tomislav Šmuc. Genetics, 2010, 185(3): 1129-1134
A recent investigation concluded that codon bias did not affect expression of green fluorescent protein (GFP) variants in Escherichia coli, while stability of an mRNA secondary structure near the 5′ end played a dominant role. We demonstrate that combining the two variables using regression trees or support vector regression yields a biologically plausible model with better support in the GFP data set and in other experimental data: codon usage is relevant for protein levels if the 5′ mRNA structures are not strong. Natural E. coli genes had weaker 5′ mRNA structures than the examined set of GFP variants and did not exhibit a correlation between the folding free energy of 5′ mRNA structures and protein expression.
In genomes, natural selection may act on silent sites of codons to make translation of highly expressed genes more efficient, an effect linked primarily to abundances of tRNA isoacceptor molecules (Ikemura 1985; Bulmer 1987; Kanaya et al. 1999). Codon choice may also be linked to formation of secondary structures in mRNA that reduce protein levels, as has been shown with haplotypes of the human COMT gene (Nackley et al. 2006). Kudla et al. (2009) have recently reported an experiment that contributes toward understanding how synonymous codon usage shapes gene expression. They have constructed a library of 154 synthetic variants of a green fluorescent protein (GFP) gene that varied randomly at synonymous sites while retaining the original amino acid sequence. The authors concluded that codon usage (CU) bias did not correlate with protein levels measured as fluorescence of the GFP, but also that the minimum free energy of an mRNA secondary structure in a 42-nucleotide region at [−4,37] that overlaps the start codon (“hairpin stability”) bears a great significance.
CU bias was quantified by the widely used codon adaptation index (CAI) method (Sharp and Li 1987), essentially a measure of the distance of a gene's codon usage to the codon usage of a predefined set of highly expressed genes. The CAI and some of its more recent alternatives, such as measure independent of length and composition (MILC) (Supek and Vlahovicek 2005), have been shown to be a viable surrogate for gene expression in various unicellular organisms. Additionally, in a multiple linear regression of rank fluorescence against a number of sequence-derived attributes, including CAI and the abovementioned hairpin stability, Kudla et al. (2009) did not find CAI to contribute significantly toward the prediction of protein levels, in contrast to the hairpin stability.
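As a rough illustration of how the CAI of Sharp and Li (1987) is computed: each codon receives a relative adaptiveness w (its count in the reference set of highly expressed genes divided by the count of the most frequent synonymous codon), and the CAI of a gene is the geometric mean of w over its codons. The sketch below uses hypothetical function names and a toy reference table; skipping codons absent from the table (including codons with zero reference count) is a simplifying assumption, not the published handling.

```python
import math


def relative_adaptiveness(reference_counts, synonymous_groups):
    """w for each codon: its count in the reference set divided by the count
    of the most used synonymous codon. Codons with zero count are skipped
    here for simplicity (an assumption of this sketch)."""
    w = {}
    for codons in synonymous_groups.values():
        top = max(reference_counts.get(c, 0) for c in codons)
        if top == 0:
            continue
        for c in codons:
            count = reference_counts.get(c, 0)
            if count > 0:
                w[c] = count / top
    return w


def codon_adaptation_index(sequence, w):
    """Geometric mean of w over the codons of an in-frame coding sequence.
    Codons without a w value (e.g. single-codon amino acids left out of the
    table) are ignored."""
    usable = len(sequence) - len(sequence) % 3
    codons = [sequence[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(w[c]) for c in codons if c in w]
    if not logs:
        raise ValueError("no codon of the sequence has a w value")
    return math.exp(sum(logs) / len(logs))
```

With a toy reference set in which AAA is used three times as often as AAG for lysine, a sequence made only of AAA scores 1, and mixing in AAG lowers the score toward the geometric mean of the two w values.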

Both the codon adaptation index and the 5′ mRNA secondary structures influence protein levels in the Kudla et al. data:

The described statistical analyses, however, failed to address the case in which a nonlinear three-way dependency between hairpin stability, codon usage, and fluorescence might exist; data are visualized in Figure 1, A–C, and in figure 2B in Kudla et al. Such complex patterns in data are readily captured by the support vector machines (SVM) algorithm, reviewed in Noble (2006) and Ben-Hur et al. (2008). We have employed the SVM with a radial basis function kernel to regress fluorescence against both hairpin stability and CAI simultaneously (Figure 1B) and computed the Pearson's correlation coefficient in cross-validation (here denoted as Q) between true and predicted values of fluorescence (see File S1). A linear model based solely on hairpin stability as employed by Kudla et al. (Figure 1A) can explain Q² = 38.6% of variance in protein levels, while the nonlinear SVM regression that takes CAI into account explains Q² = 52.2% of variance. The difference in Q is statistically significant at P = 10⁻¹⁹⁰ (paired t-test). Note that Kudla et al. utilize the Spearman rank correlation coefficient (ρ) in their article; the hairpin stability would explain ρ² = 44.6% of the variance in expression levels if the requirement for a linear relationship was abandoned in this manner.

Figure 1. Regression of protein levels against folding free energy of an mRNA hairpin at nucleotides −4 through 37 (A), against the hairpin free energy and the codon adaptation index (Sharp and Li 1987) (B and C), or against the hairpin free energy and the codon frequencies (D and E). The colors show the measured protein levels, while the background shading reflects the protein levels predicted by the specific model. (A) Predictions by linear regression. (B and E) Predictions by a support vector machine with a radial basis function kernel. (C) Predictions by an M5′ regression tree. 
(D) A schematic of the M5′ model, where coefficients in the terminal nodes are derived from data where protein levels, all codon frequencies, and hairpin free energies were normalized to [0,1] to facilitate comparison between the influence of codons, the hairpin stability, and the constant in the regression equation. All coefficients ≥0.1 are in boldface type. In the plots (A–C and E), a slight amount of random “jitter” was introduced to the point positions (at most, 3% of the range of each axis) to better visualize overlapping points. In the plot in E, a single outlying point is not shown. See Figure S2 for the same plots without jitter and with the outlier in E included. R² is the squared Pearson's correlation coefficient between actual and model-predicted protein levels; Q² is similar, but obtained in cross-validation (10-fold, 100 runs), and is a more conservative estimate of regression accuracy.

Figure 2. The distributions of RNA folding free energies of a 42-nucleotide window in the mRNA between positions −4 and +37, where the “A” in the “AUG” start codon has index zero. The distributions are shown separately for the 154 gene variants from Kudla et al. (2009) and for the genes from the E. coli K12 genome. The dotted line indicates the 5th percentile of the E. coli values at −10.9 kcal/mol.

Compared to the SVM, a more interpretable generalization of the data can be achieved by a different nonlinear regression approach, the M5′ tree (Wang and Witten 1997), which recursively divides the data to reduce the variance of the dependent variable within each partition and then builds separate linear models for the partitions. The resulting regression tree (Figure 1C; supporting information, Figure S1) better explains the correlation between protein levels on one side and hairpin stability and CAI on the other side when compared to a linear model employed by Kudla et al. 
that regresses protein levels against hairpin stability only [see figure 2B in Kudla et al. (2009) and Figure 1A]; 9.3% more variance is explained by the M5′, P = 10⁻⁹¹ (paired t-test). An interpretation that follows from the general structure of the M5′ tree (Figure S1) is that, at high mRNA hairpin stability, protein levels will generally be quite low and not dependent on CAI; in contrast, with less stable mRNA hairpins, both hairpin stability and CAI play a role in determining protein levels. In the interpretation of the M5′ tree structure, we would place less emphasis on the exact coefficients of the linear models in the leaves because the reliability of these fine-grained features of the M5′ model can strongly depend on good coverage of all parts of the mRNA–CAI space by data points.
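The cross-validated correlation Q used throughout can be illustrated with a short sketch. The version below fits a one-variable least-squares line on the training folds, predicts the held-out fold, pools all held-out predictions, and correlates them with the measured values. Pooling the predictions before computing the correlation, the fold assignment, and all function names are assumptions made for illustration; the SVM and M5′ models of the analysis would take the place of the simple line fit.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def cross_validated_q(xs, ys, k=10, seed=0):
    """Correlation between measured values and out-of-fold predictions."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
    predicted = [0.0] * len(xs)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in fold:
            predicted[i] = slope * xs[i] + intercept
    return pearson(ys, predicted)
```

Squaring the returned value gives Q²; repeating the procedure with different seeds and averaging would correspond to the repeated 10-fold scheme mentioned in the Figure 1 legend.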

The CAI may not be an optimal summary of codon usage for predicting expression of overexpressed genes:

Regarding use of CAI in the present context, it should be noted that CAI's original purpose was to serve as a proxy for gene expression in conditions of abundance that result in fast growth in the organism's environmental niche. The CAI or related approaches (Supek and Vlahovicek 2005) may not, however, be an ideal representation of codon usage when examining overexpression of a foreign protein at levels that exceed the natural abundances of the host's most highly expressed proteins. This was indeed shown to be the case in a recent article by Welch et al. (2009) in which the authors reported an experiment with heterologous expression of variants of two proteins in E. coli: an antibody fragment and a phage DNA polymerase. Welch et al. found that codon frequencies in general, but not CAI specifically, correlated well with protein levels and postulated that for overexpressed proteins optimal codons would correspond to the codons translated efficiently under amino acid starvation (Elf et al. 2003; Dittmar et al. 2005). Analogously to Welch et al., we now apply our regression algorithms not to the CAI, but directly to the codon frequencies that CAI attempts to summarize in the Kudla et al. data (see File S2). An M5′ regression tree trained on the hairpin stability and codon frequencies (Figure 1D) explains 10.6% more variance (P = 10⁻⁸³, paired t-test) in protein levels than an M5′ tree trained on hairpin stability and CAI (Figure 1C, Figure S1). An SVM regression model trained on the hairpin stability and a simple linear combination of selected codon frequencies (Figure 1E) explains 8.8% more variance (P = 10⁻⁸², paired t-test) than the SVM that uses CAI (Figure 1B). 
An SVM trained on the hairpin stability and the full set of codon frequencies (not shown in Figure 1) explains Q² = 65.0% of variance in the protein abundances, a sizable increase (P ≈ 10⁻²⁶⁰, paired t-test) compared to a linear regression on solely the [−4,37] hairpin stability (Q² = 38.6%) as originally employed by Kudla et al. and also as compared to a set of randomized controls (Q² = 20.1–30.7%; Table S1). Therefore, not relying on a predefined notion of codon optimality, as embodied in the CAI, further strengthens the argument that the correlation of CU and protein levels is far from negligible in this data set.

Additionally, we found some correlation between codon frequencies and 5′ mRNA hairpin stability in the Kudla et al. gene variants (Figure S4). The fact that the two factors were not completely independent adds weight to the relevance of CU to protein levels since one could not be certain that even the variance in protein levels explained by 5′ mRNA structures is wholly due to the structures themselves and not to the confounding variables (here, the codon frequencies).

The M5′ tree trained on codon frequencies (Figure 1D) follows the same general structure as the M5′ tree trained on the CAI (Figure S1) where the codon frequencies become relevant with mRNA hairpins weaker than −9.75 kcal/mol, while with stronger [−4,37] mRNA hairpins protein levels are generally low. Our interpretation is that the lack of a stable secondary structure that could obstruct translational initiation is a necessary but not a sufficient condition for high protein expression. When the initiation phase is unhindered, the bottleneck would shift to the elongation phase in which codon optimality plays an important role. 
In the literature, theoretical models of translation may consider either the initiation (Bulmer 1991) or the elongation phase (Xia 1998) as the rate-limiting step of translation under physiological conditions; we are not aware of such analyses describing translation of artificially overexpressed genes.

The codons identified as relevant by our M5′ model of the Kudla et al. data are different from, but not inconsistent with, those proposed by Welch et al. (Table S2). We anticipate that the rules for codon optimality for overexpression in an Escherichia coli host will become better defined as more large-scale experiments, such as the two discussed here (Kudla et al. 2009; Welch et al. 2009), are carried out.
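For readers who wish to derive codon-frequency features of the kind used in place of the CAI above, a minimal sketch follows. It counts each codon of a coding sequence and divides by the total number of codons. Whether the frequencies in the analysis were normalized per gene in this way, or per amino acid, is not stated here, so the normalization and the function name should be read as assumptions.

```python
from collections import Counter

def codon_frequencies(cds):
    """Fraction of codons in a coding sequence that each codon accounts for.

    Trailing nucleotides that do not complete a codon are ignored.
    """
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3].upper() for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}
```

The resulting dictionary can be expanded into a fixed-length feature vector (one entry per sense codon) and combined with the hairpin folding free energy as input to a regression model.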

The “RNA structure + codon usage” model agrees with independent experimental data and is robust to removal of extreme values:

Our reanalysis of the Kudla et al. data should be viewed in light of the conclusions of Welch et al. (2009) who find that codon usage, but not the 5′ hairpin stability, correlates with protein levels in their data, while noting that their gene variants generally have considerably weaker 5′ mRNA hairpins than the sequences in Kudla et al. Welch et al. reconcile the different outcomes of the two experiments by noting that “inhibition of initiation by especially strong mRNA structure would obscure effects resulting from factors that influence elongation, such as codon usage” (page 9). Here we propose that precisely the same model can be derived solely from the Kudla et al. data. Furthermore, we find that the 154 gene variants from Kudla et al. indeed do have unusually stable 5′ mRNA hairpins (mean free energy = −9.68 kcal/mol) in comparison to natural E. coli genes (mean free energy = −6.15 kcal/mol) (P = 10⁻³⁸ by Mann–Whitney U-test; see Figure 2). The part of the distribution of Kudla et al. gene variants that overlaps with the bulk of the E. coli genes, with 5′ mRNA hairpin free energies lower than ∼ −10 kcal/mol, corresponds to the range where our M5′ model indicates a stronger influence of CU on protein levels (Figure S1, Figure 1D).

We investigate to what extent the presence of a group of sequences extreme in their 5′ mRNA hairpin stabilities in the Kudla et al. data set (left peak in Figure 2) influenced the authors' conclusion that the hairpin stabilities have an overarching influence on protein levels. After removing the sequences below the 5th percentile of the E. coli natural hairpin stabilities (−10.9 kcal/mol), we were left with 109 of the original 154 Kudla et al. sequences. The accuracy of regressing protein levels against mRNA hairpin stability deteriorates greatly (Q² = 18.5%) after removing the 45 sequences, but less so with SVM and M5′ regression that take into account both CU and the hairpin stability (Table 1). Kudla et al. basically captured the difference between these extreme cases, in which very strong 5′ mRNA secondary structures blocked expression, and all other sequences. However, to explain the variation in protein levels within the nonextreme set, hairpin stabilities by themselves are not sufficient and need to be complemented with CU.
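The filtering step described above can be sketched as follows: a cutoff is taken at a low percentile of the folding free energies of natural genes, and only variants whose hairpin free energy is at or above that cutoff (that is, equally or less stable) are retained. The nearest-rank percentile definition, the `dG` key, and the function names are assumptions for illustration; the percentile convention actually used to obtain −10.9 kcal/mol is not specified in the text.

```python
def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, with a minimum rank of 1.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]

def drop_strong_hairpins(variants, natural_energies, pct=5):
    """Keep variants whose folding free energy is >= the chosen percentile
    of the natural-gene energies (more negative means more stable)."""
    cutoff = percentile_nearest_rank(natural_energies, pct)
    kept = [v for v in variants if v["dG"] >= cutoff]
    return kept, cutoff
```

Refitting each regression model on the retained variants and recomputing Q² would then give a comparison of the kind summarized in Table 1.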

TABLE 1

Accuracy of the regression of protein levels against 5′ mRNA hairpin stability or against 5′ mRNA hairpin stability and codon frequencies

Data set | Linear regression, hairpin stability only (%) | SVM, hairpin stability + codon frequencies (%) | M5′, hairpin stability + codon frequencies (%)
Full (n = 154) | 38.6 | 65.0 | 56.7
No strong hairpins (n = 109) | 18.5 | 53.0 | 40.4
The cross-validation correlation coefficient squared (Q²) is compared for the full Kudla et al. data set (154 proteins) and the reduced data set (109 proteins) where mRNA hairpin folding energies are ≥ −10.9 kcal/mol, the 5th percentile of natural E. coli genes.

In addition to measuring protein levels in the 154-sequence data set, Kudla et al. performed an additional experiment where an unstructured 28-codon tag was fused to 5′ ends of 72 (of 154) GFP sequence variants. Adding the tag was found to enhance protein levels, supporting the conclusion of Kudla et al. that 5′ structure of mRNA had a strong influence on protein production. After an analysis of the data, we found (see File S3) that data from this specific experiment are not well suited to serve as a direct verification of our existing M5′ and SVM regression models. Still, we can compare the protein level predictions of our existing SVM model on the same set of sequences before and after adding the unstructured tag. We found that the predicted expression levels have increased for 67 of 72 sequences (Table S3) after adding the tag that fixes 5′ mRNA folding energy at a weak −6.1 kcal/mol, a result consistent with the Kudla et al. experiment. Additionally, we have trained a new SVM regression model only on the tagged 72-sequence set (see File S2) and found that, within this set, SVM regression can again predict GFP levels solely from codon usage (5′ mRNA structure is invariant among these sequences) at Q² = 37.7%. This amount of variance is similar, or even somewhat larger than, the difference in the variance explained by mRNA vs. mRNA+codons (38.6% vs. 65.0%) in the original data. Therefore, codon usage is of similar importance in shaping the protein levels within the tagged 72-sequence set, as it was in the original 154-sequence set.

mRNA 5′ end secondary structure stabilities do not correlate with protein levels for natural E. coli genes:

To further verify our proposed model, we analyzed the relative contributions of mRNA hairpin stabilities and CU on expression levels of natural E. coli genes (see File S2). If the hairpin stabilities were limiting for expression in the range of folding free energies spanned by the E. coli mRNAs, one would expect to see a correlation between the free energy of mRNA 5′ end folding and the abundance of the corresponding protein. We found no such correlation using the folding free energies of the [−4,37] mRNA region (Figure 3) or equal-sized regions centered around the start codon at [−20,21] or on the expected location of a Shine–Dalgarno sequence (Shultzaberger et al. 2001) at [−30,11] (see Figure S3). Unsurprisingly, CAI correlated well with protein levels (Figure 3) in all examined experimental data sets (Lopez-Campistrous et al. 2005; Lu et al. 2007; Ishihama et al. 2008). Therefore, within the boundaries of the mRNA folding free energies spanned by E. coli genes, the CU plays a dominant role in shaping gene expression (or the CU may possibly be shaped by the expression; see Concluding remarks). As for the stronger mRNA hairpins with folding free energies < −11 kcal/mol, they are present in the Kudla et al. data, but are very rare in the E. coli genome, which could be explained by one or both of two scenarios: (i) above a certain threshold, the mRNA hairpin stability may become so detrimental to expression that all the mutants having such hairpins are subject to very strong negative selection and therefore are absent from the genome; and/or (ii) the Kudla et al. data set may not be representative of the genes in the E. coli genome or the mutational processes they undergo; for example, the amino acid sequence of the GFP's beginning might be unusually conducive to forming RNA hairpins. Unless further analyses prove differently, it seems reasonable to surmise that in natural E. 
coli genes mRNA secondary structures would shape expression if they were highly stable, consistent with the finding of a universal (albeit not particularly strong) trend toward avoidance of 5′ mRNA structures in genomes (Gu et al. 2010). However, it can also be concluded at this point, and with more confidence, that at lower secondary structure stabilities the CU has an overarching influence on expression. Such a model of expression-related gene sequence determinants in E. coli is fully consistent with our interpretation of the M5′ regression tree that we have derived from the Kudla et al. data.

Figure 3. Correlations between the E. coli absolute protein abundances measured in three independent experiments (Lopez-Campistrous et al. 2005; Lu et al. 2007; Ishihama et al. 2008) and the codon adaptation index (CAI) or the free energy of folding of a secondary structure in the mRNA [−4,37] region (in kcal/mol; more negative values denote a more stable RNA secondary structure). “ρ” is the Spearman's rank correlation coefficient.
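The Spearman rank correlation reported in Figure 3 is the Pearson correlation computed on ranks, with tied values sharing their average rank. A minimal standard-library sketch is given below; the function names are illustrative, and the inputs would be per-gene protein abundances paired with either CAI values or folding free energies.

```python
def average_ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5
```

Because only ranks enter the calculation, any monotonic relationship between the two variables yields |ρ| = 1, which is why ρ can exceed the Pearson coefficient when the dependence is nonlinear.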

Concluding remarks:

We argue that Kudla et al. worked with a set of gene sequences in which strong mRNA secondary structures (that effectively abolished expression) were frequent enough to mask the relevance of codon frequencies on protein levels when examined only with linear regression methods. While mRNA secondary structures can certainly occur when designing synthetic genes, it is highly questionable to what extent Kudla et al.'s conclusion that CU is of little importance for expression would be generally valid for biotechnological applications, especially since we have shown that the influence of CU is nevertheless present even in the Kudla et al. data. What is beyond doubt, however, is that a strong 5′ mRNA secondary structure can be a roadblock in heterologous expression, and therefore the synthetic gene variants harboring such structures should be avoided. The more specific rules regarding the exact location of the hairpin on the gene sequence, the hairpin's length, or the tolerable levels of folding free energy will have to be established by further experimentation.

A recent algorithm for estimating the efficiency of ribosomal binding sites from the mRNA sequence (Salis et al. 2009) explicitly takes into account the folding free energy of RNA secondary structures, along with other factors. When protein overexpression is desired, the conclusions of Welch et al. and (by our reanalysis) the Kudla et al. data indicate that CU should be optimized in addition to the ribosome binding site sequence to ensure that both initiation and elongation phases of translation are free of impediments.

On the basis of their results, Kudla et al. also discuss the evolutionary link between the CU of natural genes and the expression levels of proteins for which they code. 
They propose that selection for translational efficiency acts at a global level in cells; the codons that accelerate elongation would be preferred in a highly expressed gene not because they facilitate production of that particular protein, but to free up ribosomes for the rate-determining initiation phase of translation of the total cellular mRNA pool. Effectively, the flow of causality between CU and expression would be reversed in comparison to the established view. This hypothesis should be critically reevaluated because it depends on the assertion that manipulating a gene's CU cannot cause protein levels to increase, an assertion poorly supported by the Kudla et al. data.

16.
17.
Caffeic acid O-methyltransferase (COMT) is a bifunctional enzyme that methylates the 5- and 3-hydroxyl positions on the aromatic ring of monolignol precursors, with a preference for 5-hydroxyconiferaldehyde, on the way to producing sinapyl alcohol. Lignins in COMT-deficient plants contain benzodioxane substructures due to the incorporation of 5-hydroxyconiferyl alcohol (5-OH-CA), as a monomer, into the lignin polymer. The derivatization followed by reductive cleavage method can be used to detect and determine benzodioxane structures because of their total survival under this degradation method. Moreover, partial sequencing information for 5-OH-CA incorporation into lignin can be derived from detection or isolation and structural analysis of the resulting benzodioxane products. Results from a modified derivatization followed by reductive cleavage analysis of COMT-deficient lignins provide evidence that 5-OH-CA cross couples (at its β-position) with syringyl and guaiacyl units (at their O-4-positions) in the growing lignin polymer and then either coniferyl or sinapyl alcohol, or another 5-hydroxyconiferyl monomer, adds to the resulting 5-hydroxyguaiacyl terminus, producing the benzodioxane. This new terminus may also become etherified by coupling with further monolignols, incorporating the 5-OH-CA integrally into the lignin structure.

Lignins are polymeric aromatic constituents of plant cell walls, constituting about 15% to 35% of the dry mass (Freudenberg and Neish, 1968; Adler, 1977). Unlike other natural polymers such as cellulose or proteins, which have labile linkages (glycosides and peptides) between their building units, lignins’ building units are combinatorially linked with strong ether and carbon-carbon bonds (Sarkanen and Ludwig, 1971; Harkin, 1973). It is difficult to completely degrade lignins. 
Lignins are traditionally considered to be dehydrogenative polymers derived from three monolignols, p-coumaryl alcohol 1h (which is typically minor), coniferyl alcohol 1g, and sinapyl alcohol 1s (Fig. 1; Sarkanen, 1971). They can vary greatly in their composition in terms of their plant and tissue origins (Campbell and Sederoff, 1996). This variability is probably determined and regulated by different activities and substrate specificities of the monolignol biosynthetic enzymes from different sources, and by the carefully controlled supply of monomers to the lignifying zone (Sederoff and Chang, 1991).

Figure 1. The monolignols 1, and marker compounds 2 to 4 resulting from incorporation of novel monomer 15h into lignins: thioacidolysis monomeric marker 2, dimers 3, and DFRC dimeric markers 4.

Recently there has been considerable interest in genetic modification of lignins with the goal of improving the utilization of lignocellulosics in various agricultural and industrial processes (Baucher et al., 2003; Boerjan et al., 2003a, 2003b). Studies on mutant and transgenic plants with altered monolignol biosynthesis have suggested that plants have a high level of metabolic plasticity in the formation of their lignins (Sederoff et al., 1999; Ralph et al., 2004). Lignins in angiosperm plants with depressed caffeic acid O-methyltransferase (COMT) were found to derive from significant amounts of 5-hydroxyconiferyl alcohol (5-OH-CA) monomers 15h (Fig. 1) substituting for the traditional monomer, sinapyl alcohol 1s (Marita et al., 2001; Ralph et al., 2001a, 2001b; Jouanin et al., 2004; Morreel et al., 2004b). NMR analysis of a lignin from COMT-deficient poplar (Populus spp.) has revealed that novel benzodioxane structures are formed through β-O-4 coupling of a monolignol with 5-hydroxyguaiacyl units (resulting from coupling of 5-OH-CA), followed by internal trapping of the resultant quinone methide by the phenolic 5-hydroxyl (Ralph et al., 2001a). 
When the lignin was subjected to thioacidolysis, a novel 5-hydroxyguaiacyl monomer 2 (Fig. 1) was found in addition to the normal guaiacyl and syringyl thioacidolysis monomers (Jouanin et al., 2000). Also, a new compound 3g (Fig. 1) was found in the dimeric products from thioacidolysis followed by Raney nickel desulfurization (Lapierre et al., 2001; Goujon et al., 2003).

Further study with the lignin using the derivatization followed by reductive cleavage (DFRC) method also confirmed the existence of benzodioxane structures, with compounds 4 (Fig. 1) being identified following synthesis of the authentic parent compounds 9 (Fig. 2). However, no 5-hydroxyguaiacyl monomer could be detected in the DFRC products. These facts imply that the DFRC method leaves the benzodioxane structures fully intact, suggesting that the method might therefore be useful as an analytical tool for determining benzodioxane structures that are linked by β-O-4 ethers. Using a modified DFRC procedure, we report here on results that provide further evidence for the existence of benzodioxane structures in lignins from COMT-deficient plants, that 5-OH-CA is behaving as a rather ideal monolignol that can be integrated into plant lignins, and demonstrate the usefulness of the DFRC method for determining these benzodioxane structures.

Figure 2. Synthesis of benzodioxane DFRC products 12 (see later in Fig. 6 for their structures). i, NaH, THF. ii, Pyrrolidine. iii, 1g or 1s, benzene/acetone (4/1, v/v). iv, DIBAL-H, toluene. v, Iodomethane-K2CO3, acetone. vi, Ac2O pyridine.

18.
19.
20.
Sylvain Glémin 《Genetics》2010,185(3):939-959
GC-biased gene conversion (gBGC) is a recombination-associated process mimicking selection in favor of G and C alleles. It is increasingly recognized as a widespread force in shaping the genomic nucleotide landscape. In recombination hotspots, gBGC can lead to bursts of fixation of GC nucleotides and to accelerated nucleotide substitution rates. It was recently shown that these episodes of strong gBGC could give spurious signatures of adaptation and/or relaxed selection. There is also evidence that gBGC could drive the fixation of deleterious amino acid mutations in some primate genes. This raises the question of the potential fitness effects of gBGC. While gBGC has been metaphorically termed the “Achilles' heel” of our genome, we do not know whether interference between gBGC and selection merely has practical consequences for the analysis of sequence data or whether it has broader fundamental implications for individuals and populations. I developed a population genetics model to predict the consequences of gBGC on the mutation load and inbreeding depression. I also used estimates available for humans to quantitatively evaluate the fitness impact of gBGC. Surprising features emerged from this model: (i) Contrary to classical mutation load models, gBGC generates a fixation load independent of population size and could contribute to a significant part of the load; (ii) gBGC can maintain recessive deleterious mutations for a long time at intermediate frequency, in a similar way to overdominance, and these mutations generate high inbreeding depression, even if they are slightly deleterious; (iii) since mating systems affect both the selection efficacy and gBGC intensity, gBGC challenges classical predictions concerning the interaction between mating systems and deleterious mutations, and gBGC could constitute an additional cost of outcrossing; and (iv) if mutations are biased toward A and T alleles, very low gBGC levels can reduce the load. 
A robust prediction is that the gBGC level minimizing the load depends only on the mutational bias and population size. These surprising results suggest that gBGC may have nonnegligible fitness consequences and could play a significant role in the evolution of genetic systems. They also shed light on the evolution of gBGC itself.

GC-biased gene conversion (gBGC) is increasingly recognized as a widespread force in shaping genome evolution. In different species, gene conversion occurring during double-strand break recombination repair is thought to be biased toward G and C alleles. In heterozygotes, GC alleles undergo a kind of molecular meiotic drive that mimics selection (reviewed in Marais 2003). This process can rapidly increase the GC content, especially around recombination hotspots (Spencer et al. 2006), and, more broadly, can affect genome-wide nucleotide landscapes (Duret and Galtier 2009a). For instance, it is thought to play a role in shaping isochore structure evolution in mammals (Galtier et al. 2001; Meunier and Duret 2004; Duret et al. 2006) and birds (Webster et al. 2006). Direct experimental evidence of gBGC mainly comes from studies in yeast (Birdsell 2002; Mancera et al. 2008; but see Marsolier-Kergoat and Yeramian 2009) and humans (Brown and Jiricny 1987). However, associations between recombination and the nucleotide landscape and frequency spectra biased toward GC alleles provide indirect evidence in very diverse organisms (
Organisms | Direct evidence | Indirect evidence | Achilles' heel evidence | References
Yeast | Meiotic segregation bias | | | Mancera et al. (2008)
 | Mitotic and mitotic heteromismatch correction bias | Correlation between GC and recombination | | Birdsell (2002)
Mammals | Mitotic heteromismatch correction bias | | | Brown and Jiricny (1987)
 | | Correlation between GC*/GC and recombination | | Duret and Arndt (2008); Meunier and Duret (2004)
 | | Biased frequency spectrum toward GC alleles | | Galtier et al. (2001); Spencer et al. (2006)
 | | | GC bias associated with high dN/dS near recombination hotspot | Berglund et al. (2009); Galtier et al. (2009)
Birds | | Correlation between GC and recombination | | International Chicken Genome Sequencing Consortium (2004)
Turtles | | Correlation between GC and chromosome size | | Kuraku et al. (2006)
Drosophila | | Correlation between GC and recombination | | Marais et al. (2003)
 | | Biased frequency spectrum toward GC alleles | | Galtier et al. (2006)
Nematodes | | Correlation between GC and recombination | | Marais et al. (2001)
Grasses | | Correlation between GC and outcrossing/selfing | | Glémin et al. (2006)
 | | Correlation between GC* and recombination and outcrossing/selfing | Outcrossing increases dN/dS for genes with high GC* | Haudry et al. (2008)
Green algae | | Correlation between GC and recombination | | Jancek et al. (2008)
Paramecium | | Correlation between GC and chromosome size | | Duret et al. (2008)
The impact of gBGC on noncoding sequences and synonymous sites has been studied in depth, especially because of confounding effects with selection on codon usage (Marais et al. 2001). More recently, Galtier and Duret (2007) pointed out that gBGC may also interfere with selection when affecting functional sequences. They argued that gBGC could leave spurious signatures of adaptive selection and proposed to extend the null hypothesis of molecular evolution. Indeed, gBGC can lead to a ratio of nonsynonymous (dN) over synonymous (dS) substitutions above one (Berglund et al. 2009; Galtier et al. 2009), i.e., a typical signature of positive selection (Nielsen 2005). This hypothesis has been widely debated for human-accelerated regions (HARs). These regions are extremely conserved across mammals but show evidence of accelerated evolution along the human lineage, which has been interpreted as evidence of positive selection (Pollard et al. 2006a,b; Prabhakar et al. 2006, 2008). On the contrary, other authors argued that patterns observed in HARs, such as the AT → GC substitution bias, the absence of a selective sweep signature, or the propensity to occur within or close to recombination hotspots, are more likely explained by gBGC rather than positive selection (Galtier and Duret 2007; Berglund et al. 2009; Duret and Galtier 2009b; but see also Pollard et al. 2006a who also suggested that gBGC might play a role in HARs evolution). It is thus crucial to take gBGC into account when interpreting genomic data.

Moreover, Galtier and Duret (2007) initially suggested that gBGC hotspots could contribute to the fixation of slightly deleterious AT → GC mutations and could represent the Achilles' heel of our genome. This hypothesis was reinforced later in primates, with evidence of gBGC-driven fixation of deleterious mutations in proteins (Galtier et al. 2009). 
A similar result was also found in some grass species, whose genomes are also supposed to be affected by gBGC (Glémin et al. 2006). Haudry et al. (2008) compared two outcrossing and two selfing grass species and showed that GC-biased genes exhibit higher dN/dS ratio in outcrossing than in selfing lineages. The reverse pattern would be expected under pure selective models because of the reduced selection efficacy in selfers (Charlesworth 1992; Glémin 2007). This pattern is in agreement with a genomic Achilles' heel associated with outcrossing, while gBGC is inefficient in selfing species because they are mainly homozygous.

Twenty years ago, Bengtsson (1990) already pointed out that biased conversion can generally affect the mutation load. The mutation load is the reduction in the mean fitness of a population due to mutation accumulation, which could lead to population extinction if it is too high (Lynch et al. 1995). At that time, Bengtsson concluded that “it is impossible to know if biased conversion plays a major role in determining the magnitude of the mutation load in organisms such as ourselves, but the possibility must be considered and further investigated” (Bengtsson 1990, p. 186). Now, one can propose that gBGC could be such a widespread biased conversion process. It thus appears timely to thoroughly investigate the fitness consequences of gBGC through its potential effects on the dynamics of deleterious mutations. The fitness consequences of gBGC were also pointed out as a major future issue to be addressed by Duret and Galtier (2009a). In addition to the load, deleterious mutations have many other evolutionary consequences (for review see Charlesworth and Charlesworth 1998). They are thought to be the main determinant of inbreeding depression, i.e., the reduction in fitness of inbred individuals compared to outbred ones. 
They also play a key role in the evolution of genetic systems (sexual reproduction and recombination, inbreeding avoidance mechanisms, ploidy cycles), of senescence, or in the degeneration of nonrecombining regions, such as Y chromosomes. So far, we know little, if anything, about how gBGC might affect these processes.

In his seminal work, Bengtsson (1990) did not address several important points. First, he did not include genetic drift in his model. Nearly neutral mutations, for which drift and selection are of similar intensities, are the most damaging ones because they can drift to fixation, unlike strongly deleterious mutations that are maintained at low frequency (Crow 1993; Lande 1994, 1998). While gBGC intensities are rather weak (Birdsell 2002; Spencer et al. 2006), they could markedly affect the fate of nearly neutral mutations (see also Galtier et al. 2009). Second, Bengtsson did not study the effect of gene conversion on inbreeding depression, although he showed that recessive mutations, mostly involved in inbreeding depression, are the most affected by gene conversion. Third, he did not envisage systematic GC bias with its opposite effects on A/T and G/C deleterious alleles. Fourth, while he noted that selfing affects both the efficacy of selection and that of conversion, he did not fully investigate the effect of mating systems. On one hand, selfing is efficient in purging strongly deleterious mutations causing inbreeding depression. However, since selfing is expected to increase drift, weakly deleterious mutations can fix in selfing species, contributing to the so-called “drift load” (Charlesworth 1992; Glémin 2007). Self-fertilizing populations are thus expected to exhibit low inbreeding depression and high drift load. On the other hand, gBGC, and thus its cost, vanishes as the selfing rate and homozygosity increase (Marais et al. 2004).
gBGC could thus challenge classical views on mating systems, and it was even speculated that gBGC could affect their evolution (Haudry et al. 2008).

Here I present a population genetics model that includes mutation, selection, drift, and gBGC, which extends previous studies (Gutz and Leslie 1976; Lamb and Helmi 1982; Nagylaki 1983a,b; Bengtsson 1990). I specifically examine how gBGC can affect inbreeding depression and the mutation load. I also focus on the effect of mating system, which is especially interesting with regard to the interaction between biased conversion and selection. Finally, I discuss how these results could give insight into how gBGC evolved.

Impacts of gBGC on inbreeding depression:

Inbreeding depression is defined as the reduction in fitness of selfed (and more generally inbred) individuals compared to outcrossed individuals (Equation 15), where $\bar{W}_o$ and $\bar{W}_s$ are the mean fitness of outcrosses and selfcrosses, respectively (Charlesworth and Charlesworth 1987; Charlesworth and Willis 2009). The approximation in Equation 15 is very good in most conditions, because $\bar{W}_o \approx 1$ under both weak (s ≪ 1) and strong selection (x ≪ 1) (see Glémin et al. 2003). Similar to the load, considering both sites for which either S or W alleles are deleterious, in proportions q and 1 − q, respectively, we get Equation 16.
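The definition above can be illustrated with a minimal single-locus sketch. It is not the model of this article: it assumes a random-mating parental population with deleterious-allele frequency x, genotype fitnesses 1, 1 − hs, and 1 − s, and one generation of selfing that halves heterozygosity. The function names are hypothetical, and the textbook ratio 1 − W̄s/W̄o is used for inbreeding depression.

```python
def mean_fitness_outcrossed(x, s, h):
    """Mean fitness of outcrossed progeny (Hardy-Weinberg proportions assumed)."""
    het = 2.0 * x * (1.0 - x)   # heterozygote frequency
    hom = x * x                 # deleterious homozygote frequency
    return 1.0 - het * h * s - hom * s


def mean_fitness_selfed(x, s, h):
    """Mean fitness of progeny after one generation of selfing.

    Assumption: parents are in Hardy-Weinberg proportions, so selfing halves
    the heterozygote frequency and splits the lost heterozygotes equally
    between the two homozygous classes.
    """
    het = x * (1.0 - x)
    hom = x * x + 0.5 * x * (1.0 - x)
    return 1.0 - het * h * s - hom * s


def inbreeding_depression(x, s, h):
    """Exact ratio form: 1 - W_s / W_o."""
    return 1.0 - mean_fitness_selfed(x, s, h) / mean_fitness_outcrossed(x, s, h)


def inbreeding_depression_approx(x, s, h):
    """Approximation obtained by setting W_o = 1 in the denominator."""
    return 0.5 * s * (1.0 - 2.0 * h) * x * (1.0 - x)
```

Under these assumptions the two functions agree closely whenever the mean fitness of outcrosses is near 1, that is, when s is small or when the deleterious allele is rare, and both vanish for additive mutations (h = 1/2).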
gBGC and the genetic basis of inbreeding depression in panmictic populations:
In infinite panmictic populations without gBGC, inbreeding depression depends only on mutation rates and dominance levels. Only partially recessive mutations (h < 1/2) contribute to inbreeding depression, and the more recessive they are, the higher the inbreeding depression (Charlesworth and Charlesworth 1987). In finite populations, deterministic results hold for strongly deleterious mutations (s ≫ 1/Ne), which contribute mostly to inbreeding depression. Unlike for the load, weakly deleterious mutations (s ≲ 1/Ne) contribute little to inbreeding depression (Figure 4, a and c, and see Bataillon and Kirkpatrick 2000).

Figure 4. Inbreeding depression (×10⁶) as a function of s without (a and c) or with (b and d) gBGC (b = 0.0002). (a and b) h = 0.2: thick lines, N = 5000; thin lines, N = 10,000; dashed lines, N = 50,000; dotted lines, N = 100,000. (c and d) N = 10,000: thick lines, h = 0.4; thin lines, h = 0.2; dashed lines, h = 0.1; dotted lines, h = 0.05. u = 10⁻⁶, λ = 2.

As for the load, gBGC affects both the magnitude and the structure of inbreeding depression. In infinite populations, and more generally for strongly deleterious alleles (Nes ≫ 1), replacing x by xeq given by Equations 4 in Equations 15 and 16 leads to Equations 17a, 17b, and 17c. The effect of gBGC on inbreeding depression is not monotonic. As for the load, gBGC increases inbreeding depression if b > hs(1 − 2q/(q + λ − qλ)). However, unlike the load, inbreeding depression decreases under strong gBGC: it tends to 0 as b increases, while the load tends to qs (Equation 10c). An analysis of Equation 17b shows that mutations that maximize inbreeding depression are those that also maximize the load, i.e., S deleterious mutations with s ≈ 2b.

In finite populations, inbreeding depression must be integrated over the Φ distribution, which leads to Equation 18 (see also Glémin et al. 2003).
While it is not possible to get an analytical expression of Equation 18, numerical computations (see appendix b) show that S deleterious mutations with s ≈ 2b also maximize inbreeding depression in finite populations (Figure 4). More broadly, inbreeding depression is maximal under the overdominant-like selection regime (gray area in Figure 2). Once again, even low to moderate gBGC markedly affects the genetic structure of inbreeding depression. First, mutations of intermediate effects contribute the most to inbreeding depression, i.e., up to one order of magnitude higher than strongly deleterious mutations (compare Figure 4a with 4b). Second, even nearly additive mutations can have a substantial effect (compare Figure 4c with 4d).

Since little is known about the distribution of dominance coefficients, especially the dominance of mildly deleterious mutations (of the order of b), it is difficult to quantitatively predict the full impact of gBGC on inbreeding depression. We can conclude that, on average, gBGC should increase inbreeding depression. However, further insight into mutational parameters is crucial to assess the quantitative impact of gBGC.
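The condition quoted in this subsection, b > hs(1 − 2q/(q + λ − qλ)), can be evaluated numerically. The sketch below is only an illustration of that inequality and of the rule of thumb s ≈ 2b. The function names are hypothetical, and the parameter values in the usage note are chosen to match those of Figure 4, with q = 0.5 as an additional assumption.

```python
def gbgc_threshold(h, s, q, lam):
    """Bias level above which gBGC increases inbreeding depression.

    Implements h*s*(1 - 2q/(q + lam - q*lam)) as quoted in the text, where
    q is the proportion of sites at which the S allele is deleterious and
    lam is the mutational bias toward AT.
    """
    return h * s * (1.0 - 2.0 * q / (q + lam - q * lam))


def gbgc_increases_inbreeding_depression(b, h, s, q, lam):
    """True when the conversion bias b exceeds the threshold."""
    return b > gbgc_threshold(h, s, q, lam)


def most_damaging_selection_coefficient(b):
    """Selection coefficient of the S deleterious mutations that maximize
    inbreeding depression, using the approximation s = 2b from the text."""
    return 2.0 * b
```

For example, with h = 0.2, s = 0.001, q = 0.5, and λ = 2, the threshold is hs/3, so a bias of b = 0.0002 lies above it. When q approaches 1 the threshold becomes negative, meaning that any positive bias satisfies the inequality.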

Joint effect of gBGC and mating system on the load and inbreeding depression:

Selfing, or more generally inbreeding, slightly reduces the segregating load through the purging of recessive mutations (Ohta and Cockerham 1974), but can substantially increase the fixation load because of the effective population size reduction under inbreeding: Ne = αN/(1 + F) (see above and Pollak 1987; Nordborg 1997; Glémin 2007). In numerical examples, I assumed that α decreases with F according to the background selection model (Charlesworth et al. 1993; Nordborg et al. 1996), as in Glémin (2007). With gBGC, selfing thus has two opposite effects on the fixation load. Selfing increases the drift load sensu stricto but decreases the fixation load due to gBGC. A surprising consequence is that the load can be higher in outcrossing than in selfing populations (Figure 5). Quantitatively this is also expected, even with a gBGC hotspot affecting just 3% of the genome (Figure 5).

Figure 5. Effective population size (a and b) and the load (×10⁶) (c–f) as a function of F for different gBGC intensities (thick lines, b = 0; thin lines, b = 0.0001; dashed lines, b = 0.0002; dotted lines, b = 0.0005). The effective population size depends on F under the background selection (BS) model (Charlesworth et al. 1993), using Equations 16 and 17 in Glémin (2007), where U is the genomic deleterious mutation rate, R is the genomic recombination rate, sd is the mean selection coefficient against strongly deleterious mutations, and hd is their dominance coefficient. N = 10,000, U = 0.2, hd = 0.1, and sd = 0.05. (a, c, and e) R = 5, “weak” BS; (b, d, and f) R = 0.5, “strong” BS. (c and d) Load averaged over half GC and half AT deleterious alleles, with a bias in favor of AT alleles. (e and f) Load averaged over 10% of GC deleterious alleles and 90% of AT deleterious alleles with a bias in favor of AT alleles; see Figure 3. h = 0.5, u = 10⁻⁶, and λ = 2.

Generally, the effect of selfing is simpler for inbreeding depression.
Purging, Ne reduction, and suppression of gBGC contribute to decreasing inbreeding depression in selfing populations (Figure 6a). However, there are special cases in which maximum inbreeding depression is reached for intermediate selfing rates (Figure 6b). In such cases, in outcrossing populations, gBGC is strong enough to sweep polymorphism out and reduce inbreeding depression (b > s, regime 1 in Figure 2). As the selfing rate increases, gBGC declines, and the selection dynamics become overdominant-like (regime 2, Figure 2), thus maximizing inbreeding depression. For high selfing rates, gBGC vanishes (regime 3 in Figure 2) and deleterious alleles are either purged or fixed if there is substantial drift. This is similar to the effect of selfing on inbreeding depression caused by asymmetrical overdominance, where inbreeding depression also peaks for intermediate selfing rates (Ziehe and Roberds 1989; Charlesworth and Charlesworth 1990). In the present case, the range of parameters leading to this peculiar behavior is narrow because the overdominant-like region depends on the selfing rates and can vanish either for low or for high selfing rates (Figure 2).

Figure 6. Inbreeding depression (×10⁶) as a function of F for different gBGC intensities (thick lines, b = 0; thin lines, b = 0.0001; dashed lines, b = 0.0002; dotted lines, b = 0.0005). Inbreeding depression is averaged over half GC and half AT deleterious alleles. The effective population size depends on F as in Figure 5 (same parameters). (a) s = 0.002; (b) s = 0.0005; (c) s = 0.0002. h = 0.2, u = 10⁻⁶, and λ = 2.
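The two opposite effects of selfing described in this section can be sketched numerically. The code below is an illustration under stated assumptions, not the model used for Figures 5 and 6: it takes F = σ/(2 − σ) for a selfing rate σ, the standard Ne = αN/(1 + F) with α supplied by the caller (the background selection formula for α is not reproduced here), and it assumes that the effective conversion bias scales as b(1 − F) because conversion acts only in heterozygotes. All function names are hypothetical.

```python
def inbreeding_coefficient(selfing_rate):
    """Equilibrium inbreeding coefficient F for a given selfing rate."""
    return selfing_rate / (2.0 - selfing_rate)


def effective_population_size(N, F, alpha=1.0):
    """Ne = alpha * N / (1 + F); alpha < 1 would represent background selection."""
    return alpha * N / (1.0 + F)


def effective_gbgc(b, F):
    """Assumed scaling: gBGC acts only in heterozygotes, so b is reduced by (1 - F)."""
    return b * (1.0 - F)


def summarize(N, b, selfing_rates, alpha=1.0):
    """Return (selfing rate, F, Ne, effective b) tuples for a range of selfing rates."""
    rows = []
    for sigma in selfing_rates:
        F = inbreeding_coefficient(sigma)
        rows.append((sigma, F, effective_population_size(N, F, alpha), effective_gbgc(b, F)))
    return rows
```

Calling `summarize(10000, 0.0005, [0.0, 0.5, 1.0])` shows drift increasing (Ne falls from N toward N/2 when α = 1) while the effective bias falls toward 0 under complete selfing, which is the qualitative trade-off discussed above.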

Minimum load and the evolution of gBGC and recombination landscapes:

Given that gBGC may have deleterious fitness consequences, it is surprising that it evolved in many taxa (Duret and Galtier 2009a). Birdsell (2002) initially suggested that gBGC may have evolved as a response to mutational bias toward AT (λ > 1, here). Indeed, I show that a minimum load is reached for weak gBGC (b ≈ ln(λ)/(4N), Equation 14). This result is very general whatever the distribution of fitness effects of mutations (appendix d). However, the range of optimal gBGC is narrow, and gBGC increases the load as soon as b > ln(λ)/(2N) (appendix c). In humans, using N = 10,000 and λ = 2, gBGC levels that minimize the load are ∼1.17 × 10⁻⁵, i.e., one order of magnitude lower than the average bias observed in recombination hotspots (Myers et al. 2005). However, selection on conversion modifiers will not necessarily minimize the load because of gametic disequilibrium generated between modifiers and fitness loci (Bengtsson and Uyenoyama 1990). Selection for limitation of somatic AT-biased mutations could also have selected for GC-biased mismatch repair machinery (Brown and Jiricny 1987). If the bias level that would be selected for somatic reasons is >ln(λ)/(2N), a side effect would be the generation of a substantial load at the population level. Finally, it is interesting to note that when synonymous codon positions are under selection for translation accuracy, optimal gBGC levels can be higher than gBGC levels that minimize the protein load, especially when most optimal codons end in G or C.

Conversely, gBGC could also affect the evolution of recombination landscapes, which could evolve to reduce the gBGC load.
Surprisingly, for a given recombination/conversion level, the hotspot distribution does not appear to be optimal (Nishant and Rao 2005). One can speculate that the hotspot localization outside genes could be a response to avoid the deleterious effects of gBGC.

Up to now, these verbal arguments have not been assessed theoretically (but see Bengtsson and Uyenoyama 1990 for a different kind of conversion bias). Population genetics models are necessary to test these hypotheses concerning the evolution of gBGC and recombination landscapes and to pinpoint the key parameters that might govern their evolution.
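The two bias levels discussed at the start of this section, the load-minimizing value b ≈ ln(λ)/(4N) and the level ln(λ)/(2N) beyond which gBGC increases the load, are simple to evaluate. The sketch below only applies these two approximations as written in the text; the function names are hypothetical.

```python
import math


def load_minimizing_gbgc(N, lam):
    """Approximate bias that minimizes the load: ln(lam) / (4N)."""
    return math.log(lam) / (4.0 * N)


def load_increasing_threshold(N, lam):
    """Bias above which gBGC increases the load: ln(lam) / (2N)."""
    return math.log(lam) / (2.0 * N)


def gbgc_increases_load(b, N, lam):
    """True when b exceeds the ln(lam) / (2N) threshold."""
    return b > load_increasing_threshold(N, lam)
```

With N = 10,000 and λ = 2, these approximations evaluate to roughly 1.7 × 10⁻⁵ and 3.5 × 10⁻⁵, the same order of magnitude as the optimum quoted in the text. The bias values used in Figures 5 and 6 (b = 0.0001 to 0.0005) all lie above the second value, so they fall in the range where gBGC increases the load under these approximations.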

gBGC and the evolution of mating systems:

Deleterious mutations also play a crucial role in the evolution of mating systems. They are the main source of inbreeding depression, which balances the automatic advantage of selfing. The drift load is also thought to contribute to the extinction of selfing species. Since they are mainly homozygous, selfing species are mostly free from gBGC and its deleterious impacts. I discuss below how this might affect the evolution of mating systems.
Inbreeding depression and the shift in mating systems:
Inbreeding depression plays a key role in the evolution of mating systems (Charlesworth and Charlesworth 1987; Charlesworth 2006b). Since it balances the automatic advantage of selfing, high inbreeding depression favors outcrossing, while selfing can evolve when it is low. Moreover, selfing helps to purge strongly deleterious mutations, thus decreasing inbreeding depression. This positive feedback reinforces the disruptive selection on the selfing rate and prevents the transition from selfing to outcrossing (Lande and Schemske 1985).

Theoretical results suggest that, in most conditions, gBGC would reinforce inbreeding depression in outcrossing populations (Figure 6), which would prevent the evolution of selfing. Conversely, if selfing is initially selected for, recurrent selfing would reduce the load through both purging and avoidance of gBGC. Under this scenario, gBGC would reinforce disruptive selection on mating systems. However, under some conditions (see Figure 6), inbreeding depression peaks at intermediate selfing rates, as observed for asymmetrical overdominance (Ziehe and Roberds 1989; Charlesworth and Charlesworth 1990). In theory, this could prevent the shift toward complete selfing and maintain stable mixed mating systems (Charlesworth and Charlesworth 1990; Uyenoyama and Waller 1991). However, this pattern is observed under restrictive conditions and it is very unlikely on the whole-genome scale. Dominance patterns are crucial for predicting inbreeding depression, especially with gBGC. Unlike for the load, it is thus difficult to evaluate the quantitative impact of gBGC on inbreeding depression. However, increased inbreeding depression in outcrossing species subject to gBGC seems to be the most likely scenario.
gBGC and the long-term evolution of mating systems:
In the long term, the gBGC-induced load also challenges the “dead-end hypothesis,” which posits that, because of the reduction of selection efficacy, self-fertilizing species would accumulate weakly deleterious mutations, eventually leading to extinction (Takebayashi and Morrell 2001). Because of gBGC, not drift, outcrossing species could also accumulate a load of weakly deleterious mutations (Figure 7), and they could suffer from a higher load than highly self-fertilizing species. Haudry et al. (2008) found that in two outcrossing grass species, but not in two self-fertilizing ones, the dN/dS ratio is significantly higher for genes exhibiting GC enrichment. They speculated that substitutions in these genes might contribute to increasing the load in these two outcrossing grass species. Such results are still very sparse. In plants, evidence of strong gBGC is mainly restricted to grasses (but see Wright et al. 2007). It will be necessary to conduct more in-depth studies to assess the phylogenetic distribution of gBGC in plants and other hermaphrodite organisms and to further test the genomic Achilles' heel hypothesis in relation to mating systems. While theoretically possible, the quantitative effect of gBGC on the evolution of mating systems remains a new, open, and challenging question.

Conclusion:

I have shown that the interaction between gBGC and selection might have surprising qualitative consequences on load and inbreeding depression patterns. Given the few quantitative data available on gBGC levels and selection intensities (mainly in humans), it turns out that even weak genome-wide gBGC can have significant fitness impacts. gBGC should be taken into account not only for sequence analyses (Berglund et al. 2009; Galtier et al. 2009), but also for its potential fitness consequences, for instance concerning genetic diseases. Interference between gBGC and selection also gives rise to new questions on the evolution of mating systems. However, most of the challenging conclusions given here have yet to be quantitatively evaluated. Quantification of gBGC and its interaction with selection in various organisms will be crucial in the future.
