Glycobiology Advance Access originally published online on January 29, 2007
Glycobiology 2007 17(6):35R-56R; doi:10.1093/glycob/cwm010
REVIEW
The glycosyltransferases of Mycobacterium tuberculosis: roles in the synthesis of arabinogalactan, lipoarabinomannan, and other glycoconjugates
3 Department of Microbiology, Immunology, and Pathology, Colorado State University, Fort Collins, CO 80523
4 Unité de Génétique Mycobactérienne, Institut Pasteur, 75724 Paris Cedex 15, France
1 To whom correspondence should be addressed; Tel: +1-970 491 6700; Fax: 970 491 1815; e-mail: patrick.brennan{at}colostate.edu
Received on October 20, 2006; revised on January 22, 2007; accepted on January 23, 2007
Abstract
Several human pathogens are to be found within the bacterial genus Mycobacterium, notably Mycobacterium tuberculosis, the causative agent of tuberculosis, one of the most threatening of human infectious diseases, with an annual lethality of about two million people. The characteristic mycobacterial cell envelope is the dominant feature of the biology of M. tuberculosis and other mycobacterial pathogens, based on sugars and lipids of exceptional structure. The cell wall consists of a peptidoglycan-arabinogalactan-mycolic acid complex beyond the plasma membrane. Free-standing lipids, lipoglycans, and proteins intercalate within this complex, complement the mycolic acid monolayer and may also appear in a capsular-like arrangement. The consequences of these structural oddities are an extremely robust and impermeable cell envelope. This review reflects on these entities from the perspective of their synthesis, particularly the structural and functional aspects of the glycosyltransferases (GTs) of M. tuberculosis, the dominating group of enzymes responsible for the terminal stages of their biosynthesis. Besides the many nucleotide-sugar dependent GTs with orthologs in prokaryotes and eukaryotes, M. tuberculosis and related species of the order Actinomycetales, in light of the highly lipophilic environment prevailing within the cell envelope, carry a significant number of GTs of the GT-C class dependent on polyprenyl-phosphate-linked sugars. These are of special emphasis in this review.
Key words: Mycobacterium / glycosyltransferase / classification / arabinogalactan / lipoarabinomannan
Introduction
Glycosylation events are among the most common and important enzymatic reactions in nature. Still, the responsible glycosyltransferases (GTs) are in general poorly understood. Gathering information about GTs is difficult, mainly because the enzymes are unstable, often membrane-associated, and present in the cell in very low concentrations. In bacteria, the majority of GTs are involved in the synthesis of, e.g., glycolipids, peptidoglycan (PG), and lipooligosaccharides (LOSs), all essential components of the cell envelope, and can thus be suitable targets for drug development against bacterial pathogens. In Mycobacterium tuberculosis, some of the first-line tuberculosis (TB) drugs target cell wall synthesis, but their specific targets and mechanisms of inhibition are not well defined. Given this, and the serious problem of drug resistance, particularly multidrug-resistant TB, a better understanding of mycobacterial cell wall biosynthesis is required in order to elucidate the targets of existing drugs and to find new ones. In this context, the many uncharacterized GTs of M. tuberculosis are of particular interest. This review summarizes current information on characterized and putative GTs in Mycobacterium spp. Extra attention has been given to a dozen open reading frames (ORFs) recently proposed as polyprenyl-dependent GTs (Liu and Mushegian 2003
The known and putative roles of GTs in mycobacterial cell wall biosynthesis
The envelope of Mycobacterium spp. is a source of unique carbohydrates. A complex, consisting of mycolic acids, the heteropolysaccharide arabinogalactan (AG), and PG, constitutes "the core" of the cell wall (Figure 1) (reviewed in Crick et al. 2001
-glucans (Daffé and Draper 1998
ová et al. 1995
Details of the structures of the two dominating heteropolysaccharides of the mycobacterial cell wall, AG and LAM, are shown in Figure 1. It is well established that the reducing end of AG consists of a Rha-α1,3-GlcNAc disaccharide, which is attached in phosphodiester linkage to some of the muramic acids of PG (McNeil et al. 1990
α1,5-linked Araf with several α1,3-linked branch points has been suggested as the core structure (Daffé et al. 1990
-D-Araf] constitute attachment sites for mycolic acids. Together with PG, AG forms a substantial covalently linked network located between the plasma membrane and the mycolic acid layer. These barriers in concert make the mycobacterial cell wall extremely robust and difficult to penetrate.
Unlike AG, LAM is a noncovalently linked component of the cell envelope and may be anchored in the plasma membrane, in the mycolic acid layer, or in both, via its phosphatidyl-myo-inositol (PI) unit. The reducing end of LAM (Figure 1) shares structural similarities with the PI-mannosides (PIMs; Figure 2) in that the inositol residues of the PI of both the PIMs and LAM are mannosylated at the 2- and the 6-positions. The mannan of LAM is an extension of the PIMs and is composed of a linear α1,6-linked mannan backbone, frequently branched with single α1,2-linked mannoses, leading to a mannan of about 20-25 residues (Chatterjee et al. 1991; Khoo et al. 1996). The arabinan of LAM is endowed with a D-arabinan structure more variable than that in AG. A recent study of the LAM-arabinan of M. smegmatis suggested the occurrence of an Ara18 motif (Figure 1B) resembling the internal structure of AG-arabinan. However, the length of the terminal extensions linked at the nonreducing ends of this motif seems to vary in LAM-arabinan (Shi et al. 2006). Each LAM carries about 50-80 Araf residues (Khoo et al. 1996), but little is known about the number of arabinan chains attached to the mannan core of LAM. An important feature of LAM of pathogenic species of mycobacteria (members of the M. tuberculosis complex, M. leprae, and M. avium) is the presence of "Man-caps" (hence the name Man-LAM) consisting of single α-Man or short mannooligosaccharides attached to the expected β1,2-Araf termini of the D-arabinan; these units were shown to promote the binding and entry of these mycobacteria into antigen-presenting cells through the C-type lectins, mannose receptor and DC-SIGN (Schlesinger et al. 1994; Geijtenbeek et al. 2003; Maeda et al. 2003; Koppel et al. 2004). In most other mycobacteria, the nonreducing termini of LAM either lack Man-caps or are sporadically modified by inositol-phosphate units, as is the case in M. smegmatis (Nigou et al. 2003).
Although the chemical composition of mycobacterial cell walls and the chemistry of individual components are well understood, biosynthetic pathways are only now being defined, aided considerably by the sequences of mycobacterial genomes (Cole et al. 1998
Biosynthesis of sugar donors
The biosynthetic pathways for most of the nucleotide-sugar donors in M. tuberculosis have strong analogy to those of other prokaryotic species or have been characterized (Figure 4), notably UDP-α-D-GlcpNAc, UDP-α-D-Glcp, UDP-α-D-Galp, and GDP-α-D-Manp; UDP-α-D-Galf, which is an isomeric product of UDP-α-D-Galp, is synthesized by the well characterized UDP-α-D-Galp mutase (Glf) (Weston et al. 1997
-form in the remaining NDP-sugars listed above.
C50-P, not the usual undecaprenyl-phosphate (Mahapatra et al. 2005
ová et al. 2005
GTs involved in PG synthesis
PG synthesis in M. tuberculosis has been assumed to be similar to that of Escherichia coli (van Heijenoort 2001). However, PG of mycobacteria carries a variety of novel modifications (Mahapatra et al. 2005, and references therein). For instance, in M. tuberculosis and M. smegmatis, the muramic acid residues contain a mixture of the N-acetyl and N-glycolyl derivatives, a modification suggested to take place after the synthesis of the UDP-muramyl-NAc but before the formation of the UDP-muramyl-pentapeptides. The subsequent syntheses of Lipid I and Lipid II have been defined in E. coli. MraY transfers MurNAc-pentapeptide to undecaprenyl-phosphate and the product, Lipid I, is then further glycosylated by MurG through the use of UDP-GlcNAc to form Lipid II (Figure 3) (Ikeda et al. 1991). Proteins with strong homology to MraY and MurG of E. coli have been annotated in the M. tuberculosis genome (Cole et al. 1998), but their enzymatic activity has not yet been confirmed. It is believed that Lipid II is translocated to the periplasmic side of the plasma membrane, where the glycopeptide moiety of Lipid II is polymerized in the final assembly of PG by the activities of two putative penicillin-binding proteins (Rv0050 and Rv3682). These bifunctional proteins carry transglycosylase and transpeptidase activities and are, besides MurG, the only implicated GTs in the PG pathway (Figure 3) (Bhakta and Basu 2002).
GTs involved in linkage unit and galactan synthesis
The synthesis of the "linkage unit", on which the AG is assembled, is initiated by a transfer of GlcNAc-phosphate from UDP-GlcNAc to the acceptor C50-P (Figure 3). This activity has been associated with Rv1302, due to its significant homology to WecA of E. coli, a well characterized GlcNAc-phosphotransferase (Amer and Valvano 2002). A Rha residue is then added in an α1,3 configuration by the recently described rhamnosyltransferase, WbbL (Rv3265c), to complete the linkage unit (Mills et al. 2004). Subsequent galactan synthesis involves at least two galactosyltransferases (GalTs) with specificity for the Galf donor, UDP-Galf. The transfer of the first Galf is most probably performed by a GalT that specifically recognizes the Rha residue in the linkage unit and creates a β1,4 linkage. Rv3782 is the best-known candidate for this initial catalytic step (Mikušová et al. 2006). Several studies strongly support the principle that GlfT (Rv3808c) is a bifunctional GalT catalyzing the arrangement of the two differently linked Galf in the formation of linear galactan (Mikušová et al. 2000; Kremer et al. 2001; Rose et al. 2006).
GTs involved in synthesis of the arabinans of AG and LAM
The variety of glycosidic linkages in both types of D-arabinans implies that the biosynthetic pathways should involve several arabinosyltransferases (AraTs). Because the arabinans of AG and LAM are composed of D-Araf with its origin solely in C50-P-Araf (Wolucka et al. 1994), those AraTs have to be dependent on this lipid-linked sugar donor. Thus far, the only candidates shown to be involved in arabinan synthesis are the Emb proteins (EmbA, EmbB, and EmbC) (Belanger et al. 1996) and AftA (Rv3792) (Alderwick et al. 2006). The Emb proteins play a key but largely undefined role in the synthesis of the arabinan components of both AG and LAM, with EmbA and EmbB contributing to AG synthesis (Escuyer et al. 2001), whereas EmbC is involved in the synthesis of LAM (Zhang et al. 2003). A more comprehensive discussion of the Emb proteins and their function in arabinan biosynthesis follows in the section "The Emb proteins and their relationship to other GTs". Disruption of the ortholog of aftA in C. glutamicum and use of a cell-free assay based on the recombinant M. tuberculosis enzyme provided evidence that AftA catalyzes the addition of the first Araf residue from C50-P-Araf to the galactan domain of AG (Alderwick et al. 2006). Additional potential AraTs may be found among the putative polyprenyl-dependent GTs listed in Table II (section "Polyprenyl-dependent GTs of M. tuberculosis").
ManTs involved in synthesis of LAM
The structural description of the PI-mannosides, spanning from PIM1 to PIM6 in different acylated forms (Lee and Ballou 1965
α1,6-linked mannan backbone of LM and LAM. However, such a pathway should include a branch point at PIM4, one direction leading to LM/LAM, the other, through addition of two consecutive α1,2-linked Manp residues, resulting in PIM6, an apparent dead-end product, not involved in LM and subsequent LAM synthesis (Figure 2) (Morita et al. 2004
Early work had proposed that mannosylation of the more polar/mannosylated PIMs and LM involves both GDP-Man and C50-P-Man (Yokoyama and Ballou 1989). However, amphomycin was shown to inhibit the synthesis of PIM4, PIM5, and PIM6, suggesting that these enzymatic steps utilize C50-P-Man as a donor substrate (Morita et al. 2004). Indeed, PimE (Rv1159) was recently identified as a probable C50-P-Man-dependent ManT responsible for the formation of PIM5 from PIM4 (Morita et al. 2006). Whether PimE also transfers the sixth mannose to form both PIM5 and PIM6 remains to be determined. Our laboratory recently created the M. smegmatis mutant ΔMSMEG4250 (ortholog to Rv2181 of M. tuberculosis), lacking α1,2-linked Manp on the mannan backbone of LAM, strongly suggesting this protein to be an α1,2-ManT in the synthesis of mature LAM (Kaur et al. 2006). The phenotype of mutant ΔMSMEG4250, which completely lacked LM but still produced a truncated form of LAM, has raised new speculations about the biosynthesis of the mannan of LM/LAM. An earlier hypothesis of a "straight" pathway, in which linear LM served as the substrate for a branching enzyme leading to the formation of mature LM, subsequently used in LAM biosynthesis (Besra et al. 1997), has now been complemented with alternative routes (Kaur et al. 2006). The most innovative one suggests that smaller C50-P-linked mannooligosaccharides are synthesized by Rv2181 and (an)other GT(s) and then used for chain extension at the nonreducing end of PIM4 to form mature LM (Kaur et al. 2006). Future experiments should clarify these new ideas. Another ManT implicated in LAM biosynthesis of M. tuberculosis is Rv1635c, which was recently shown to be responsible for transferring the first Manp residue in Man-capping of Man-LAM (Dinadayala et al. 2006). Both Rv2181 and Rv1635c have been proposed as polyprenyl-dependent GTs (see also section "Polyprenyl-dependent GTs of M. tuberculosis"), and characterization as such would be in accordance with the hypothesis stated earlier.
Additional GTs involved in synthesis of other glycoconjugates of Mycobacterium spp.
Trehalose is a precursor for the synthesis of the trehalose-containing LOSs, TMM, TDM, and several methyl-branched fatty acid-containing glycolipids such as sulfatides, and di-, tri-, and poly-acyltrehaloses. Mycobacteria have three alternative routes for trehalose synthesis (Figure 4) (De Smet et al. 2000), including the classical condensation of UDP-Glc and Glc-6-phosphate leading to an α,α-1,1-glycosidic linkage in the product, trehalose-6-phosphate. The glucosyltransferase involved, Rv3490 (OtsA), has been identified in M. tuberculosis (Pan et al. 2002), as has the phosphatase Rv3372 (OtsB2), which is responsible for dephosphorylation of the trehalose-6-phosphate (Murphy et al. 2005). In addition to this pathway, trehalose can be interconverted from maltose by trehalose synthase (TreS) (Nishimoto et al. 1996), or from glycogen involving the two enzymes TreY and TreZ (Maruta et al. 1996). Of these three pathways, the OtsAB-dependent route was recently shown to be predominant in M. tuberculosis, and OtsB2 was shown to be essential for growth (Murphy et al. 2005).
It has been proposed that mycolic acids can be transferred via a mycolyl-mannosylphosphoheptaprenol (Besra et al. 1994) to trehalose-6-phosphate, arising from the OtsAB pathway, to yield phosphorylated TMM, which then can be dephosphorylated to yield TMM (Takayama et al. 2005). The screening of a transposon mutant library of Corynebacterium matruchotii for mutants with defects in corynemycolic acid synthesis led Wang et al. to propose that a putative polyprenyl-dependent GT, orthologous to Rv1459c of M. tuberculosis (section "Polyprenyl-dependent GTs of M. tuberculosis"), and a neighboring ATP-binding cassette (ABC) transporter were somehow involved in this process (Wang et al. 2006). However, all of this remains to be proven, since the intermediate, mycolyl-mannosylphosphoheptaprenol, has so far only been found in M. smegmatis (Besra et al. 1994). The subsequent synthesis of TDM, from two TMM molecules, and the transfer of mycolates to the nonreducing ends of AG have been shown to involve a protein complex composed of antigens 85A, 85B, and 85C (Wiker and Harboe 1992; Belisle et al. 1997; Jackson et al. 1999).
Other carbohydrate-containing components in the cell envelope of M. tuberculosis are the phthiocerol/PGLs and p-hydroxybenzoic acid derivatives (p-HBADs), of which the PGLs were shown to be important virulence factors (Reed et al. 2004; Tsenova et al. 2005). Their biosyntheses involve the sequential addition of the three basic sugars, Rha, Rha, and Fuc, catalyzed by the three GTs Rv2957, Rv2958c, and Rv2962c, and several methylation steps, two of which are catalyzed by the methyltransferases Rv2952 and Rv2959c (Figure 5) (Perez, Constant, Lemassu, et al. 2004; Perez, Constant, Laval, et al. 2004). The PGLs of M. tuberculosis, M. leprae, and other species (reviewed in Brennan 1988) should not be confused with other mycobacterial 6-O-methyl hexose-containing cell wall glycolipids such as the glycopeptidolipids (GPLs), found in M. avium and M. smegmatis (Brennan et al. 1981; Daffé et al. 1983). The Rha moiety of the GPLs of M. avium is transferred by the characterized rhamnosyltransferase RtfA (Eckstein et al. 1998), and its strong ortholog, Gtf3, was recently suggested to carry an equivalent GT activity in M. smegmatis (Deshayes et al. 2005).
Other specific mycobacterial glycoconjugates are the 6-O-methyl glucose-containing lipopolysaccharides (MGLPs) of M. tuberculosis, M. bovis BCG, M. smegmatis, M. phlei, M. xenopi, and M. leprae (Lee 1966
α1,4-ManT activity, with specificity for GDP-Man, has been characterized in the biosynthesis of MMPs (Weisman and Ballou 1984
The cell wall "capsule" of mycobacteria contains D-arabinomannan similar to the arabinomannan domain of LAM, α-D-mannan, and a branched α-D-glucan (Lemassu and Daffé 1994; Ortalo-Magne et al. 1995). Glucan consists of linear α1,4-linked glucosyl residues occasionally substituted at position 6 with mono-, di-, tri-, tetra-, penta-, or hexa-glucosyl residues and thereby shares structural features with the glycogen stored in the cytosol of mycobacteria (Dinadayala et al. 2004). This similarity in structure suggests that they may use the same biosynthetic pathway.
Classification and structural aspects of GTs from M. tuberculosis
Families and structural superfamilies of GTs
The catalytic mechanisms described for sugar transfer reactions lead to either inversion or retention of the anomeric configuration of the sugar. The enzymatic formation of an α- or β-glycosidic bond is consequently determined by (i) the mechanism used by the enzyme and (ii) the anomeric configuration of the donor substrate (Sinnott 1990
Despite the fact that the catalytic mechanism used by enzymes within a GT family is consistent, families within a superfamily can use different mechanisms. It has been suggested that similar structural elements are employed in families having the same fold, irrespective of the stereochemistry of the glycosylation reaction (Persson et al. 2001
Classification of putative GTs of M. tuberculosis
It is difficult to estimate a rational number of GTs needed for biosynthesis of the complex cell wall structure of mycobacteria. However, GTs seem to be the largest group of enzymes involved in synthesis of the mycobacterial cell envelope. According to CAZy, among the approximately 3900 ORFs found in the genome of M. tuberculosis H37Rv (Cole et al. 1998), about 41 ORFs (approximately 1%) encode putative GTs. The majority of these classified GTs of M. tuberculosis are proposed to have a requirement for NDP-sugar donors, and they belong to families that have representatives in all kingdoms of life. In contrast, the polyprenyl-dependent GTs of M. tuberculosis are more confined to the order Actinomycetales and mostly form their own GT families. Besides the CAZy classification, Wimmerova et al. used fold recognition analysis to study the genome of M. tuberculosis and found another 15 proteins with predicted similarity to the structural fold of GT-A and GT-B (Wimmerova et al. 2003). However, none of them has yet been shown to be a GT by biochemical means.
Most of the characterized and uncharacterized GT genes listed in Tables I and II are evenly distributed on the M. tuberculosis H37Rv chromosome. However, there are at least two obvious GT-containing gene clusters, each holding nine proposed GT genes; one is located in the region of Rv1500 to Rv1526c and the other spans from Rv3779 to Rv3809c (Figure 7). The former cluster contains mostly GTs proposed to utilize NDP-sugars, but their functions are still unidentified, except for LosA (Rv1500), which was recently annotated as a GT involved in LOS biosynthesis [(Burguiere et al. 2005); see section "Sugar-nucleotide dependent GTs of M. tuberculosis"]. The latter GT-containing gene cluster, on the other hand, has been described earlier as "the cell wall biosynthetic cluster" (Cole et al. 1998; Belanger and Inamine 2000) and includes several now characterized proteins implicated in AG, LAM, and mycolic acid biosynthesis (Mikušová et al. 2000). Interestingly, as many as five ORFs in this cluster are possibly polyprenyl-P-sugar dependent GTs (Table II), which is in agreement with the hypothesized use of C50-P-Araf and C50-P-Man in the pathways of AG and LAM (Figure 3). Notably, a third, smaller cluster with three proposed GT genes (Rv2174, Rv2181, and Rv2188c) is located downstream of the fts/mur-gene cluster (Rv2150c-2158c), which carries a large number of genes involved in PG biosynthesis and cell division. Rv2181 was recently shown to be a ManT in LM biosynthesis (Kaur et al. 2006).
Sugar-nucleotide dependent GTs of M. tuberculosis
Comparisons of GTs of mycobacterial origins to GTs of known X-ray structures at the level of amino acid sequence and predicted secondary structure can help us towards structural and functional understanding of these enzymes. To explore such relationships, we investigated the amino acid sequences of all of the ORFs of M. tuberculosis H37Rv (listed in Tables I and II) that had been proposed to belong to any of the three superfamilies, GT-A, GT-B, or GT-C. Most of the classified GTs of M. tuberculosis are proposed as NDP-sugar dependent, and they belong to the families GT-1, GT-2, GT-4, GT-20, GT-28, and GT-35, listed in Table I. Of these six families, GT-2 is the only one with GT-A fold; the remaining five belong to superfamily GT-B. Herein follows a closer examination of the properties of these GT families.
GTs of M. tuberculosis with a proposed GT-A fold. The characteristic fold of superfamily GT-A is here represented by SpsA (Figure 6), a GT-2 protein involved in spore formation of Bacillus subtilis. GT-2 enzymes use an inverting mechanism, generally leading to a glycosidic bond in β-configuration, and carry a DxD motif found to bind a divalent cation as part of immobilization of the sugar-nucleotide donor (Charnock and Davies 1999). As many as 16 ORFs of M. tuberculosis H37Rv (Table I) are members of GT-2 and they are consequently proposed to share the GT-A fold. So far, five of them (LosA, Ppm1, WbbL, Rv3782, and GlfT) have been biochemically characterized. The four that are highly conserved in the sequenced genomes of M. leprae, M. smegmatis, C. glutamicum, and Nocardia farcinica (Ppm1, WbbL, Rv3782, and GlfT) have been characterized as key enzymes in AG and LM synthesis. LosA (the ortholog of Rv1500 in M. marinum) was shown to be involved in LOS biosynthesis (see further below) (Burguiere et al. 2005).
The resolved crystal structure of SpsA showed that at least three invariant residues, Asp39, Asp98, and Asp99 in the N-terminal domain (Figure 6; Table I), are involved in binding of the donor substrate. Furthermore, Asp191 in the C-terminal domain (not shown here) may play a role in the catalytic event (Charnock and Davies 1999). The selected alignment of SpsA with the 16 GT candidates of M. tuberculosis shows that conserved Asp residues can be identified at comparable positions, and always within the two loops following Nβ2 and Nβ4 of SpsA (Table I). Also, predicted secondary structures in this region (not shown) suggest high similarity to the X-ray structure determined for SpsA, implying that these 16 proteins of H37Rv carry an NDP-sugar binding site with a structure similar to that of SpsA.
Rv3786c, which has not yet been classified by CAZy but shares the characteristic features of GT-2 proteins, is located in the cell wall biosynthetic cluster discussed earlier (Figure 7). We therefore suggest that it may be functioning as an additional GalT in the galactan synthesis of AG, in conjunction with the two characterized GalTs, Rv3782 and GlfT (Rv3808c). The latter contains over 600 residues and is thereby significantly larger than other members of the GT-2 family, which commonly have 200-400 amino acids (Table I). However, its bifunctional role in galactan synthesis (Figure 3) (Mikušová et al. 2000) is a plausible reason for its extended size.
Unlike most of the GT-2 enzymes, WbbL is known to generate a product with the glycosidic bond in α-configuration. However, it is in fact an inverting enzyme, since it utilizes dTDP-β-Rha in concert with an inverting mechanism. Rv2957 is another GT of this family that may function similarly. A recent study and the restricted distribution of Rv2957 to mycobacterial species producing fucosylated forms of PGLs and p-HBADs support the idea that this enzyme catalyzes the transfer of Fuc in the synthesis of these molecules (Perez, Constant, Lemassu, et al. 2004). The transferred sugar is in α-configuration (Figure 5A). Thus, with Rv2957 classified as an inverting enzyme, the most likely substrate is GDP-β-Fuc. One of the most conserved GTs in the GT-2 family is Ppm1, a polyprenyl-P-Man synthase (Gurcha et al. 2002). However, this protein probably has dual functionality in that it carries two domains. Its N-terminal domain contains seven predicted TM segments and shows high sequence and topology similarities to an apolipoprotein N-acyltransferase recently characterized in E. coli (Robichon et al. 2005), while its soluble C-terminal domain carries the ManT activity (Baulard et al. 2003). Interestingly, these two domains are expressed as two different proteins in other mycobacteria (Gurcha et al. 2002). Rv0539 has 25% amino acid identity to the soluble C-terminal domain of Ppm1 and may also function as a polyprenyl-P-Man synthase. If that is the case, these two genes may complement each other, since neither of them has been classified as essential (Sassetti et al. 2003).
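The reasoning applied to WbbL and Rv2957 follows from a single rule: a retaining enzyme preserves the anomeric configuration of the donor, whereas an inverting enzyme flips it. A minimal Python sketch of that rule is given here purely as an illustration. The function name and the string labels are hypothetical, and the donor/mechanism pairs are taken from the text rather than from any database.

```python
def product_anomer(donor_anomer: str, mechanism: str) -> str:
    """Predict the anomeric configuration of the newly formed glycosidic bond.

    donor_anomer: "alpha" or "beta" (configuration of the activated sugar donor)
    mechanism:    "retaining" or "inverting" (as assigned to the GT family)
    """
    if donor_anomer not in ("alpha", "beta"):
        raise ValueError(f"unknown anomer: {donor_anomer!r}")
    if mechanism == "retaining":
        # Retention: the product keeps the configuration of the donor.
        return donor_anomer
    if mechanism == "inverting":
        # Inversion: the product has the opposite configuration.
        return "beta" if donor_anomer == "alpha" else "alpha"
    raise ValueError(f"unknown mechanism: {mechanism!r}")


# Cases discussed in the text (illustrative labels only).
cases = {
    "typical GT-2 enzyme, alpha-linked donor": ("alpha", "inverting"),
    "WbbL with dTDP-beta-Rha": ("beta", "inverting"),
    "Rv2957 with GDP-beta-Fuc (proposed)": ("beta", "inverting"),
}

for label, (donor, mechanism) in cases.items():
    print(f"{label}: {product_anomer(donor, mechanism)}-linked product")
```

Read in reverse, the same rule is what allows the donor to be inferred: an α-linked product from an inverting enzyme implies a β-linked donor.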
As many as six ORFs of the large GT cluster, Rv1500-Rv1526c (Figure 7), are in accord with the properties of the GT-2 family. Interestingly, none of those H37Rv proteins are highly conserved among Actinomycetales spp., suggesting involvement in glycosylations of either components specific to M. tuberculosis or glycoconjugates with high structural variability among mycobacterial spp., such as the trehalose-based LOS components (Brennan 1988). A strong homolog to Rv1500, LosA of M. marinum (76% amino acid identity), was recently suggested as a GT involved in LOS biosynthesis (Burguiere et al. 2005), but the sugar donor used by LosA was not identified. Both Rv1500 and LosA diverge from most GTs of the GT-2 family by being predicted to have TM domains, located in the C-terminus of the protein. A very interesting comparison can be drawn to GtrB, a polyprenyl-P-Glc synthase expressed by Shigella flexneri bacteriophage SfX. The lipid-linked sugar synthesized by GtrB is utilized in O-antigen glucosylation leading to a serotype specific S. flexneri (Guan et al. 1999). GtrB shares 31% amino acid identity with Rv1500, and a topology study has shown that GtrB actually contains two TM domains in the C-terminus (Korres et al. 2005). Analogous functions are thereby likely for Rv1500 and LosA, and they may catalyze the formation of polyprenyl-P-hexose (from polyprenyl-P and NDP-hexose), which is then used in a coupled reaction in LOS biosynthesis.
GTs of M. tuberculosis with a proposed GT-B fold. The five families GT-1, GT-4, GT-20, GT-28, and GT-35 contain some 15 M. tuberculosis ORFs (Table I). Resolved crystal structures of at least one representative from each family have shown that all belong to the GT-B superfamily; GtfD from Amycolatopsis orientalis (aoGtfD; GT-1), OtsA from E. coli (ecOtsA; GT-20), MurG from E. coli (ecMurG; GT-28), and MalP from E. coli (ecMalP; GT-35) display the characteristic features of a GT-B structure (Watson et al. 1999; Ha et al. 2000; Gibson et al. 2002; Mulichak et al. 2004). In the absence of a published structure for GTs of the GT-4 family, a distant relationship between GT-4 and GT-28 [exemplified in Table I by AceA from Acetobacter xylinum (axAceA)] allowed for structural modeling of GT-4 enzymes into the GT-B fold (Abdian et al. 2000; Edman et al. 2003). This proposed classification of the GT-4 family has now been confirmed with the recent determination of the three-dimensional structure of PimA from M. smegmatis (msPimA; GT-4). MsPimA has all of the expected features of GTs from the GT-B superfamily, being organized into two β/α/β Rossmann fold domains, with a deep fissure at the interface that includes the catalytic center (Guerin et al. Forthcoming).
A comparison between the NDP-sugar binding regions in these GT-B families shows that the structural elements are very similar in spite of the different catalytic mechanisms they employ. The two inverting enzymes, aoGtfD and ecMurG of the GT-1 and GT-28 families, respectively, contain a loop between β-strand Cβ4 and α-helix Cα4, which is particularly noteworthy (Table I). His309 of aoGtfD and Arg261 of ecMurG belong to this loop and may play a role in catalysis (Ha et al. 2001; Mulichak et al. 2004). Arg261 is part of the motif [K/R]X7E, which is strictly conserved among members of the GT-28 family. In comparison, the related motif [D/E]X7E is conserved among retaining enzymes of the GT-4 and GT-20 families in an analogous structural element, as found in ecOtsA and predicted for axAceA (Table I) (Abdian et al. 2000; Gibson et al. 2002). The importance of this motif among GT-4 enzymes has been shown by site-directed mutagenesis. Exchange of the first glutamic acid in this motif in axAceA and msPimA (E287A and E274A, respectively) resulted in a complete loss of activity of these α-ManTs (Abdian et al. 2000; Guerin et al. Forthcoming). These results, and studies of others (Yep et al. 2004), suggest that an acidic residue in the first position of this motif is essential and may play a catalytic role in enzymes of the GT-4 and GT-20 families.
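The two motifs, [K/R]X7E and [D/E]X7E, translate directly into regular expressions, which is how such patterns are commonly located in a protein sequence. The sketch below is an illustration only: the helper name is hypothetical, and the toy sequence is invented for the example and is not taken from any of the enzymes discussed. A match on sequence alone does not establish that the residues lie in the relevant loop; that requires alignment or structural information.

```python
import re

# Motifs as written in the text, with X7 meaning any seven residues.
GT28_LIKE = r"[KR].{7}E"      # [K/R]X7E, conserved in GT-28
GT4_GT20_LIKE = r"[DE].{7}E"  # [D/E]X7E, conserved in retaining GT-4/GT-20


def find_motif(pattern: str, sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched residues) for every match.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    lookahead = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(sequence)]


# Invented toy sequence, NOT a real protein (assumption for illustration).
toy_sequence = "MAGRLLVTGGAEPWDSSAIKLYEQ"

print(find_motif(GT28_LIKE, toy_sequence))      # [(4, 'RLLVTGGAE')]
print(find_motif(GT4_GT20_LIKE, toy_sequence))  # [(15, 'DSSAIKLYE')]
```

In practice the search would be run over the sequences in Table I and the hits compared with the predicted position of the Cβ4/Cα4 loop.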
M. tuberculosis has five ORFs classified into GT-1, a family characterized by enzymes using the inverting mechanism. As most of them use α-linked donors, this leads to sugar transfers in β-configuration. However, this family also includes many GTs with specificity for β-linked donors, such as dTDP-β-Rha and TDP-β-vancosamine, which consequently give rise to α-linked products. The five proteins from M. tuberculosis within this GT-1 classification also seem to use β-linked donors. Two of them, Rv2958c and Rv2962c, function in the biosynthesis of PGLs and p-HBADs (Perez, Constant, Lemassu, et al. 2004) and are likely to use dTDP-β-Rha as a donor substrate (Figure 5). The remaining three (Rv1524, Rv1526c, and Rv2739) may also utilize β-linked sugar donors, as they share 25-60% amino acid identity with the rhamnosyltransferase RtfA of M. avium. Since RtfA acts in the synthesis of GPLs (Eckstein et al. 1998), which so far have not been found in M. tuberculosis, other but similar functions should be designated for these RtfA homologs. In particular, Rv1524 and Rv1526c share significant overall similarity (approximately 25% amino acid identity) and several motifs with aoGtfD, the latter responsible for vancomycin synthesis. The sequence stretch of aoGtfD shown in Table I contains the motif H308HGSAGT, including the proposed catalytic residue His309 (Mulichak et al. 2004). This motif is conserved in Rv1524 and Rv1526c, suggesting a strong functional relationship. Further investigations may clarify the connection between aoGtfD and these putative GTs from M. tuberculosis.
The retaining GT-4 family is the largest family of GTs responsible for the formation of anomeric linkages in the α-configuration. This family contains many bacterial GTs involved in the synthesis of cell envelope structures, such as lipopolysaccharides and capsular polysaccharides (http://afmb.cnrs-mrs.fr/CAZY). M. tuberculosis contains seven representatives of this class, including the two ManTs PimA (Rv2610c) and PimB (Rv0557) involved in PIM biosynthesis (Schaeffer et al. 1999; Kordulakova et al. 2002). A third identified protein of the GT-4 type is MshA (Rv0486), a GlcNAc-inositol-phosphate synthase which catalyzes the first step in mycothiol biosynthesis, important for cellular detoxification of thiol-reactive agents (Newton and Fahey 2002; Newton et al. 2006). MshA is so far the only identified GT in M. tuberculosis not associated with cell wall biosynthesis.
It is likely that the remaining enzymes of the GT-4 type (Rv0225, Rv1212c, Rv2188c, Rv3032) are involved in the biosynthesis of LAM, glycogen, α-glucan, and methylated (lipo)-polysaccharides of mycobacteria, since these molecules solely or essentially contain glycosidic bonds in the α-configuration. A study on the ortholog of Rv1212c in C. glutamicum (Tzvetkov et al. 2003) and our preliminary data on Rv1212c and Rv3032 from M. tuberculosis suggest that these GTs act as α1,4-glucosyltransferases in the synthesis of glucan/glycogen and MGLPs, respectively (Jackson et al. unpublished data). All glycogen synthases of archaeal, prokaryotic, and eukaryotic origin classified to date in CAZy belong to either the GT-3 or the GT-5 family, both suggested to share the GT-B structural fold. Surprisingly, M. tuberculosis has no GT belonging to either of these two families. Instead, the putative glycogen synthase, Rv1212c, is a GT-4 protein. This unexpected finding further emphasizes the structural similarities between these three families of GTs. Apart from Rv0557 and Rv1212c, which lack orthologs in M. leprae, and Rv3032, which lacks strong orthologs in corynebacteria, all other GT-4 proteins of M. tuberculosis H37Rv are well conserved in the four genomes evaluated in this study (Table I).
Three other families of GTs, namely GT-20, GT-28, and GT-35, are represented by only one ORF each in M. tuberculosis (Table I). The enzymes that catalyze synthesis of trehalose-6-phosphate (OtsA) constitute their own family of GT-20 enzymes. Rv3490 belongs to this family and has already been characterized as a trehalose-phosphate synthase (Pan et al. 2002). The GT-28 family includes many glycolipid-synthesizing GTs such as MurG, an essential bacterial GT that catalyzes the synthesis of Lipid II. In M. tuberculosis, Rv2153c is a very strong candidate for MurG, since it shares the typical motifs, including RX7E, and has a sequence identity of 33% to ecMurG (Ha et al. 2000). Enzymes of the GT-35 family function as glycogen or starch phosphorylases and do not utilize NDP-sugars but instead degrade α1,4-linked glucans to Glc-1-phosphate. Rv1328 (GlgP) shares high homology with this family and has been proposed to be involved in the metabolism of either glycogen or α-glucan, or both (Schneider et al. 2000). Homologs of MurG, OtsA, and GlgP are well conserved among the four species of Actinomycetales compared here, except for the latter (GlgP), which again lacks an ortholog in M. leprae (Table I). Incidentally, the two proposed glycogen/glucan biosynthesizing enzymes, Rv1212c and Rv1328, are pseudogenes in M. leprae, in which no glycogen or glucan has so far been reported.
GTs of GT-B fold with membrane binding properties. Many of the GTs implicated in cell wall biosynthesis must be membrane-associated, particularly those with their substrates located in the cytoplasmic membrane. One such enzyme is MurG (GT-28) of E. coli, and its resolved crystal structure (Figure 6) suggested that the binding site of its acceptor substrate (Lipid I) is located in the N-terminal domain (Ha et al. 2000). This domain was shown to contain an amphipathic helix surrounded by basic residues that can create both hydrophobic and electrostatic interactions with the lipid bilayer, and thereby facilitate binding of the lipid acceptor. Interestingly, the surplus of basic amino acids in the N-terminal domain of MurG is reflected in the basic value of its calculated isoelectric point (pI), while the C-terminal domain has a neutral pI value (Edman et al. 2003). These features also apply to the MurG homolog of M. tuberculosis (Table III). Because enzymes of the GT-4 family share the structural characteristics of MurG, some of these enzymes may also share its membrane binding properties. Edman et al. used this proposed similarity to predict the surface charge distribution of lipid-synthesizing enzymes of GT-4 (Edman et al. 2003). In M. tuberculosis, PimA (Rv2610c) and PimB (Rv0557) have been shown to carry ManT activities in the synthesis of the glycolipids PIM1 and PIM2 (Schaeffer et al. 1999; Kordulakova et al. 2002), and these enzymes are therefore most likely associated with the cytoplasmic membrane. The basic pI of the N-terminal domains of PimA and PimB is consistent with this hypothesis (Table III). In contrast, MshA and OtsA are, due to their roles in mycothiol and trehalose biosynthesis, proposed to be soluble proteins without membrane association. This is in line with the low or neutral pI of their N-terminal domains. The same holds for Rv1212c and Rv3032, which are proposed to participate in the synthesis of glycogen/α-glucan and MGLPs. Rv0225 and Rv2188c, however, share the property of a high pI in the N-terminal domain with PimA and PimB and can thus be proposed as membrane-associated GTs (Table III).
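The per-domain pI comparison used above can be illustrated with a short calculation. The following is a minimal sketch that estimates the pI of the N-terminal and C-terminal parts of a sequence from residue composition alone, using the Henderson-Hasselbalch equation and bisection. The pKa values are an assumed, commonly used set (published scales differ by several tenths of a pH unit), the domain boundary is supplied by the user, and the example sequences are hypothetical. It does not reproduce the method or the values behind Table III.

```python
# Assumed pKa values for ionizable groups; other published scales differ slightly.
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERMINUS = 8.6
PKA_C_TERMINUS = 3.6


def net_charge(seq, ph):
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERMINUS))
    negative = 1.0 / (1.0 + 10 ** (PKA_C_TERMINUS - ph))
    for aa in seq.upper():
        if aa in PKA_POSITIVE:
            positive += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            negative += 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return positive - negative


def isoelectric_point(seq, tol=1e-4):
    """pH at which the net charge is zero, found by bisection on [0, 14]."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(seq, mid) > 0:
            low = mid   # still positively charged: pI lies at higher pH
        else:
            high = mid
    return round((low + high) / 2.0, 2)


def domain_pis(seq, boundary):
    """pI of the N-terminal part (first `boundary` residues) and of the rest."""
    return isoelectric_point(seq[:boundary]), isoelectric_point(seq[boundary:])


if __name__ == "__main__":
    # Hypothetical toy sequence: basic N-terminal half, acidic C-terminal half.
    toy = "MKKRKLLAKR" + "MDEEDLLAED"
    print(domain_pis(toy, 10))
```

Treating each half as an independent peptide adds artificial terminal charges at the boundary, which is acceptable for a qualitative basic-versus-neutral comparison but not for precise values.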
Polyprenyl-dependent GTs
General features. It has long been established that the classical protein glycosylation pathways, leading to the assembly of asparagine-linked glycans (ALGs or N-glycans) and phosphatidylinositol glycan (PIG) anchors at the endoplasmic reticulum (ER) of eukaryotes, utilize not only nucleotide-sugar donors but also polyprenyl-P-linked sugars (Lennarz 1975).
All 11 families of polyprenyl-dependent GTs consist of integral membrane proteins having 8–13 predicted TM domains. The sequence homology between them is in general very low, but conserved amino acid motifs have been found (Oriol et al. 2002). Common to them is a modified DxD motif (e.g., DxE, ExD, DDx, DEx, or EEx), typically located in the first or the second predicted extracytoplasmic loop. The position of this motif and a similar topology pattern among these polyprenyl-dependent GTs have suggested that they are structurally related, and they have therefore been organized into a superfamily named GT-C (Figure 6) (Liu and Mushegian 2003). The importance of the modified DxD motif has been investigated by site-directed mutagenesis for some of these GT-C proteins; substitution of an aspartic acid in this motif in the human PIG-M (ManT in PIG biosynthesis; Maeda et al. 2001), in PimE of M. smegmatis (ManT in PIM5 biosynthesis; Morita et al. 2006), and in EmbC of M. smegmatis (AraT in LAM biosynthesis; Berg et al. 2005) resulted in all cases in loss or reduction of enzyme activity. However, the exact function of this motif has not been elucidated. Nevertheless, a comparison can be made to NDP-sugar dependent GTs, many of which carry a DxD motif involved in binding of the donor substrate via a divalent cation (Unligil and Rini 2000). Thus, the acidic motifs of PIG-M, PimE, and EmbC, and the corresponding motifs of other GT-C proteins, may be part of a binding site for polyprenyl-P-sugar donors. Interestingly, about 20–40 amino acids downstream of the modified DxD motif is an aromatic residue commonly clustered together with one or several prolines (Table II). Furthermore, these residues are next to an additional partially conserved acidic residue (Liu and Mushegian 2003). These conserved residues, which are part of the same predicted loop in the GT-C proteins, will here be referred to as "the GT-C motif", and they may constitute elements important for binding of a lipid-linked sugar donor and/or for catalytic activity.
Polyprenyl-dependent GTs of M. tuberculosis. The recognition in the past of the lipid-linked sugar donors C50-P-Araf, C50-P-Man, and C50-P-Glc (section Biosynthesis of sugar donors) has revealed that mycobacteria and related species should be endowed with GT-activities dependent on these substrates. Early on, the three Emb proteins were suggested to have such activity (Belanger et al. 1996) and the






