Glycobiology Advance Access originally published online on January 19, 2007
Glycobiology 2007 17(5):23R-34R; doi:10.1093/glycob/cwm005
REVIEW
Evolution of carbohydrate antigens: microbial forces shaping host glycomes?
2 Glycobiology Research and Training Center, Cellular and Molecular Medicine-East, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0687
1 To whom correspondence should be addressed; e-mail: pgagneux{at}ucsd.edu
Received on December 5, 2006; revised on January 10, 2007; accepted on January 10, 2007
| Abstract |
|---|
Many glycans show a remarkably discontinuous distribution across evolutionary lineages. These differences play major roles when organisms belonging to different lineages interact as host-pathogen or host-symbiont. Certain lineage-specific glycans have become important signals for multicellular host organisms, which use them as molecular signatures of their pathogens and symbionts through recognition by a toolkit of innate defense molecules. In turn, pathogens have evolved to exploit host lineage-specific glycans and are constantly shaping the glycomes of their hosts. These interactions take place in the face of numerous critical endogenous functions played by glycans within host organisms. Whether due to simple evolutionary divergence or adaptive changes under natural selection resulting from endogenous functional requirements, once different lineages elaborate differential glycomes, these mutual differences provide opportunities for host exploitation and/or pathogen defense between lineages. Such phylogenetic molecular recognition mechanisms will augment and likely contribute to the maintenance of lineage-specific differences in glycan repertoires.
Key words: glycan / co-evolution / host-pathogen / animal lectin
| Introduction |
|---|
Carbohydrates make up a substantial portion of the biomass on earth, mostly in the form of the two structural polysaccharides, cellulose from plants and chitin from arthropods and fungi. All known living organisms also display an array of free or covalently attached carbohydrates collectively known as glycans (Varki et al. 1999).
Despite recent advances, we do not yet have a complete inventory of the naturally occurring monosaccharides used to produce the glycan portion of these molecules, as many members of the Bacteria and Archaea domains synthesize a number of specialized carbohydrates (Schaffer et al. 2001). In contrast, metazoan animals build most of their glycans from a very limited number of monosaccharide building blocks, allowing us to consider how these molecules might have evolved over time. Most metazoan glycoconjugates are built from six classes of monosaccharides: sialic acids, hexoses, hexosamines, deoxyhexoses, pentoses, and uronic acids (Varki et al. 1999; see Box 1). These monosaccharides can of course be modified to create greater complexity at the single monosaccharide level. Furthermore, the individual carbohydrate units can be attached via a variety of glycosidic linkages into highly complex linear or branched structures. Thus, in theory, there is virtually no limit to the number of different glycans that can be generated. In practice, though, metazoan animals seem to generate only a limited range of these possibilities.
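To make the combinatorial argument concrete, the following minimal Python sketch counts strictly linear chains only. It is an illustration under stated assumptions rather than a model of real biosynthesis: the function name `count_linear_glycans` and its parameters are hypothetical, and the defaults (nine monosaccharide types, two anomeric configurations, four candidate attachment positions per residue) are simplifying assumptions. Branching, ring forms, and chemical modifications are ignored, so the true structural space is larger still.

```python
def count_linear_glycans(n_units, n_monosaccharides=9, n_anomers=2, n_positions=4):
    """Count distinct linear chains of n_units residues.

    Assumptions (illustrative only, not taken from the review):
    - every position in the chain may hold any of n_monosaccharides types;
    - every glycosidic bond may be alpha or beta (n_anomers = 2);
    - every bond may attach to one of n_positions hydroxyl positions;
    - no branching, no modifications, no biosynthetic constraints.
    """
    if n_units < 1:
        raise ValueError("a chain needs at least one residue")
    # Each of the (n_units - 1) bonds independently picks an anomer and a position.
    linkages_per_bond = n_anomers * n_positions
    return n_monosaccharides ** n_units * linkages_per_bond ** (n_units - 1)


if __name__ == "__main__":
    for length in (1, 2, 3, 6):
        print(length, count_linear_glycans(length))
```

Under these assumptions a disaccharide already admits 648 variants and a hexasaccharide more than 10^10, which is why the restricted repertoire actually observed in metazoans calls for explanation.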
It would be impossible to do justice to the overwhelming diversity of natural glycans and their functions in one review. Fortunately, a number of excellent recent reviews and texts address the biology of individual classes of glycans as well as their endogenous ligands, the glycan-binding animal lectins (Staudacher et al. 1999; Angata and Varki 2002; Esko and Selleck 2002; Spiro 2002; Lowe and Marth 2003; Varki and Angata 2006). The aim of this review is to address the taxonomic distribution of glycans and to reflect on the processes that are shaping this distribution. Our main focus will be on how interactions between multicellular animal hosts and their microbial or viral pathogens as well as symbionts may have contributed to the observed lineage-specific constellations of certain glycans, especially extracellular glycans.
| Distribution of glycans within the tree of life |
|---|
Glycans occur in a discontinuous and puzzling distribution across evolutionary lineages. Examples of discontinuously distributed glycans are presented in Table I. The hypothetical evolutionary relationships of living organisms can be depicted in the form of phylogenetic trees. Figure 1 shows three phylogenies depicting the evolutionary relationships between the three domains of life (Figure 1A), among Eukarya (Figure 1B), and among the anthropoid primates (Figure 1C), along with the distribution patterns of selected glycans across different evolutionary lineages. As seen, these distributions fall into four general patterns.
- Glycans conserved across many taxa. In contrast to ribosomal RNA, which is present in all living organisms and thus allows the reconstruction of these phylogenies, no single glycan structure has been conserved to the same extent. An example of a relatively conserved class of glycan would be N-glycans, found in organisms of all three primary lineages of life, albeit absent from many bacteria (Figure 1A).
- Glycans specific to a particular lineage, such as capsule murein peptidoglycans in bacteria (Figure 1A) or gangliosides in vertebrates (Figure 1B).
- Glycans similar across distant taxa. Examples include glycosaminoglycans found in metazoans and bacteria (Figure 1A); cellulose in plants, bacteria, and tunicates; sialic acids (long thought to be unique to metazoan animals) of the deuterostome lineage, also found in many bacteria and in cephalopod mollusks (squid and octopus); or Gal(Fuc alpha 1-4) N-acetylglucosamine (GlcNAc) (Lewis A), found only in primates, some other vertebrates, plants, and a few pathogenic bacteria (Figure 1B), and
- Glycans conspicuously absent from very restricted taxa only (species, families, or higher units) within lineages that otherwise possess such glycans. Examples include Gal alpha 1-4 Gal beta 1-4 GlcNAc, present in most vertebrates but absent in mammals and some birds (Figure 1B); Gal alpha 1-3 Gal beta 1-4 GlcNAc (alpha-Gal), present in most mammals but absent in Old World monkeys, apes, and humans (Catarrhines); and N-glycolylneuraminic acid (Neu5Gc), present in most vertebrates but absent in humans (Figure 1C).
| Why do glycans evolve? |
|---|
Divergence
Like that of any biological molecule, glycan evolution is likely to occur simply due to the divergence of evolutionary lineages. Phylogenies (literally: "history of lineages") come about mostly by the successive bifurcation of lineages, as populations derived from a common ancestor cease to exchange genetic information (i.e., become reproductively isolated). See Box 2 for a list of some key evolution terminology. The genetic tool kits responsible for glycan synthesis and modification of different lineages are subsequently shaped by independent mutational histories, causing the glycan repertoires (glycomes) of different lineages to diverge as well. An example would be the use of cellulose in plants but not in metazoans, with the exception of tunicates (Figure 1B). Divergence involves much historical contingency, where random changes in different lineages, such as the recruitment of certain glycan types over others for specific functions, limit the future evolution of their glycomes.
Natural selection
Selective pressures resulting from recognition processes disproportionately affect the glycans covering cell surfaces. Natural selection acts on glycans, either by favoring the maintenance of a particular glycan (stabilizing or purifying selection) or by diminishing survival and/or reproductive success of organisms carrying a certain glycan (negative selection). Maintenance of the N-glycan synthesis pathway in all eukaryotes is an example of stabilizing selection, since disruptions often lead to lethal consequences (Chui et al. 2001; Schachter 2002). Negative selection on glycans could occur whenever an important pathogen exploits a particular glycan as a receptor for infection. Positive selection would entail selection for rapid change in glycans, e.g., to accommodate novel endogenous functions.
Convergence
Still another mechanism for generating diversity occurs when organisms belonging to distantly related lineages recruit or "reinvent" similar subsets of glycan repertoires. Such parallel events may be due to particular demands of the environment or to random recruitment of ancestral synthetic pathways. The existence of the Lewis A antigen [Gal beta 1-4 (Fuc alpha 1-4)GlcNAc] in Catarrhines and in plants could be such an example, as the enzymes involved in its synthesis have very different genomic sequences (Palma et al. 2001; Javaud et al. 2003) (Figure 1B). Alternatively, what appears as convergent evolution could result from the differential retention of ancestral enzymatic tool kits confined to a few distantly related lineages.
Coevolution
When organisms belonging to different lineages repeatedly interact, as is the case in most natural ecological communities, their glycomes can become involved in coevolutionary processes. Thus, the interactions of the two distinct glycomes of the interacting lineages directly influence their mutual evolution. There is ample evidence for coevolution in glycan diversity in the interactions of microbes and their animal hosts. These cases of coevolution involve two distinct phenomena: (i) independent evolution of enzymatic tool kits for the production of identical molecules in microbes; examples include glycans found almost exclusively in multicellular hosts and in their microbial pathogens, such as glycosaminoglycans and sialic acids (Figure 1B); and (ii) synthesis of "mimic" molecules, not identical but very similar to host glycans, such as polylegionaminic acids by Legionella or pseudaminic acid by Pseudomonas (Knirel et al. 1987; Kooistra et al. 2001). The Lewis A antigen is also found in certain strains of Helicobacter pylori, which infect humans, likely reflecting coevolution (Monteiro et al. 1998). Coevolution could also be occurring via horizontal gene transfer between metazoans and their bacterial pathogens, as has been discussed for genes involved in sialic acid synthesis (Angata and Varki 2002).
| Disclaimer about limitations of evolutionary research |
|---|
While we would certainly agree with the statement that "nothing in glycobiology makes sense, except in the light of evolution" (Varki 2006
A further limitation arises with regard to glycan changes in rapidly evolving organisms such as microbes or viruses, as it is impossible to gain information from long-extinct pathogens, which leave no fossils. The speed of evolution in pathogens means that the identity of past pathogens will never be known and that many current pathogens may be descendants of earlier innocuous microbes or even former symbionts. Rapid evolutionary rates are also associated with homoplasy, i.e., observed similarity between glycans that is not necessarily due to recent shared ancestry but could have evolved independently in different lineages (convergence). In the era of genomics, the ability to investigate the genomic sequences of the genes coding for enzymes that assemble and modify glycans in different lineages provides a powerful means of reconstructing the evolutionary history of glycosylation by determining key events in the establishment of glycan synthesis machinery.
| Glycans in metazoan animals |
|---|
In metazoan animals, cell surfaces are covered with an electron-dense coating of glycoconjugates known as the glycocalyx. Further, glycans are directly secreted as polymers or attached to proteins into the extracellular matrix and body fluids. This glycan landscape is often (for functional and historical reasons) characteristic of both species and particular cell types (Paulson and Colley 1989).
Why vertebrates use only such a small fraction of monosaccharide types for the assembly of their glycans remains a mystery (Box 1). For example, why do vertebrates, unlike plants, not carry terminal xylose on their N-glycans or incorporate any trehalose in their glycan repertoire? The absence of such structures is likely to represent lineage-specific evolutionary happenstance (contingency), whereby the independent mutational history of different lineages has led to differential evolution of glycan biosynthesis enzymes. Paradoxically, however, even with their relatively reduced panel of monosaccharides (compared to bacteria, for example), vertebrates generate a staggering amount of structural variation by combining just nine principal monosaccharides into chains of varying lengths and degrees of branching on differentially decorated proteins and lipids (Manzi et al. 2000).
Box 1. Principal building blocks of vertebrate glycans
- Sialic acids: e.g., N-acetylneuraminic acid (Neu5Ac), N-glycolylneuraminic acid (Neu5Gc)
- Hexoses: glucose, mannose, galactose (Gal)
- Hexosamines: N-acetylglucosamine (GlcNAc), N-acetylgalactosamine (GalNAc)
- Deoxyhexoses: fucose (Fuc)
- Pentoses: xylose
- Uronic acids: iduronic acid, glucuronic acid

Box 2. Glossary of evolution terminology
- Antagonistic coevolution: "evolutionary arms race", where changes in one lineage of a pair of host-parasite lineages are prompted by or prompt changes in the other lineage.
- Convergence: similarity between taxa despite independent evolutionary histories.
- Catarrhine: primates belonging to Old World monkeys, apes, and humans.
- Demographic bottleneck: strong reduction in population size.
- Divergence: differences between taxa due to independent evolutionary histories.
- Domain: one of the three radiations of life, comprising the Archaea, Bacteria, and Eukarya.
- Founder event: establishment of new populations by small numbers of founder individuals.
- Frequency-dependent selection: when the fitness of a genotype depends on its frequency.
- Genetic drift: random variation in gene frequency from one generation to another.
- Historical contingency: the effect of random events on the probability of subsequent events in a lineage.
- Homoplasy: similarities in character states for reasons other than inheritance from a common ancestor. These include convergence, parallelism, and reversal.
- Lineage: group of organisms sharing a common ancestor (monophyletic).
- Phylogeny: hypothetical history of related lineages based on DNA sequences or any other heritable derived traits.
- Purifying (stabilizing) selection: a type of selection that removes individuals from both ends of a phenotypic distribution, thus maintaining the same distribution mean.
- Trade-off: the balancing of different selection pressures, especially when these have opposing directions.
| Variation in animal glycan antigens in time and space |
|---|
Transient glycan variation in animals has been documented during key processes such as pregnancy, lactation, infection, or acute phase response, whereas ontogenetic glycan variation plays key roles in the regulation of metazoan development (Haltiwanger and Lowe 2004).
| Genes coding for glycan biosynthetic enzymes have undergone substantial expansion contributing to glycan diversity in metazoans |
|---|
A substantial fraction (1-2%) of animal genes function in glycan biosynthesis and modification. Unlike genes coding for a single protein product, these genes encode enzymes that work in an "assembly line"-like system of glycan synthesis pathways (Lowe and Marth 2003).
Genetic studies in model organisms with null mutations in biosynthesis genes have shown that many glycans are required for proper metazoan development, as these mutations produce phenotypes ranging from embryonic lethality to growth defects to impaired morphogenesis and cognitive function, but some have no obvious effects under laboratory conditions (Natsuka and Lowe 1994; Kotani et al. 2001; Lowe and Marth 2003; Kudo et al. 2006). It is conspicuous that the consequences of experimental abolition of many glycans are often not evident in animal cell cultures, even when these prove to be lethal as early as the embryonic stage in the whole organism from which the cells are cultured (Grobe et al. 2002). These findings point to key functions of glycans for multicellular development, but they also leave open the possibility that a certain fraction of animal glycans can be selectively neutral, i.e., these can be altered without incurring major fitness costs to the organism. Laboratory studies looking at consequences of experimental glycan alteration for individuals are unlikely to shed light on population-level effects of glycan polymorphism, such as the proposed protective effects in preventing the rapid spread of pathogens due to herd immunity-related mechanisms (Gagneux and Varki 1999). This idea remains untested in part because such effects would be based on populations rather than individuals.
| Animal lectin interactions and glycan evolution |
|---|
In many cases, the endogenous function of glycans requires interaction with proteins, and recent decades have seen the discovery of a growing list of animal lectins with specific carbohydrate recognition domains (CRD) (Drickamer and Taylor 1993).
| The evolutionary glycan arms race |
|---|
The ubiquitous presence of species-specific glycans on host cells and secretions predisposes these glycans to serve as convenient receptors exploited by microbes for host recognition, attachment, and invasion by way of a wide array of microbial and viral lectins, including adhesins, pili, fimbriae, and hemagglutinins (Gilboa-Garber and Garber 1989).
It may seem that rapid glycan structural change in response to pathogenic microbes would be the best route to evade infection; however, a change of glycans has the potential to negatively affect critical endogenous functions or jeopardize successful interaction with symbionts (see Host-symbiont coevolution). Given this, we speculate that microbe-driven alteration in host glycan structure is more likely if the change minimally affects endogenous function(s) or if the selection is strong enough to outweigh the impaired endogenous function. Similarly, pathogens can evolve to counter-adapt to changes in host glycan structure by altering ligand/receptor specificity. Such antagonistic coevolution (also called "evolutionary arms race") is known to lead to rapid evolutionary change (Buckling and Rainey 2002). It appears that the ongoing arms race between microbes and their animal hosts is constantly shaping the makeup of glycans on both sides, and such glycan changes must be considered against the background of "normal" glycan variation.
Owing to the observed glycome differences in distant lineages, it is tempting to speculate that glycome differences often represent insurmountable barriers for pathogens of one distant lineage for infecting hosts of another lineage. For example, plants and animals share few terminal glycans and, with one possible exception (Gibbs and Weiller 1999), there seem to be no plant pathogens that also infect animals or vice versa.
| Glycans as innate markers of nonself |
|---|
Many microbes seem to be affected by lineage-dependent constraints such as the glycan composition of their cell walls. The highly conserved capsule glycans in pathogenic microbes can be exploited by multicellular hosts as "pathogen associated molecular patterns" (PAMPs) or "microbial motifs" and used as target molecules for (Kawabata and Tsuda 2002
Further, by expanding terminal glycan structures that are absent from pathogen lineages, metazoan hosts can recruit these same structures as innate determinants of self. Mammals have evolved 19 sialic acid glycosyltransferases and utilize the absence of this terminal glycan for the detection of nonself. Lack of sialic acid on any cell surface perturbs factor H binding and allows complement molecules to be deposited on the surface, leading to an immune attack (Pangburn et al. 2000). Simultaneously, a family of endogenous mammalian lectins called Siglecs mediates immune cell functions based on the presence of sialic acid (Crocker and Varki 2001). Thus, host innate immune systems directly target microbe glycans and readily detect the absence of self-glycans as well (Janeway and Medzhitov 2002).
| Nonself glycan and adaptive immunity |
|---|
Jawed vertebrates have the capacity to generate virtually unlimited variation of receptors with their adaptive immune systems. This important evolutionary innovation provides these animals with a flexible system capable of learning (affinity maturation) and experience-based memory. This innovation is also double-edged, as antibodies targeting foreign peptides may cross-react with host epitopes, including glycans (Hedrick 2004).
| Host-symbiont coevolution |
|---|
Metazoans must tolerate huge numbers of microbial (nonself) symbionts. Thus, host immune systems must accommodate "a vast consortium of symbiotic bacteria" and all their surface glycans, while distinguishing them from pathogens (Cash et al. 2006).
| Adaptation by glycan loss |
|---|
A drastic mechanism for hosts to alter the glycan composition of their cell surfaces is to abolish the expression of a terminal glycan structure in order to curtail pathogen interaction. The complete loss of a particular glycan usually involves inactivating mutations of one or more genes involved in assembly, followed by the fixation of the inactive allele across the population. Fixation of such mutations can come about due to selection for absence of the glycan or by genetic drift due to small population size (founder events or demographic bottlenecks). The complete loss of a glycan modification that is otherwise very common in many closely related lineages (e.g., alpha-Gal in Catarrhines or Neu5Gc in humans) has at least two advantages: (i) the loss quickly prevents recognition by pathogens using the structure as a receptor and (ii) it opens the possibility of adding the abolished glycan to the panel of nonself-glycans recognized by adaptive immunity. For example, in the human and other primate blood groups, the absence of a glycan type is also accompanied by the presence of antibodies against the missing glycan (Clausen and Hakomori 1989).
Of course, there is a potential cost to such an adaptive glycan loss. If the nonfunctional allele responsible for the loss becomes fixed in the population, the lost glycan will likely be lost forever, as random mutations are much more likely to further incapacitate a gene than to revive its function. A further cost will result when a glycan with important endogenous functions is lost (e.g., due to very strong negative selection by a pathogen), as this will require subsequent compensatory changes in the endogenous lectins. It follows that the set of endogenous lectins of each lineage can be expected to closely mirror that lineage's glycan repertoire as far as endogenous function is concerned. The human-specific changes in several siglec genes might be an example of such compensatory changes, as humans have lost the ability to make Neu5Gc and some of their sialic-acid-binding siglecs have shifted from binding both Neu5Gc and Neu5Ac to a strong preference for binding Neu5Ac (Brinkman-Van der Linden et al. 2000). The potential costs associated with such radical glycan remodeling are illustrated by the many different forms of congenital disorders of glycosylation involving deficiencies in N-glycan synthesis (even if each particular form is rare) (Aebi and Hennet 2001). It has been suggested that selection for altered levels of N-glycan synthesis could be linked to an inhibitory effect on viral replication (Freeze and Westphal 2001). Most animal populations are likel
