Glycobiology Advance Access originally published online on May 5, 2006
Glycobiology 2006 16(8):736-747; doi:10.1093/glycob/cwj124
Molecular evolution of protein O-fucosyltransferase genes and splice variants
2 INRA, UMR 1061, Unité de Génétique Moléculaire Animale, GDR-CNRS 2590, Université de Limoges, Institut des Sciences de la Vie et de la Santé, Faculté des Sciences et Techniques, 87060 Limoges, France; and 3 Department of Biochemistry and Cell Biology, Institute for Cell and Developmental Biology, State University of New York, Stony Brook, NY 11794
1 To whom correspondence should be addressed; e-mail: maftah{at}unilim.fr
Received on February 3, 2006; revised on April 12, 2006; accepted on April 26, 2006
Abstract
O-Fucose has been described on both epidermal growth factor-like (EGF-like) repeats and Thrombospondin type 1 repeats (TSRs). The enzyme adding fucose to EGF-like repeats, protein O-fucosyltransferase 1 (Pofut1), is a soluble protein located in the lumen of the endoplasmic reticulum (ER). A second protein O-fucosyltransferase, Pofut2, quite divergent from its homolog Pofut1, has recently been shown to O-fucosylate TSRs but not EGF-like repeats. To date, Pofut1 genes have only been characterized in human, mouse, and fly, and Pofut2 in mouse, fly, and partially in the nematode Caenorhabditis elegans. Here, we report cDNA sequences and genomic structures of bovine Pofut1 and Pofut2 genes and describe for the first time five alternatively spliced transcripts for each gene. Only one transcript for both Pofut1 and Pofut2 encodes an active bovine O-fucosyltransferase. Variant transcript distribution was examined in 13 bovine tissues. Transcripts encoding active forms are ubiquitous, whereas other forms possess a more restricted tissue-expression profile. Sequence comparison and phylogenetic analyses revealed that both Pofut genes are present as a single copy in animal genomes, and their exon–intron organizations are conserved among vertebrates. The last common ancestor of all analyzed bilaterian species would be predicted to possess polyexonic Pofut genes in its genome.
Key words: alternative splicing / fucose / gene organization / O-fucosyltransferase activity / Pofut1 and Pofut2 phylogeny
Introduction
Fucose is commonly found as a terminal sugar residue at the nonreducing end of glycans in glycoproteins and glycosphingolipids of eukaryotic cells (Lochnit et al., 2001
α2,3- or
α2,6-sialyltransferase (Moloney, Panin, et al., 2000
α2,3/6-Gal-β1,4-GlcNAc-β1,3-Fuc-O-S/T (Figure 1A). O-Fucosylglycans modulate the function of proteins by controlling their interactions. O-Fucosylation of EGF-like repeats is essential for the interaction of Notch receptor with its ligands, Delta and Serrate/Jagged (Okajima and Irvine, 2002
Recently, O-fucose was found in a different protein context: Thrombospondin type 1 repeats (TSRs) (Figure 1B). TSR O-fucosylation was first described on all three TSRs within Thrombospondin-I (Hofsteenge et al., 2001
Both Pofut1 and Pofut2 genes are present as a single copy in all completely sequenced animal genomes. In silico searching of genomic and expressed sequence tag (EST) databases revealed that Pofut genes are widely distributed in bilaterians. Surprisingly, some protozoan parasites such as the apicomplexan Plasmodium falciparum possess a Pofut2 gene (Martinez-Duncker et al., 2003). Pofut1 genes have only been characterized in human, mouse, and fly, and the Pofut2 gene in fly, mouse, and partially in the nematode C. elegans. Moreover, little is known about the transcript variants and the expression profiles of these two genes. Here, we report the cDNA sequences and genomic structures of bovine Pofut1 and Pofut2 genes. We describe for the first time five alternatively spliced transcripts for each Pofut gene, determine their expression profiles in 13 bovine tissues, and characterize the activities of the encoded forms. By sequence comparison and phylogenetic analyses, we show that both genes are well conserved among animals, supporting the classical relationships between animal clades. The comparison of exon–intron organizations led to the proposal that the ancestor of bilaterians would have already possessed Pofut genes with several coding exons.
Results
Genomic organization of bovine Pofut1 and Pofut2 genes
Homology searches against the bovine EST database using human and mouse Pofut1 and Pofut2 coding sequences (CDSs) as queries revealed the existence of Pofut orthologous sequences. These bovine ESTs allowed the design of primers for cDNA isolation from bovine tissues (Table 1 in Supplementary Data). The sequencing of full-length cDNA and comparison with human and mouse genomic sequences were used to design specific primers for the determination of exon–intron boundaries of the bovine genes (Table 2 in Supplementary Data). Bovine Pofut1 consists of at least 10 exons encompassing 39.4 kb, and bovine Pofut2 consists of nine exons encompassing 13.9 kb (Figure 2). Coding exons range in size from 90 to 803 bp for Pofut1 and from 67 to 1050 bp for Pofut2. The size of introns varies from 1153 bp to at least 16,097 bp for bovine Pofut1 and from 215 bp to at least 5193 bp for bovine Pofut2. Large introns were not entirely sequenced; they were amplified by polymerase chain reaction (PCR), and their sizes were estimated by agarose gel electrophoresis and then compared with sizes deduced from the partially sequenced cow genome (www.ncbi.nlm.nih.gov/Genomes/). Exon–intron boundary sequences follow the "GT and AG" rule. The exon–intron organization of bovine Pofut1 and Pofut2 genes resembles those of human and mouse genes. However, exons 5, 9, and 10 of bovine Pofut1 have never been described before. Southern blot experiments with an exon 7 probe for Pofut1 and an exon 2 probe for Pofut2 were carried out with digested bovine genomic DNA extracted from whole blood (Figure 3). For each of the three digestions performed, only one band was detected for each gene, suggesting that only one Pofut1 gene and one Pofut2 gene are present in the bovine genome.
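The "GT and AG" rule mentioned above states that an intron begins with the dinucleotide GT and ends with AG. As an illustration only, the check can be sketched as follows; the function name and the two toy intron sequences are hypothetical and are not taken from the bovine Pofut genes.

```python
def follows_gt_ag_rule(intron: str) -> bool:
    """Return True if the intron starts with GT and ends with AG."""
    seq = intron.upper()
    # Require at least 4 bases so the donor and acceptor sites do not overlap.
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Hypothetical toy introns (illustrative only, not real Pofut sequences).
toy_introns = {
    "canonical": "GTAAGTCTTTCCTTTTCAG",
    "non_canonical": "ATATCCTTTTTTCCAC",
}

results = {name: follows_gt_ag_rule(seq) for name, seq in toy_introns.items()}
```

In practice the intron coordinates would come from an mRNA-to-genomic alignment, and the same check would be applied to each aligned intron.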
Detection of alternatively spliced Pofut1 and Pofut2 transcripts in bovine tissues
Reverse transcriptase (RT)–PCR analyses with specific bovine Pofut1 primers designed in exon 1 and in exon 8 or 10 (Table 2 in Supplementary Data) revealed at least five different bovine Pofut1 transcript variants (Figure 4A) named Pofut1a to Pofut1e. Pofut1a (GenBank accession number AY344580) corresponds to the transcription of exons 1–4 and 6–8. It encodes the putatively active enzyme Pofut1 with a predicted sequence of 391 amino acids. Pofut1b (AY344581) results from the splicing of exons 5 and 8, and its translation would produce a protein of 351 amino acids. The three other variants lack most of the last exons. Indeed, in Pofut1c (DQ138291), exons 6, 7, 9, and 10 are spliced. In that case, the transcription of exon 5 generates a premature stop codon. Pofut1d (DQ138292) contains exons 1–4 and 8, whereas Pofut1e (DQ138293) comprises exons 1–4, 9, and 10.
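The exon content of the variants can be written down as simple lists, which makes differences between transcripts easy to compute. The sketch below is illustrative only: it covers the three variants whose exon content is stated explicitly (reading the exon lists as ranges, i.e. exons 1 to 4 and 6 to 8 for Pofut1a), and the names `compress` and `missing_vs_reference` are hypothetical helpers.

```python
# Exon content as stated in the text for three bovine Pofut1 variants.
pofut1_variants = {
    "Pofut1a": [1, 2, 3, 4, 6, 7, 8],
    "Pofut1d": [1, 2, 3, 4, 8],
    "Pofut1e": [1, 2, 3, 4, 9, 10],
}


def compress(exons):
    """Collapse a sorted exon list into ranges, e.g. [1, 2, 3, 4, 8] -> '1-4, 8'."""
    parts = []
    start = prev = exons[0]
    for exon in exons[1:]:
        if exon == prev + 1:
            prev = exon
            continue
        parts.append(str(start) if start == prev else f"{start}-{prev}")
        start = prev = exon
    parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ", ".join(parts)


def missing_vs_reference(variant, reference="Pofut1a"):
    """Exons present in the reference transcript but absent from the variant."""
    return sorted(set(pofut1_variants[reference]) - set(pofut1_variants[variant]))


summary = {name: compress(exons) for name, exons in pofut1_variants.items()}
```

Here `missing_vs_reference("Pofut1d")` lists the Pofut1a exons that Pofut1d lacks, which is one way to see why the shorter variants lose the conserved motifs encoded by the 3' exons.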
In the same way, RT–PCR with specific Pofut2 primers revealed five different transcripts in bovine tissues (Pofut2a to Pofut2e) (Figure 4B). Pofut2a (DQ138294) results from the use of a splice site 90 nucleotides within exon 8 and encodes the putatively active bovine Pofut2. Indeed, it has a predicted size of 429 amino acids, consistent with the human and mouse orthologs (Luo, Koles, et al., 2006). Pofut2b (AY344582) corresponds to the transcription of all exons and would encode a protein with a predicted sequence of 459 amino acids. Pofut2c (DQ138295), Pofut2d (DQ138296), and Pofut2e (DQ138297) result from alternative splicings involving exons 4, 7, and 8. Each of these variants would encode the same putative protein of 186 amino acids because of a frame shift in exon 4 and, consequently, the occurrence of a premature stop codon in exon 5.
To determine the transcription start site for the Pofut genes and to obtain further insights into the 5' untranslated region (UTR), we performed 5' rapid amplification of cDNA end (5'-RACE) experiments, using heart mRNA. Nested PCR amplification and sequencing allowed the identification of a 68-base-long Pofut1 5' UTR. cDNA sequences of Pofut1a, Pofut1c, and Pofut1d have the same 3' UTR with a putative partial polyadenylation signal (ATAAA) at positions +1840, +1554, and +1404 bases, respectively. Pofut1b and Pofut1e possess a different 3' UTR, due to the presence of exons 9 and 10, with a putative polyadenylation signal (AATAAA) at positions +1424 and +988 bases, respectively. In summary, total Pofut1 transcript sizes range from 1009 to 1858 bases. The 5'-RACE experiment on Pofut2 was not conclusive. Instead, we proceeded to direct sequencing of a bacterial artificial chromosome (BAC) containing the bovine Pofut2 gene. A putative 131-base-long 5' UTR is predicted using the Dragon Promoter Finder online software (http://research.i2r.a-star.edu.sg/promoter/promoter1_5/DPF.htm). cDNA sequences revealed that all Pofut2 transcript variants (Pofut2a to Pofut2e) possess the same 3' UTR with a putative polyadenylation signal (AATAAA) at positions +2300, +2390, +2329, +2239, and +2074 bases, respectively. Pofut2 transcripts are longer than those of Pofut1, with lengths ranging from 2091 to 2407 bases.
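Locating a putative polyadenylation signal amounts to scanning the 3' UTR for the hexamer AATAAA (or the partial signal ATAAA) and reporting 1-based positions. A minimal sketch, assuming a plain string search is sufficient; the function name and the toy sequence are hypothetical and do not reproduce any bovine transcript.

```python
import re


def find_signals(seq: str, motif: str) -> list:
    """Return 1-based start positions of every (possibly overlapping) motif hit."""
    # A lookahead lets overlapping occurrences be reported as separate hits.
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [match.start() + 1 for match in pattern.finditer(seq.upper())]


# Hypothetical 25-base toy sequence (illustrative only).
toy_utr = "GCTTGAATAAAGCTCTGGATAAACC"

canonical_hits = find_signals(toy_utr, "AATAAA")  # full hexamer
partial_hits = find_signals(toy_utr, "ATAAA")     # partial signal
```

Note that every full AATAAA hit also contains an ATAAA hit one base downstream, so the two result lists should be interpreted together.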
RT–PCR was also performed on 13 adult bovine tissue samples to determine the expression profiles of Pofut1 and Pofut2 complete CDS. The results of such experiments are shown for Pofut1 variants in Figure 5A. All analyzed bovine tissues present one to five different Pofut1 and Pofut2 transcript variants (Figure 5B). Pofut1a and Pofut2a were detected in all examined tissues, consistent with the widespread distribution of O-fucose and the fact that these transcripts may encode active forms of Pofuts. Pofut1b is only present in brain and longissimus thoracis muscle, and Pofut1c in liver. The shortest Pofut1 transcripts exist in six tissues for Pofut1d and in eight for Pofut1e. Pofut2b to Pofut2e were detected in all examined bovine tissues except in rectus abdominis muscle and spleen for Pofut2c. In all adult bovine tissues tested, Pofut1 and Pofut2 genes appear to be weakly expressed. Indeed, nested PCR had to be used to detect the presence of transcript variants.
Sequence characteristics and enzyme activities of bovine O-fucosyltransferases
The protein encoded by Pofut1a (Pofut1A) possesses a potential N-terminal signal peptide (M1 to V29), two putative N-glycosylation sites (N65 and N163), the three conserved peptide motifs common to α1,2- and α1,6-fucosyltransferases (P236–W248, S308–E321, and I342–E368) (Martinez-Duncker et al., 2003), a DXD motif (E368 R369 D370) that is believed to be critical for the proper function of many glycosyltransferases (Munro and Freeman, 2000), and a KDEL-like motif for ER-resident proteins (the last four amino acids: RDEF). Pofut1B lacks the third conserved peptide motif, the DXD, and the KDEL-like motifs. Pofut1C, Pofut1D, and Pofut1E with predicted sequences of 229, 184, and 185 amino acids, respectively, are largely truncated and lack all the conserved peptide motifs.
To check whether bovine Pofut1a produces an active enzyme, we assessed the activity of Pofut1A using a bacterially expressed EGF-like repeat from human factor IX as described by Rampal et al. (2005). To confirm the expression of the recombinant enzyme, we performed western blot analysis using rabbit polyclonal antibodies raised against bovine Pofut1. A protein of 45 kDa, which corresponds to the predicted size of glycosylated Pofut1A, was strongly detected in extracts from COS-1 cells transfected with the Pofut1a CDS construct (Figure 6A, lane 2). Comparatively, endogenous Pofut1 was only weakly detected (Figure 6A, lane 1). Mock-transfected COS-1 cells revealed an endogenous activity of 4.5 nmol/h/mg of total proteins, whereas an activity of 53.6 nmol/h/mg was measured after transfection by the bovine Pofut1a CDS construct (Figure 6A), around 12 times higher than the control. The recombinant proteins encoded by variants Pofut1b to Pofut1e, although detected on western blots for Pofut1B and Pofut1C (data not shown), are not able to catalyze the transfer of fucose on EGF-like repeat (data not shown).
Pofut2A possesses a potential N-terminal signal peptide (M1 to Q30), three putative N-glycosylation sites (N189, N209, and N259), the three conserved peptide domains (P287–I299, K327–E340, and I369–E396), and the DXD motif (E396 R397 E398). No KDEL-like motif was found. Compared to this sequence, Pofut2B shows an insertion of 30 amino acids in the second conserved peptide motif, whereas Pofut2C, Pofut2D, and Pofut2E, all of which encode the same protein, are largely truncated and lack all the conserved peptide motifs.
Pofut2 activity was assayed using a bacterially expressed TSR from human Thrombospondin-I as acceptor substrate (Luo, Nita-Lazar, et al., 2006) (Figure 6B). The extracts of mock-transfected COS-1 cells had an endogenous activity of 12.4 nmol/h/mg. After transfection with the bovine Pofut2a CDS construct, an activity of 16.2 nmol/h/mg was measured, increased by 30% compared with that of the control. Similar results were seen when expressing recombinant mouse Pofut2 (Dlugosz, M.A. and Haltiwanger, R.S., in preparation). Using the same method, we also assayed the activities of the other putative proteins encoded by variants Pofut2b to Pofut2e. Only Pofut2A was able to cause a statistically significant increase in the transfer of fucose to TSR over endogenous activity (Figure 6B). Immunoblot analysis showed that anti-bovine Pofut2 antibody cross-reacts with the endogenous Pofut2 in COS-1 cells (Figure 6B, lane 1). Reactivity toward a 50-kDa protein, consistent with the predicted size of glycosylated Pofut2, increased with COS-1 cells transfected with the Pofut2a CDS construct (Figure 6B, lane 2). Among the proteins encoded by variants Pofut2b to Pofut2e, only Pofut2B was detected in transfected COS-1 cells (data not shown).
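The activity comparisons reported for both enzymes reduce to a ratio of transfected to mock-transfected specific activity. The short sketch below reproduces that arithmetic from the values quoted in the text (in nmol/h/mg); the function and variable names are illustrative only.

```python
def fold_change(transfected: float, mock: float) -> float:
    """Ratio of specific activity in transfected cells to the mock control."""
    return transfected / mock


# Specific activities quoted in the text (nmol/h/mg of total proteins).
pofut1_fold = fold_change(53.6, 4.5)
pofut2_fold = fold_change(16.2, 12.4)

# Relative increase over the endogenous activity, in percent.
pofut2_percent_increase = (pofut2_fold - 1) * 100
```

The Pofut1 ratio comes out near 11.9, matching "around 12 times higher", and the Pofut2 increase comes out near 31%, which the text reports as 30%.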
Phylogeny of Pofut1 and Pofut2 enzymes
Database searches allowed the selection of complete sequences (except for Aedes aegypti and Onchocerca volvulus Pofut1, and Oryzias latipes and Takifugu rubripes Pofut2) from a panel of species representing most of the animal kingdom diversity. For the two enzymes, only one copy of the gene is present in all completely sequenced genomes. Twenty-seven protein sequences corresponding to Pofut1 orthologs and 21 sequences corresponding to Pofut2 orthologs were selected and aligned. Protein sizes vary from 380 to 404 amino acids for Pofut1 and from 405 to 495 amino acids for Pofut2. The main variation resides in the N-terminal part of the enzymes, which includes the signal peptide. Protein maximum likelihood (ML) analysis with bootstrap replicates (BP) and a Bayesian phylogenetic search with posterior probabilities (PPs) were carried out. The classical phylogenetic relationships between animal clades were recovered for both enzymes (Figure 7). Mammals, actinopterygians, deuterostomes, insects (only for Pofut1), and nematodes form monophyletic groups with BP ranging from 80 to 100% and PP of 1 (with the exception of deuterostomes for Pofut2). Bos taurus is the sister group of Sus scrofa for Pofut1 (BP = 100%, PP = 0.96) and of primates for Pofut2 (BP = 79%, PP = 1). The trees are mostly congruent with each other, except for the position of Xenopus sequences. In the ML Pofut1 tree, they are placed after the emergence of the sea urchin Strongylocentrotus purpuratus and before the fishes. However, the S. purpuratus sequence seems misplaced according to the classical view of animal phylogeny, emerging after Ciona sequences, but with low support (BP = 52%, PP < 0.5). This surprising position is not recovered in the Bayesian approach, where it emerges before Ciona (PP = 0.93). The incorrect position of Xenopus sequences in the ML tree could then be explained by the long-branch attraction phenomenon (Philippe, Zhou, et al., 2005), due to the high degree of divergence of the S. purpuratus sequence. Moreover, Bombyx mori does not form a monophyletic group with the other insects in the ML Pofut2 tree. However, the Bombyx position is only supported with a low BP of 16%, whereas it is clustered with its counterparts in the Bayesian tree with PP = 0.52. In the second most likely tree (lnL difference of 0.6), insects are monophyletic (data not shown). The incorrect position of B. mori in the ML tree could be attributed to a specific deletion of 20 amino acids in the portion selected for the alignment. At present, the Bombyx sequence is the shortest complete Pofut2. It is worth noting that sequences from some protozoan species can be confidently aligned with Pofut2, although they are divergent. All belong to the phylum Apicomplexa and are intracellular parasites of vertebrates. No Pofut1 was found in these species. Only Cryptosporidium Pofut2 sequences were included in our analyses because Plasmodium berghei, P. falciparum, and Plasmodium vivax sequences would introduce some noise in the phylogenetic reconstruction because of their high levels of divergence. Both species of Cryptosporidium are sister groups in the Pofut2 tree and present the longest branches.
Discussion
In this article, our work focused on the genomic organization, the identification and distribution of alternatively spliced variants, the enzymatic activity, and the phylogeny of O-fucosyltransferases, in the context of the nearly completely sequenced genome of the animal model B. taurus.
In silico analyses revealed that the bovine Pofut1 gene is located on chromosome 13 close to pleiomorphic adenoma-like gene 2 (Plagl2), underlying a conserved synteny with human (chromosome 20q11) and mouse (chromosome 2H2). The exon–intron organization of the bovine Pofut1 gene is slightly different from those of human and mouse orthologous genes. The bovine Pofut1 gene is longer and contains two additional exons (exons 9 and 10) located at least 16 kb downstream of the exon homologous to the last one in human and mouse genes. No homology with bovine exons 9 and 10 was detected in sequences of the large diversity of species studied in this article. Compared with mouse, bovine and human Pofut1 show an additional exon at position 5, but the sequences of both exons are not homologous. However, a homologous sequence of bovine exon 5 is found in human intron 6. Moreover, the transcription of exon 5 in both cases generates a premature stop codon, and if the translation occurs, a truncated inactive enzyme will be produced. In B. taurus, this transcript is tissue specific, restricted to the liver, whereas in Homo sapiens, the homologous transcript, named variant 2 (BC000582), was found in brain.
The active Pofut1 enzyme is encoded in B. taurus by the transcript variant Pofut1a that includes exons 1–4 and 6–8. In mammals, birds, fishes, and amphibians, the six intron positions are strictly conserved (Figure 7A). Variations observed in protein sizes only result from variation in exon 1 length. These small differences at the N-terminal part of Pofut1, in the portion corresponding to the signal peptide, explain the size distribution from 380 amino acids for Xenopus and Gallus gallus to 395 amino acids for Rattus norvegicus, Danio rerio, and O. latipes. This six-intron organization is also recovered in the sea urchin S. purpuratus (data not shown) and in the nematode Caenorhabditis. In the first, compared with vertebrates, exon size variations were observed, but nearly all intron positions are conserved with only a small shift in the position of intron 3. In the second, only positions of introns 1 and 5 are strictly conserved. The same situation is present in the honeybee Apis mellifera, although the sequence is 3' partial (so not included in our analysis). However, exon organization is different among insect species. Drosophila shows two exons, whereas B. mori presents four, but all insect Pofut1 genes conserve the position of intron 1. Therefore, the last common ancestor of all analyzed bilaterian species would already possess a Pofut1 gene with at least two exons and one intron at position 1. A broader panel, especially of protostome Pofut1 genes, is needed to settle this conclusively and to demonstrate whether intron 5 was also present in the ancestral Pofut1 gene and secondarily lost in Drosophila and B. mori. Alternatively, the common position of intron 5 in A. mellifera, nematodes, and deuterostomes could result from a convergent acquisition. A monoexonic gene in Ciona encodes the putatively active Pofut1. These species show rather derived features that may reflect adaptation to their ecological niche, particularly the sessile adult stage (Hughes and Friedman, 2005), and possess relatively compact gene organizations: approximately one gene per 7.5 kb of sequence in Ciona intestinalis (around 16,000 genes) compared to approximately one gene per 100 kb in the human (around 26,500 genes). A parsimonious explanation is to consider that Pofut1 genes in Ciona have secondarily lost their introns.
The exon–intron organization of the bovine Pofut2 gene also resembles those of human and mouse homologs, albeit with smaller introns. Human and mouse Pofut2 genes are located on chromosomes 21 and 10, respectively. At present, no information on bovine Pofut2 gene location is available. Genomic organizations of Pofut2 genes in mammals, birds, amphibians, and fishes are the same, with eight conserved intron positions (Figure 7B). Size variations only result from differences in exon 1 length. Introns at positions 2 and 3 are conserved in Caenorhabditis, whereas no intron is present in Ciona, Drosophila, and B. mori Pofut2 genes. Based on the ML tree, nematodes and deuterostomes would share a common ancestor containing in its genome a Pofut2 gene with at least three coding exons with introns 2 and 3 in conserved positions. However, the Bayesian tree does not support this group (PP < 0.5). If we admit that protostomes are a monophyletic group (Philippe, Lartillot, et al., 2005) and are artifactually split in our tree, then the most parsimonious hypothesis is to consider that the last common ancestor of bilaterians already possessed at least the three-exon architecture. The introns were lost in the insect lineage and for Ciona, as found for Ciona Pofut1.
Interestingly, Pofut2 genes exist in two genera of apicomplexan parasites, Cryptosporidium and Plasmodium, where they correspond to monoexonic and biexonic genes, respectively. This intron position is not shared by any other Pofut2 genes sequenced to date. As Cryptosporidium is an early emerging genus among the phylum Apicomplexa (Zhu et al., 2000; Bankier et al., 2003), it could be hypothesized that the intron appeared in the common ancestor of Plasmodium species, which emerged later. Another plausible explanation comes from the observation that the number of predicted introns in Cryptosporidium (<10% of genes) is lower than that in Plasmodium (>50%), which had to be linked with the loss of spliceosomal machinery components in Cryptosporidium (Templeton et al., 2004). Therefore, the reduction in the number of introns would then be considered as a lineage-specific evolutionary phenomenon. The Pofut2 gene in apicomplexans was probably acquired by lateral gene transfer from one host before the speciation of Cryptosporidium and Plasmodium. However, at present, there is no evidence of Pofut2 in the genome of other apicomplexans such as Toxoplasma gondii. The presence of a potential O-fucosyltransferase could be related to the existence of numerous proteins containing conserved adhesion domains, such as EGF-like repeats and TSRs (Templeton et al., 2004).
Tissue distributions of human and mouse Pofut1 transcripts have only been studied by northern blots (Wang et al., 2001; Shi and Stanley, 2003). The authors showed that many Pofut1 splice variants exist in various tissues, but only the transcript encoding the putative active O-fucosyltransferase was described. Human variant 1 (exons 1–4 and 6–8) and the mouse transcript, encoding active forms of Pofut1 consisting of 393 and 388 amino acids, respectively, have been detected in all examined tissues (including heart, brain, liver, skeletal muscle, and kidney), consistent with the widespread localization of O-fucose (Wang et al., 1996, 2001; Wang and Spellman, 1998). In mice, the Pofut1 gene is increasingly expressed during embryonic development (Shi and Stanley, 2003). In humans, the transcript is 5.2 kb long (Wang et al., 2001), comparable to the estimated size in mice (Shi and Stanley, 2003). Moreover, the transcript contains an extensive 3' UTR (around 4 kb), typical of many glycosyltransferases (Breton and Imberty, 1999). In this study, we demonstrated using RT–PCR that bovine tissues contain between one and five different transcript variants. The longest bovine transcript, Pofut1a, is 1858 bases long with a 3' UTR of only 631 bases, perhaps revealing a different Pofut1 gene expression control mechanism in cattle. Pofut1a encodes the active enzyme and is detected in the 13 examined adult tissues. Interestingly, the shortest Pofut1 variants (Pofut1b to Pofut1e), which result from different alternative splicings, were not detected in all tissues. They could be specific to B. taurus because none of these variants are found in other animal EST databases.
Five bovine Pofut2 transcripts (Pofut2a to Pofut2e) were identified and are present in nearly all examined adult bovine tissues. Among them, only the bovine Pofut2a transcript encodes the active enzyme and is clearly orthologous to the mouse transcript (Luo, Koles, et al., 2006). No detailed analysis of the tissue distribution of Pofut2 variants has been conducted in other species. However, searches in animal EST databases failed to detect variants homologous to Pofut2b to Pofut2e. Only two human ESTs (BP318733 and DB158060) from pericardium and thymus would encode a putative protein of 186 amino acids, similar to bovine Pofut2C, Pofut2D, and Pofut2E. Database searches also show that the majority of proteins containing TSRs possess the consensus TSR O-fucosylation site (Hofsteenge et al., 2001), indicating that this modification could be common. This is in agreement with the wide distribution of the Pofut2a transcript in bovine tissues and with the presence of the corresponding gene in a wide variety of organisms.
Bovine Pofut1 has a signal peptide and a KDEL-type sequence (RDEF). Therefore, as demonstrated for the human enzyme (Luo and Haltiwanger, 2005), the bovine Pofut1 could be a soluble protein located in the lumen of the ER. Drosophila Ofut2 also appears to be localized to the ER (Luo, Koles, et al., 2006) even though the enzyme does not possess an ER retention signal. An ER localization is also predicted for Pofut1 and Pofut2 of all the animal species studied in this article. Thus, as Pofut1 and Pofut2 only act on properly folded domains in the ER, O-fucosyltransferases may constitute a new class of quality-control proteins (Luo and Haltiwanger, 2005; Luo, Koles, et al., 2006).
Materials and Methods
Identification of bovine Pofut1 and Pofut2 cDNA sequences and genomic structures
Searches of bovine EST databases were performed using the Basic Local Alignment Search Tool (BLAST) at the National Center for Biotechnology Information (NCBI, at www.ncbi.nlm.nih.gov/BLAST/) with human and mouse Pofut CDS as queries. Two bovine Pofut1 ESTs were found: BF655050 from a pooled tissue database (marrow, alveolar macrophage, ovary, fetal semitendinosus muscle, and fetal longissimus muscle) and AW415368 from another pooled tissue database (lymph node, ovary, fat, hypothalamus, and pituitary). One Pofut2 EST, accession number AW669493, originating from the latter database, was also identified. The full-length ESTs, cloned in the pCMV SPORT6 vector, were obtained from Institut National de la Recherche Agronomique (INRA, Centre de Ressources Biologiques en Génomique des Animaux Domestiques et d'Intérêt Economique, Jouy-en-Josas, France) and sequenced. Comparison with human and mouse genomic organizations using the Spidey mRNA-to-genomic alignment program (www.ncbi.nlm.nih.gov/spidey/) allowed the design of specific primers in the exon borders flanking intronic regions (Table 2 in Supplementary Data). These primers were used for direct sequencing of genomic DNA extracted from bovine Charolais whole blood (QIAamp Blood Kit, Qiagen Inc., Hilden, Germany). Exons and introns were amplified by PCR, allowing the determination of bovine Pofut gene structures. All exons and small introns were completely sequenced. When possible, the sizes of large introns were estimated, after PCR amplification, by agarose gel electrophoresis. The increasing effort in the cow genome sequencing and assembling by the international consortium allowed us to confirm some sequences and intronic sizes, but some information is still missing for these two genes. One bovine BAC, clone 0669H8, containing the appropriate part of the bovine Pofut2 gene was used for direct sequencing (Jeon et al., 2001
Southern blot analysis of bovine genomic DNA
Bovine genomic DNA was prepared from a blood sample (QIAamp Blood Kit). Ten micrograms of the DNA was subjected to digestion with EcoRI, HindIII, or NdeI endonucleases, and the fragments were separated in 0.8% (w/v) agarose gel. DNA was depurinated for 20 min with 0.25 N HCl, denatured for 30 min with 0.4 N NaOH, and transferred onto a Hybond-N+ membrane (Amersham Pharmacia Biotech, Europe GmbH, Orsay, France). Hybridizations were carried out with an exon 7 probe for Pofut1 and an exon 2 probe for Pofut2, generated by PCR amplification of genomic DNA using specific primers: 5'-ACGCGTGTGCCATGCTGAAAG-3' forward and 5'-AGAGTCGGTGGCAATGTAGAC-3' reverse primers for Pofut1, and 5'-CCTTCTGTACGATGTCAATCC-3' forward and 5'-GCGATGAACTGCTCGTACTC-3' reverse primers for Pofut2. Twenty-five nanograms of probes was labeled with [α-32P]dCTP using a random priming kit (Invitrogen, Carlsbad, CA). Labeling was performed overnight at 65°C in a buffer containing 10% (w/v) dextran sulfate, 1% (w/v) sodium dodecyl sulfate (SDS), 0.5 M NaCl, and 100 µg of sheared salmon sperm DNA. Blots were washed two times for 15 min each at 65°C with 2x sodium chloride–sodium citrate buffer (SSC), two times with 1x SSC, and one time with 1x SSC/0.1% SDS, then analyzed after exposure to X-ray films (Kodak, Kyoto, Japan) for 3 days at –80°C.
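As a quick sanity check on the probe primers listed above, their lengths and GC fractions can be computed directly from the sequences. This is only an illustrative sketch; the dictionary keys and the helper name `gc_fraction` are hypothetical labels, and no melting temperature is estimated here.

```python
# Primer sequences as given in the text (5' to 3').
southern_primers = {
    "Pofut1_forward": "ACGCGTGTGCCATGCTGAAAG",
    "Pofut1_reverse": "AGAGTCGGTGGCAATGTAGAC",
    "Pofut2_forward": "CCTTCTGTACGATGTCAATCC",
    "Pofut2_reverse": "GCGATGAACTGCTCGTACTC",
}


def gc_fraction(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


primer_stats = {
    name: {"length": len(seq), "gc": gc_fraction(seq)}
    for name, seq in southern_primers.items()
}
```

Printing `primer_stats` gives a compact overview of each primer before ordering or re-using it.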
Total RNA extraction and RT–PCR analysis
Bovine tissues had been sampled on a 15-month-old Charolais at the INRA Centre of Clermont-Ferrand Theix (France). Total RNAs from various bovine tissues were isolated with FastRNA Pro Green Kit (Q-BIOgene, Illkirch, France) according to the manufacturer's instructions, and first-strand cDNA was synthesized from 2 µg of total RNA templates with the SuperScript First Strand Synthesis System for RT–PCR (Invitrogen). Potential transcripts were investigated by designing primers from either side of the CDS (Table 2 in Supplementary Data). The 50-µL PCR reactions were run with 2-µL RT products, 0.3 pM sense and antisense primers (MWG, High Point, NC), 0.2 mM dNTP, 2 mM MgCl2, 1x PCR buffer, and 0.5 U of Taq polymerase (Interchim, Montluçon, France). Amplification of the target cDNA was performed in a Primus 96 plus thermocycler (MWG). After 3 min denaturation at 96°C, 35 PCR cycles were carried out as follows: 30 s at 94°C, 45 s at 60°C, 2 min at 72°C, and a final extension of 5 min at 72°C. Reaction products were analyzed by electrophoresis on 1% agarose gel and compared with 1Kb+ DNA Ladder (Invitrogen). The RT–PCR products were cloned in the pGEMT-Easy plasmid (Promega, Madison, WI) for entire sequencing using T7 and SP6+ vector primers and internal primers. A dye-labeling chemistry kit (PRISM Ready Reaction Ampli Taq FS, Applied Biosystems, Norwalk, CT) and the ABI Prism 310 Genetic Analyzer (Applied Biosystems, Norwalk, CT) were used for sequencing.
5' Rapid amplification of cDNA end (5'-RACE)
To obtain 5'-cDNA ends, we performed RACE experiments on 1 µg of total RNA extracted from bovine heart, using the SMART RACE cDNA Amplification kit (Ozyme, Montigny le Bretonneux, France) according to the manufacturer's instructions. The 5' UTR of the Pofut1 transcript was amplified by two rounds of PCR, each combining a specific primer with an adaptor primer: Pofut15'RACE (5'-ATCCACTGGTCTTTGTAGGAGGCACTG-3')/universal primer mix (UPM) and Pofut15'NRACE (5'-TCAGCTGCGGGATGAATTCTG-3')/nested universal primer (NUP). First- and second-round PCRs were carried out in a 50-µL reaction volume containing 0.3 pM sense and antisense primers, 0.2 mM dNTP, 2 mM MgCl2, 1x PCR buffer, and 0.5 U of Taq polymerase under the following cycling conditions: initial denaturation at 94°C for 3 min, followed by 35 cycles (94°C for 30 s, 65°C for 45 s, and 72°C for 2 min) and a final elongation step (72°C for 10 min). PCR products were cloned into the pCR2.1 vector (Invitrogen) and sequenced.
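For readers who want to check the two gene-specific RACE primers quoted above, the following minimal sketch computes their length and GC content. The primer sequences and names are copied from the text; the helper names `gc_count` and `gc_percent` are introduced here for illustration and are not part of the original work.

```python
# Gene-specific 5'-RACE primers as quoted in the text (5' to 3').
PRIMERS = {
    "Pofut15'RACE": "ATCCACTGGTCTTTGTAGGAGGCACTG",
    "Pofut15'NRACE": "TCAGCTGCGGGATGAATTCTG",
}


def gc_count(seq):
    """Number of G and C bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


def gc_percent(seq):
    """GC content as a percentage of sequence length."""
    return 100.0 * gc_count(seq) / len(seq)


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_percent(seq):.1f}%")
```

Both primers (27 and 21 nt) have a GC content slightly above 50%.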
Expression of bovine O-fucosyltransferases in COS-1 cells
Pofut1 and Pofut2 variants were cloned into the pcDNA/TOPO3.1 expression vector (Invitrogen). Highly pure recombinant plasmids were obtained by anion-exchange chromatography (plasmid midi Kit, Qiagen, Les Ulis, France). They were used to transiently transfect COS-1 cells using Fugene 6 Transfection Reagent (Roche, Mannheim, Germany). After 48 h of transfection, proteins were extracted in a lysis buffer (1% [v/v] Triton X-100, 10 mM sodium cacodylate [pH 6], 20% [v/v] glycerol, and 1 mM dithiothreitol [DTT]) for 2 h at 4°C. Protein concentration was determined with bovine serum albumin (Bio-Rad, Marne-la-Coquette, France) as a standard (Bradford, 1976).
Western blot analyses
Polyclonal antibodies were obtained by immunization of rabbits with recombinant bovine Pofut1 or Pofut2. Enzymes were produced in Escherichia coli BL21(DE3) cytoplasm associated with an NH2-(His)6-Tag and were purified by affinity chromatography on Ni-NTA-agarose (Qiagen, Les Ulis, France). Subsequently, IgG was purified from serum using protein G (Agro-Bio, La Ferté St. Aubin, France). Western blot analyses were performed mainly as described (Dupuy et al., 2002), and bacterially expressed bovine Pofuts were used as controls.
O-Fucosyltransferase assay
O-Fucosyltransferase assays were performed essentially as described (Wang and Spellman, 1998; Luo, Nita-Lazar, et al., 2006). The reaction mixture (50 µL) contained 0.1 M imidazole–HCl (pH 7.0), 50 mM MnCl2 (only for the O-fucosyltransferase 1 assay), 0.1 mM GDP-[14C]fucose (4000–8000 cpm/nmol, Perkin-Elmer, Norwalk, CT), 20 µM acceptor substrate (recombinant human factor IX EGF-like repeat for Pofut1 or recombinant human Thrombospondin-I TSR3 for Pofut2), and 5 or 10 µg of total COS-1 cell protein extract for Pofut1 or Pofut2, respectively. The mixture was incubated at 37°C for 7.5 min for Pofut1 and 3 h for Pofut2. The reaction was stopped on ice and then diluted with 950 µL of 0.25 M ethylenediamine tetraacetic acid (EDTA), pH 8.0. Samples were loaded onto a Sep-Pak C18 cartridge (Waters, Saint-Quentin, France). The cartridge was washed with 10 mL of H2O, and the product was then eluted directly into a scintillation vial with 3 mL of 80% acetonitrile containing 0.05% trifluoroacetic acid. Ten milliliters of Biodegradable Counting Scintillant (Amersham Pharmacia Biotech) was added, and radioactivity was counted using a liquid scintillation beta counter (liquid scintillation analyzer, Tri-Carb-2100TR, Packard, Meriden, CT).
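The text does not spell out how counts were converted into enzyme activity, so the following is only a sketch of one conventional calculation, under stated assumptions: a no-enzyme or no-acceptor blank is subtracted, and the donor specific radioactivity (within the 4000–8000 cpm/nmol range quoted above) is known for the batch used. All function names and the example counts are hypothetical.

```python
def nmol_in_reaction(conc_um, volume_ul):
    """Amount (nmol) of a solute at conc_um (micromolar) in volume_ul (microliters)."""
    return conc_um * volume_ul / 1000.0


def fucose_transferred_nmol(sample_cpm, blank_cpm, cpm_per_nmol):
    """Convert blank-corrected counts into nmol of [14C]fucose incorporated."""
    if cpm_per_nmol <= 0:
        raise ValueError("specific radioactivity must be positive")
    return max(sample_cpm - blank_cpm, 0.0) / cpm_per_nmol


def activity_pmol_per_min_per_mg(nmol, minutes, protein_ug):
    """Express incorporation as pmol of fucose per minute per mg of extract protein."""
    return (nmol * 1000.0) / minutes / (protein_ug / 1000.0)


# Amounts fixed by the reaction conditions in the text (50 uL reaction).
donor_nmol = nmol_in_reaction(100.0, 50.0)     # 0.1 mM GDP-fucose -> 5 nmol
acceptor_nmol = nmol_in_reaction(20.0, 50.0)   # 20 uM acceptor   -> 1 nmol

# Hypothetical counts for a Pofut1 assay (7.5 min, 5 ug of extract protein).
nmol = fucose_transferred_nmol(sample_cpm=1350.0, blank_cpm=150.0, cpm_per_nmol=6000.0)
rate = activity_pmol_per_min_per_mg(nmol, minutes=7.5, protein_ug=5.0)
print(f"{nmol:.3f} nmol transferred, {rate:.0f} pmol/min/mg")
```

Because the acceptor is present at 1 nmol per reaction, any computed incorporation approaching that amount would indicate that the assay is no longer in its linear range.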
Phylogenetic analyses
Pofut1 and Pofut2 orthologs were retrieved from various databases using BLAST. Their accession numbers in the Entrez protein database are A. aegypti Pofut1 ABA29465; Anopheles gambiae Pofut1 ABA29466 and Pofut2 ABA29475; B. mori Pofut1 ABA29467 and Pofut2 ABA29474; B. taurus Pofut1 AAQ02332 and Pofut2 CAE02608; Canis familiaris Pofut1 ABA29461; Caenorhabditis briggsae Pofut1 ABA29470 and Pofut2 CAH03732; C. elegans Pofut1 ABA29469 and Pofut2 Q8WR51; C. intestinalis Pofut1 AK112708 and Pofut2 CAE02609; Ciona savignyi Pofut1 CAH03713 and Pofut2 ABA29472; Cryptosporidium hominis Pofut2 EAL37907; Cryptosporidium parvum Pofut2 EAK88374; D. rerio Pofut1 NP_991281 and Pofut2 ABA29477; Drosophila melanogaster Ofut1 Q9V6X7 and Ofut2 AAK77300; Drosophila pseudoobscura Ofut1 DQ139947 and Ofut2 ABA29473; Drosophila yakuba Ofut1 CAH40834 and Ofut2 CAH41976; G. gallus Pofut1 NP_989758 and Pofut2 XP_421892; H. sapiens POFUT1 NP_056167 and POFUT2 NP_598368; Mus musculus Pofut1 NP_536711 and Pofut2 NP_084538; O. volvulus Pofut1 CAH40836; O. latipes Pofut1 ABA29462 and Pofut2 ABA29476; Pan troglodytes Pofut1 CAH03712 and Pofut2 NP_001008983; Pongo pygmaeus Pofut1 CAH91412; R. norvegicus Pofut1 NP_001002278 and Pofut2 XP_228073; S. purpuratus Pofut1 ABA29464; S. scrofa Pofut1 ABA29460; T. rubripes Pofut1 CAE54305 and Pofut2 AJ781759; Tetraodon nigroviridis Pofut1 ABA29463; Xenopus laevis Pofut1 CAD55833; Xenopus tropicalis Pofut1 CAH03710 and Pofut2 ABA29471. When needed, complete CDSs were reconstructed by EST assembly or searches in specific genomic databases. Alignments were performed using the PC package MUST 2000, developed by Hervé Philippe at the University of Montreal and downloadable at www.isem.univ-montp2.fr/PPP/PM/RES/Info/{at}Softwares.php# MUST2000. The alignments are available from the authors upon request. Positions that could not be confidently aligned due to indels were removed from analyses. We used 302 positions corresponding to 227 parsimony informative characters for Pofut1 comparison and 382 positions (330 parsimony informative characters) for Pofut2. Owing to this selection, H. sapiens and P. troglodytes Pofut1 sequences are identical. Phylogenetic trees were constructed using the ML method implemented in the PROTML version 2.3 software (Adachi and Hasegawa, 1996). ML trees were built by the quick-add Operational Taxonomic Units (OTUs) search, using the Jones–Taylor–Thornton matrix (JTT)-f model of amino acid substitution and retaining the 100 top-ranking trees. Bootstrap proportions were calculated by the resampling estimated log-likelihood (RELL) method (Kishino et al., 1990) upon all top-ranking trees. A Bayesian analysis based on the posterior probabilities of phylogenetic trees was performed using the MrBayes (v_3.0b4) program (Huelsenbeck and Ronquist, 2001). A general time-reversible model (Rodriguez et al., 1990) with a Γ distribution was applied to take into account among-site rate variation. Four Markov chains were run for 500,000 generations, and trees were sampled every 100 generations. All trees sampled before the stationary distribution were discarded as burn-in.
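The RELL bootstrap mentioned above avoids re-optimizing each tree on every pseudo-replicate: per-site log-likelihoods are computed once for each candidate tree, sites are resampled with replacement, and the tree with the highest resampled total is counted as the winner of that replicate. The sketch below illustrates the principle only. It is not the PROTML implementation, the function name `rell_bootstrap` and the toy log-likelihood values are assumptions made for this example, and ties are resolved in favour of the first tree.

```python
import random


def rell_bootstrap(site_lnl, n_replicates=1000, seed=0):
    """RELL bootstrap proportions for a set of candidate trees.

    site_lnl[t][s] is the log-likelihood of alignment site s under tree t.
    Returns one proportion per tree, in the same order as site_lnl.
    """
    n_trees = len(site_lnl)
    n_sites = len(site_lnl[0])
    if any(len(row) != n_sites for row in site_lnl):
        raise ValueError("all trees must be scored on the same sites")
    rng = random.Random(seed)
    wins = [0] * n_trees
    for _ in range(n_replicates):
        # Resample alignment columns with replacement.
        sites = [rng.randrange(n_sites) for _ in range(n_sites)]
        # Resampled total log-likelihood of each tree; no re-optimization.
        totals = [sum(row[s] for s in sites) for row in site_lnl]
        best = max(range(n_trees), key=totals.__getitem__)
        wins[best] += 1
    return [w / n_replicates for w in wins]


# Toy per-site log-likelihoods for three hypothetical trees over six sites.
toy = [
    [-2.1, -3.0, -1.5, -2.8, -2.2, -1.9],
    [-2.3, -2.7, -1.6, -2.9, -2.0, -2.1],
    [-2.9, -3.4, -2.2, -3.1, -2.8, -2.6],
]
proportions = rell_bootstrap(toy, n_replicates=2000, seed=1)
print([round(p, 3) for p in proportions])
```

In this toy data set the third tree has the lowest log-likelihood at every site, so it can never win a replicate and its proportion is 0.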
| Supplementary Data |
|---|
Supplementary data are available at Glycobiology online (http://glycob.oxfordjournals.org/).
| Conflict of interest statement |
|---|
None declared.
| Acknowledgments |
|---|
The authors thank the "Centre de Ressources Biologiques en Génomique des Animaux Domestiques et d'Intérêt Economique" (CRB GADIE, INRA, Jouy-en-Josas, France) for providing bovine ESTs and BACs, the slaughterhouse of the INRA centre of Clermont-Ferrand-Theix (France) for animal samples, and M. P. Laforêt for her technical assistance in cell culture. This work was performed within the framework of the French network "G3." It was supported by the "Conseil régional du Limousin," an NIH grant (GM61126) to R.S.H., and a "Ministère de la Recherche et de la Technologie" doctoral fellowship to C.L.
| Abbreviations |
|---|
BAC, bacterial artificial chromosome; BP, bootstrap proportion; CAZy, carbohydrate-active enzymes; CDS, coding sequence; EGF-like, epidermal growth factor-like; ER, endoplasmic reticulum; EST, expressed sequence tag; ML, maximum likelihood; PCR, polymerase chain reaction; Pofut1, protein O-fucosyltransferase 1; Pofut2, protein O-fucosyltransferase 2; PP, posterior probability; RACE, rapid amplification of cDNA end; RT, reverse transcriptase; SSC, sodium chloride–sodium citrate buffer; TSR, Thrombospondin type 1 repeat; UTR, untranslated region
| References |
|---|
Adachi, J. and Hasegawa, M. (1996) MOLPHY Version 2.3: programs for molecular phylogenetics based on maximum likelihood. Institute of Statistical Mathematics, Tokyo. Comput. Sci. Monogr., 28, 1–150.
Adams, J.C. and Tucker, R.P. (2000) The thrombospondin type 1 repeat (TSR) superfamily: diverse proteins with related roles in neuronal development. Dev. Dyn., 218, 280–299.







