Glycobiology Advance Access originally published online on May 9, 2007
Glycobiology 2007 17(8):868-876; doi:10.1093/glycob/cwm050
NetCGlyc 1.0: prediction of mammalian C-mannosylation sites
Department of Medical Biochemistry and Biophysics, Karolinska Institutet, SE-171 77 Stockholm, Sweden and Stockholm Bioinformatics Center, Albanova, Stockholm University, SE-106 91 Stockholm, Sweden
To whom correspondence should be addressed; Tel: +46 8 52486976; fax: +46 8 313445; e-mail: karin.julenius{at}sbc.su.se
Received on March 21, 2007; revised on May 3, 2007; accepted on May 5, 2007
C-mannosylation is the attachment of an α-mannopyranose to a tryptophan via a C–C linkage. The sequence WXXW, in which the first Trp becomes mannosylated, has been suggested as a consensus motif for the modification, but only two-thirds of known sites follow this rule. We gathered a data set of 69 experimentally verified C-mannosylation sites from the literature. We analyzed these for sequence context and found that, in addition to Trp, Cys is accepted in position +3. We also found a clear preference in position +1, where a small and/or polar residue (Ser, Ala, Gly, or Thr) is preferred and a Phe or Leu residue is discriminated against. The Protein Data Bank was searched for structural information, and five structures of C-mannosylated proteins were obtained. These showed that modified tryptophan residues are at least partly solvent exposed. A method for predicting the location of C-mannosylation sites in proteins was developed using a neural network approach. The best overall network used a 21-residue sequence input window and information on the presence/absence of the WXXW motif. NetCGlyc 1.0 correctly predicts 93% of both positive and negative C-mannosylation sites. This is a significant improvement over the WXXW consensus motif itself, which identifies only 67% of positive sites. NetCGlyc 1.0 is available at http://www.cbs.dtu.dk/services/NetCGlyc/. Using NetCGlyc 1.0, we scanned the human genome and found 2573 exported or transmembrane transcripts with at least one predicted C-mannosylation site.
Key words: C-mannosylation / machine learning / neural networks / prediction
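To make the baseline concrete, the WXXW consensus rule and the 21-residue input window mentioned in the abstract can be sketched in a few lines of Python. This is only an illustration of the motif scan, not the NetCGlyc neural network. The function names, the optional Cys-at-+3 switch, and the "X" padding character used at sequence ends are assumptions made for this sketch and are not taken from the paper.

```python
import re

# Zero-width lookahead so that overlapping candidate motifs are all reported.
# Group 1 captures the residue at position +3 relative to the candidate Trp.
MOTIF = re.compile(r"(?=W..([WC]))")


def candidate_sites(seq, allow_cys=False):
    """Return 0-based indices of Trp residues that start a W-x-x-W motif.

    With allow_cys=True, Cys is also accepted at position +3
    (hypothetical switch reflecting the observation in the abstract).
    """
    hits = []
    for m in MOTIF.finditer(seq):
        if m.group(1) == "W" or allow_cys:
            hits.append(m.start())
    return hits


def window(seq, pos, size=21, pad="X"):
    """Return a window of `size` residues centered on seq[pos].

    Sequence ends are padded with `pad`; the padding symbol is an
    assumption of this sketch, not the encoding used by NetCGlyc.
    """
    half = size // 2
    padded = pad * half + seq + pad * half
    return padded[pos:pos + size]


if __name__ == "__main__":
    toy = "AAWSPWSECAA"  # made-up sequence, not a real protein
    for i in candidate_sites(toy, allow_cys=True):
        print(i, window(toy, i))
```

A predictor of the kind described would take each such window, together with a flag for motif presence, as input to a trained classifier; the scan alone corresponds to the consensus rule that the abstract reports as identifying 67% of positive sites.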