N.E.B. and B.T.I. - GlycanAnalyzer

Images below are marked in grey boxes.

Input formats

There are three possible input formats to use in GlycanAnalyzer. The files can be easily uploaded by clicking the 'Upload' button. Each of the three file formats can be generated using the two links:

merging 3D mass and peak information
supplying peaks and mass

Input format 1: Peak list and 3-dimensional mass input

Tutorial 1 has a step by step manual on how to use this input format.

To enter this type of data into the software you need to click the 'merging 3D mass and peak information' link seen in the screenshot below:

After clicking this link you will be asked to build an exoglycosidase panel.

Building the exoglycosidase panel

Each exoglycosidase must have its own set of UPLC data and MS1 data. A 'main profile' is also required; this is often the undigested profile, but it can be any starting profile. The screenshot below shows the available exoglycosidases on the left and the ones you are using on the right.

Tip 1: The order in which exoglycosidases are added should match the order in which you applied them in your experiment.
Tip 2: If you are unsure, an example is provided by clicking the link: 'Click here to paste an example'

Entering your data

After your exoglycosidase panel is set up, you can enter your 3-dimensional mass list and peak list into the text fields shown in the screenshot below. The left text field takes the 3D mass list (i.e. the MS1 data), which must contain retention time, observed mass and intensity columns. The right text field takes the peak list, which must contain retention time, % amount and the peak's glucose units (GU). Note that all columns should be separated by tabs (i.e. the tab key on your keyboard).

Tip 3: It is easier to copy and paste the data from spreadsheet software such as Excel.
Tip 4: If you are unsure, an example is provided by clicking the link: 'Click here to paste an example'
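
As a rough illustration of the tab-separated layout described above, the Python sketch below builds text for both fields. The numeric values and the helper name to_tsv are made up for illustration, and the column order is assumed to follow the order in which the columns are listed above. The example pasted by 'Click here to paste an example' remains the reference.

```python
# Hypothetical values; column order assumed from the description above.
mass_rows = [  # retention time, observed mass, intensity
    (10.52, 1582.6, 15400.0),
    (10.52, 791.8, 8200.0),
]
peak_rows = [  # retention time, % amount, glucose units (GU)
    (10.52, 12.4, 5.81),
    (12.10, 30.1, 6.25),
]


def to_tsv(rows):
    """Join each row's fields with tabs and put one row per line."""
    return "\n".join("\t".join(str(value) for value in row) for row in rows)


mass_text = to_tsv(mass_rows)  # text for the left field (3D mass list)
peak_text = to_tsv(peak_rows)  # text for the right field (peak list)
print(mass_text)
print(peak_text)
```

Copying cells out of a spreadsheet produces the same tab-separated layout.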

Input format 2: Peak list and user defined mass list

Tutorial 2 has a step by step manual on how to use this input format.

To enter this type of data into the software you need to click the 'supplying peaks and mass' link seen in the screenshot below:

Building the exoglycosidase panel

See Input format 1 above for help with this panel.

Entering your data

After your exoglycosidase panel is built, you can enter your peak list with annotated mass and charge states. A single text field will appear where the user can enter the UPLC peaks (defined by GU and % area) and the extracted MS1 data (defined by mass and charge). The mass and charge are determined by the user from the mass spectra, but they can be omitted by supplying the 'NA' token instead. Note that the columns GU, % area, mass and charge should be separated by tabs (i.e. the tab key on your keyboard).

Tip 5: It is easier to copy and paste the data from spreadsheet software such as Excel.
Tip 6: If you are unsure, an example is provided by clicking the link: 'Click here to paste an example'
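
To make the four-column layout concrete, here is a minimal Python sketch that reads lines of this shape. The function name parse_peak_line and the sample values are hypothetical, and the column order (GU, % area, mass, charge) is assumed from the description above. This is not GlycanAnalyzer's own parser.

```python
def parse_peak_line(line):
    """Split one tab-separated line into GU, % area, mass and charge.

    An 'NA' token in the mass or charge column is returned as None.
    """
    gu, area, mass, charge = line.rstrip("\n").split("\t")
    return {
        "gu": float(gu),
        "area": float(area),
        "mass": None if mass == "NA" else float(mass),
        "charge": None if charge == "NA" else int(charge),
    }


# Hypothetical input: the second peak has no mass or charge available.
example = "5.81\t12.4\t792.3\t2\n6.25\t30.1\tNA\tNA"
peaks = [parse_peak_line(line) for line in example.splitlines()]
print(peaks)
```

A peak list in which every row carries 'NA' in both the mass and charge columns is read the same way, with every mass and charge coming back as None.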

Input format 3: Peak list and NO mass observed

Tutorial 3 has a step by step manual on how to use this input format.

This is identical to Input format 2 (see above) except the 'Observed mass' and 'Observed charge' columns are replaced with 'NA' values (i.e. not available).

Exoglycosidases: which ones to select and in what order?

Specificities

An exoglycosidase panel is an array of exoglycosidases applied in a particular order. Exoglycosidases are enzymes that release monosaccharides from the non-reducing end of N-glycans. They are specific not only to the monosaccharide unit but also to the anomeric (alpha/beta) configuration and the glycosidic linkage. Because the UPLC and MS peaks change in response to each exoglycosidase, the sequential application of an array of exoglycosidases gives the information needed to infer the exact structure of an N-glycan.

The above image shows linkage and monosaccharide specificity of the exoglycosidases supported in GlycanAnalyzer. Support for exoglycosidases currently in development will be implemented in future versions.

Order of application

Exoglycosidases cleave monosaccharides from the non-reducing end towards the reducing end. For this reason, the order of application is important. For example, in the above image, applying a BTG enzyme first will not cleave any β-galactose until all Neu5Ac, Neu5Gc and α-galactose have been removed.
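
The effect of ordering can be illustrated with a toy Python model. It is a deliberate simplification: an antenna is a plain list of residue names, an enzyme is a set of residues it can remove from the non-reducing end, and linkage specificity is ignored. The residue labels, the generic SIALIDASE set and the function digest are illustrative assumptions; only the BTG behaviour is taken from the text above.

```python
def digest(antenna, targets):
    """Trim residues from the non-reducing end while they match targets."""
    trimmed = list(antenna)
    while trimmed and trimmed[-1] in targets:
        trimmed.pop()
    return trimmed


# Listed from the reducing end (left) to the non-reducing end (right).
antenna = ["GlcNAc", "b-Gal", "Neu5Ac"]

BTG = {"b-Gal"}                   # removes terminal beta-galactose
SIALIDASE = {"Neu5Ac", "Neu5Gc"}  # generic placeholder for a sialidase

btg_first = digest(antenna, BTG)
sialidase_then_btg = digest(digest(antenna, SIALIDASE), BTG)

print(btg_first)           # galactose is still capped, so nothing is removed
print(sialidase_then_btg)  # sialic acid removed first, then galactose
```

In this model BTG alone leaves the antenna unchanged because the galactose is not terminal, whereas removing the sialic acid first exposes it.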

Peak selection.

Your chromatogram is separated into liquid chromatography peaks, each marked by a GU and an area. Each peak can be selected one by one from the 'select peak' drop-down menu (see screenshot below).

Another option is to assign N-glycans to the entire chromatogram. Two options are available for this in the drop-down menu: 'Assign all top hit' and 'Assign all top 5'. For every peak, these return the best-ranked N-glycan assignment and the five best-ranked N-glycan assignments, respectively.
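
The difference between the two options can be pictured with a short Python sketch. It assumes that each candidate carries a score and that scores closer to 0 rank better. The candidate names, scores and the function top_hits are invented for illustration and say nothing about how GlycanAnalyzer stores its results.

```python
def top_hits(candidates, n):
    """Return the n best-ranked candidates, lowest score first."""
    return sorted(candidates, key=lambda item: item[1])[:n]


# Hypothetical (name, score) pairs for a single peak.
candidates = [
    ("candidate A", 0.42),
    ("candidate B", 0.17),
    ("candidate C", 0.90),
    ("candidate D", 0.05),
    ("candidate E", 0.31),
    ("candidate F", 0.66),
]

top_hit = top_hits(candidates, 1)   # analogous to 'Assign all top hit'
top_five = top_hits(candidates, 5)  # analogous to 'Assign all top 5'
print(top_hit)
print(top_five)
```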

Select a Glycoprotein.

Our database GlycoStore (recently published) in collaboration with the authors of GlycoBase (NIBRT Ireland) contains experimental GU evidence for over 800 N-glycans. In each database it is also known what glycoprotein or sample each N-glycan originated from. This is a very valuable piece of information as it can constrain our search when the user knows what type of sample they are analysing.

Currently, GlycanAnalyzer has a human IgG library stored internally. In the future, other libraries will be added to improve N-glycan assignment. If the user selects a library such as human IgG on the input page (e.g. because they are analyzing monoclonal antibodies), the software only considers evidence from IgG, IgG1, IgA, etc. in our database. It therefore does not consider N-glycans with more than two galactose or more than two sialic acid monosaccharides, which greatly reduces the computational time and increases the accuracy of the calculation.

Indicating an additional mass.

N-glycans are commonly labelled with a fluorescent molecule, which allows fluorescence-based quantitation after chromatographic separation. A fluorescent label adds the mass of the labelling molecule to the MS1 data, so users should indicate what additional mass is possible.
They can do so after clicking 'Merge data to input file', where they will be asked whether their MS data contains an additional mass (screenshot below).

Users should select one of three options: (i) select a label supported by GlycanAnalyzer (2AB, RFMS, procainamide), (ii) select 'other' and enter the label's mass explicitly, or (iii) select label free.

Calculating the N-glycan assignments.

To assign N-glycans to your chromatogram, click the 'Get Glycan List' button and wait for the assignment to complete. Note that your PC is not used for the calculation; all computations take place on our state-of-the-art servers.

Single peak assignments usually take 1-2 minutes. Assigning full chromatograms can take 20-30 minutes but may take longer for very complex samples.

Output

Single peak output

Tutorial 1 has a step by step manual showing how to interpret the single peak output.

GlycanAnalyzer's output has two features that help users to accept or reject the assignments.

The first is the definition of a score used to rank the N-glycan assignments (1A in the screenshot below).
The second is the incorporation of multiple sets of evidence for the assignment. The evidence includes:
- the GU similarity to known average GU values in GlycoStore (ΔGU in 5A below),
- matching theoretical glycan masses to observed masses found in the MS spectra (Expected/observed mass in 5A and MS images available in 'main panel mass evidence' in 8A), and
- tracing the sequential shifting of mass and GU digested peaks (available through 'Score calculation' 8A and seen in B below).

How both mass and GU shifts contribute to the score at each stage of the exoglycosidase array can be accessed through the 'Score calculation' link (8A above). Assignments can be rejected by the user. For example, the α-galactose structure (2A above) can be rejected because its high score ranks it poorly.

Links to the databases GlycoStore and GlyTouCan (Aoki-Kinoshita et al., 2016) are also available to direct the user to further information for the assignment (6A above).

The output is graphical in nature with each N-glycan shown in established CFG/Oxford diagram form (4A above). The shifting peaks can be visualized as a directed graph (in 'peak shift graph' 8A).

Other graphical outputs include the highlighted shifting UPLC peak with corresponding mass spectra, both highlighted in red, with glycan signals annotated (5B above).

When we have finished rejecting candidates and are happy with the current set, we can click the 'Accept structures into chromatogram' button. The completed peaks will then be updated (7A in the screenshot above). Clicking the 'View chromatogram' button (7A above) then brings the user to the summary page.

Tip 7: Sometimes no shifts can be found. In this case there will be no 'Peak shift graph' or 'Score calculation' information.

The links available in 8A in the screenshot above are described below:

Peak shift graph:

This presents a directed graph showing the estimated peak movement. The size of a circle is proportional to the area of the peak. Clicking the circles reveals the glycan structure assigned to the peak in each digestion profile. An example with digested glycans is shown below:

Score calculation:

This is perhaps the most important information revealed by GlycanAnalyzer (see image below). It reveals the following pieces of information:

Whether the expected m/z is present in all digestions.
The peak shift on the digestion array chromatograms.
The contributions to the score.

The screenshot above shows a score calculation for an example peak. The total score is 0.1714 and it is calculated from the columns with the following headings:

ΔGU score = 0.0022,
Mass score = 0 (all masses found in all profiles), and
Shift score = 0.0092 + 0.0383 + 0.0463 + 0.0754.

Note that if the expected mass is not similar to the observed mass at any stage of the exoglycosidase array, then the 'Mass score' column for that digestion would be 1.
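
The figures quoted above are consistent with the total being a plain sum of the three contributions. The short Python check below reproduces the arithmetic. It only illustrates that the numbers add up; the variable names are ours and it is not a statement of how GlycanAnalyzer computes the score internally.

```python
delta_gu_score = 0.0022
mass_score = 0.0  # all masses found in all profiles
shift_scores = [0.0092, 0.0383, 0.0463, 0.0754]

total_score = delta_gu_score + mass_score + sum(shift_scores)
print(round(total_score, 4))  # 0.1714
```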

Main mass panel evidence:

This is simply the mass spectra of the N-glycan(s) that were assigned to the current peak. An example is provided below, where the red bar indicates the possible m/z for the N-glycan in all charge states (proton charges in this case). Mass spectra images are derived from the input supplied by the user when they supply data through the 'merging 3D mass and peak information' link.
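
For readers who want to see where such m/z positions come from, the following Python sketch applies the standard relation for protonated ions, m/z = (M + z × 1.007276) / z, where M is the neutral mass including any additional (label) mass. The function name and the input masses are hypothetical placeholders, not values from GlycanAnalyzer or from any particular label.

```python
PROTON_MASS = 1.007276  # Da


def expected_mz(glycan_mass, additional_mass=0.0, charges=(1, 2, 3)):
    """Return {charge: m/z} for protonated ions of a (labelled) glycan."""
    neutral_mass = glycan_mass + additional_mass
    return {z: (neutral_mass + z * PROTON_MASS) / z for z in charges}


# Hypothetical glycan mass of 1000.0 Da plus a hypothetical 100.0 Da label.
mz_values = expected_mz(1000.0, additional_mass=100.0)
for charge, mz in mz_values.items():
    print(charge, round(mz, 4))
```

For a label-free sample the additional mass is simply left at 0.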

Summary page

Tutorials 1 and 2 have step by step manuals on how to interpret the summary output.

This page summarizes all the peak assignments accepted by the user thus far. The user still has the option to reject the assignments on this page.

An example of a completed N-glycan assignment for a monoclonal antibody is shown in the screenshot below. The bar chart shows each peak's relative abundance, where a tick indicates user acceptance of the peak. Each bar can be clicked to reveal the peak shifting output. The pie charts give the distribution of sialic acids, GlcNAc antennae, galactose and other monosaccharides, among others.

The tables present the N-glycans already assigned in the user-accepted peaks. GlycanAnalyzer does not use threshold values to assign confidence levels to its annotations. Instead, three confidence levels for each glycan annotation are defined as follows:

Weak: There are no mass or peak shifts detected. Glycans are assigned using similarity to database GU values only.
Medium: There is a mass detected and a similar GU in the database for the glycan annotation. However, no peak shifts could be detected - this often happens for smaller peaks.
High: There was a mass detected, a similar GU found in the database, and the peak's movements could be traced from the undigested profile to the final exoglycosidase applied.

An example summary page is shown below. Part (a) shows that peak 3 has two pieces of evidence: an observed mass and a GU similar to the average evidence in public databases. Part (b) shows that peak 4 has three pieces of evidence: mass, shifting peaks and a GU similar to database GU values. Peak 4's N-glycan assignment can be considered to have high confidence, while peak 3's has medium confidence. The weakest level of supporting evidence is GU similarity alone.
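
The three levels can be summarised as a small decision rule. The Python sketch below is one reading of the definitions above, with a hypothetical function name and boolean inputs. Combinations the text does not describe (for example shifts traced without a mass) fall back to 'Weak' here, which is an assumption.

```python
def confidence_level(gu_match, mass_found, shifts_traced):
    """Map the available evidence to Weak, Medium or High."""
    if gu_match and mass_found and shifts_traced:
        return "High"
    if gu_match and mass_found:
        return "Medium"
    return "Weak"  # GU similarity alone, or an undescribed combination


# Peak 3: mass and similar GU; peak 4: mass, shifting peaks and similar GU.
peak_3 = confidence_level(gu_match=True, mass_found=True, shifts_traced=False)
peak_4 = confidence_level(gu_match=True, mass_found=True, shifts_traced=True)
print(peak_3, peak_4)
```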

Clicking on the 'Mass' button returns the mass spectrum for this peak and the possible annotated m/z peaks for the current glycan. Clicking on the 'Shifts' radio button returns the score calculation and all possible UPLC and MS peak shifts. The 'Glucose unit' radio button can be clicked to visualize the GU evidence in GlycoStore.

Summary page action: Score threshold

This allows the user to lower the score threshold and remove any glycan with a score above this threshold. The peaks will be removed by clicking the 'Update assignments' button.

Summary page action: High confidence assignments

This sets the score threshold to 0.1, and glycans can be removed by clicking the 'Update assignments' button. The threshold was determined from our benchmarking experiments.

Summary page action: Highly abundant assignments

This will reject any glycan that did not elute with a peak area > 1%. It was observed in our benchmarking that low peak areas (< 1%) can be difficult to assign. The peaks will be removed by clicking the 'Update assignments' button.
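
The score and abundance filters above amount to simple cut-offs. A minimal Python sketch follows, assuming each accepted assignment carries a score and a % peak area. The dictionaries, field names and helper functions are illustrative only and do not reflect GlycanAnalyzer's internal data model.

```python
# Hypothetical accepted assignments.
assignments = [
    {"peak": 1, "score": 0.05, "area": 12.0},
    {"peak": 2, "score": 0.30, "area": 4.0},
    {"peak": 3, "score": 0.08, "area": 0.6},
]


def keep_by_score(items, threshold):
    """Remove glycans whose score is above the threshold."""
    return [item for item in items if item["score"] <= threshold]


def keep_abundant(items, min_area=1.0):
    """Keep glycans that eluted with a peak area greater than min_area %."""
    return [item for item in items if item["area"] > min_area]


high_confidence = keep_by_score(assignments, 0.1)  # threshold set to 0.1
abundant = keep_abundant(assignments)              # peak area > 1%
print([item["peak"] for item in high_confidence])
print([item["peak"] for item in abundant])
```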

Summary page action: Redo peaks

Use this action when a peak does not look quite right to you: go back and redo the peak on the input page. This often happens when the glycan is also annotated in another peak but with much higher confidence. In this case we suggest redoing this peak and rejecting the glycan that appears in another peak with a better score (i.e. closer to 0).
