Glycopeptide Identification with Byonic™ Software

Marshall Bern
www.proteinmetrics.com
Protein Metrics – ASMS Asilomar, Oct. 2014
Byonic™ is an advanced proteomics search engine
1. Higher Sensitivity
[Figure: bar chart of # Spectra identified (# No Mods vs. # Common Mods (^q,^c,m,n,q)) for Byonic and the other labeled entries. ABRF iPRG Study on PTM finding, 2012]
2. Topdown Proteomics
IgG1 Fc with glycan and M[+16]’s
3. “Expert system” annotation
216 Da = Characteristic peak for phospho-Tyrosine
4. Multiple Identifications per MS2 scan
5. Glycopeptides!
Outline for Rest of the Talk
1. How does Byonic work?
2. How to run Byonic
3. Interesting examples
Byonic score is a sum of benefits and penalties
Mouse brain synaptosome, run on Q-Exactive
Data from Kati Medzihradszky, UCSF
AT1B2_MOUSE peptide with man6
204 Da for HexNAc
Y1 ion
~y12 means loss of full glycan
y12 means glycan on
Byonic annotates glycan (and peptide + glycan) peaks that match glycopeptide pieces for any reasonable glycan topology
• Byonic will annotate an HCD peak at 657 Da for any glycan with HexNAc, Hex, NeuAc; 512 Da for any glycan with HexNAc, Hex, and Fuc; etc.
• Byonic will annotate Pep+HexNAc for any glycopeptide with HexNAc
Mouse brain Contactin-1 glycopeptide
HexNAc(5)Hex(4)Fuc(2)
512 Da = Diagnostic peak for antennal fucose
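The peak values quoted above (204, 512, and 657 Da) can be reproduced from residue masses. The following is a minimal sketch, assuming standard monoisotopic residue masses and singly protonated fragments; the table and the function name `oxonium_mz` are illustrative and are not part of Byonic.

```python
# Monoisotopic residue masses in Da (standard values, not taken from the slides).
RESIDUE_MASS = {
    "HexNAc": 203.07937,
    "Hex": 162.05282,
    "Fuc": 146.05791,
    "NeuAc": 291.09542,
}
PROTON = 1.007276  # mass of a proton in Da


def oxonium_mz(composition):
    """Return the m/z of a singly protonated glycan fragment (hypothetical helper)."""
    neutral = sum(RESIDUE_MASS[name] * count for name, count in composition.items())
    return neutral + PROTON


fragments = {
    "HexNAc": {"HexNAc": 1},
    "HexNAc+Hex+Fuc": {"HexNAc": 1, "Hex": 1, "Fuc": 1},
    "HexNAc+Hex+NeuAc": {"HexNAc": 1, "Hex": 1, "NeuAc": 1},
}
for label, composition in fragments.items():
    print(f"{label}: {oxonium_mz(composition):.3f}")
```

Rounded to the nearest integer, the three values correspond to the 204, 512, and 657 Da peaks named on the slides.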
Is it possible to localize O-glycans with CID?
HexNAc(2) Hex(2) NeuAc(2) = 1312.455 Da
rEPO made in HEK cells
Run on Orbitrap Velos
HCD NCE = 35%
Data from Khoo, Academia Sinica, Taiwan
• ~y11 + 2 HexNAc rules out two HexNAc-Hex-NeuAc’s
• Byonic placed the O-glycan correctly (AASAA not AISPP) on 178 of 185 observations of the peptide
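Why fragment evidence is needed here can be seen from the masses alone: one HexNAc(2)Hex(2)NeuAc(2) glycan and two separate HexNAc-Hex-NeuAc glycans add the same total mass to the peptide. The sketch below checks this arithmetic, assuming standard monoisotopic residue masses; `glycan_mass` is a hypothetical helper, not a Byonic function.

```python
# Monoisotopic residue masses in Da (standard values, not taken from the slides).
RESIDUE_MASS = {"HexNAc": 203.07937, "Hex": 162.05282, "NeuAc": 291.09542}


def glycan_mass(composition):
    """Sum residue masses for a glycan composition given as {residue: count}."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items())


# One larger glycan on a single site.
one_large = glycan_mass({"HexNAc": 2, "Hex": 2, "NeuAc": 2})
# Two smaller glycans on two different sites.
two_small = 2 * glycan_mass({"HexNAc": 1, "Hex": 1, "NeuAc": 1})

print(f"one HexNAc(2)Hex(2)NeuAc(2): {one_large:.3f} Da")
print(f"two HexNAc-Hex-NeuAc:        {two_small:.3f} Da")
```

Both totals agree with the 1312.455 Da on the slide, so the precursor mass cannot separate the two cases and the annotated fragments have to.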
~y11 + HexNAc
Pep_1+
~y11 + HexNAc Hex
~y12 + HexNAc
Pep – 18
~y11 + 2 HexNAc
2. How to run Byonic
Modification Fine Control
At most 3 common modifications per peptide
At most 1 rare modification per peptide
Allow up to three M[+16]’s but only one W[+32]
Glycan modifications follow the same rules
Pre-defined database of ~300 N-glycan compositions
Rare1 → at most one per peptide
A peptide can have one man5, one man6, and one man7, along with a rare modification
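The rules above can be sketched as a small check on a candidate set of modifications. This is an illustration only: the rule table, the function name `is_allowed`, and the assumption that M[+16] is set to "common3" and the mannose glycans to "common1" are hypothetical, and this is not Byonic's configuration format or API.

```python
from collections import Counter

# Hypothetical rule table: modification name -> (class, max occurrences per peptide).
RULES = {
    "M[+16]": ("common", 3),
    "W[+32]": ("rare", 1),
    "man5": ("common", 1),
    "man6": ("common", 1),
    "man7": ("common", 1),
}


def is_allowed(mods, rules=RULES, total_common_max=3, total_rare_max=1):
    """Return True if a list of modification names satisfies every limit."""
    totals = Counter()
    for name, count in Counter(mods).items():
        kind, per_peptide_limit = rules[name]
        if count > per_peptide_limit:
            return False  # e.g. two man5 on one peptide under "common1"
        totals[kind] += count
    return totals["common"] <= total_common_max and totals["rare"] <= total_rare_max


print(is_allowed(["man5", "man6", "man7", "W[+32]"]))  # three common plus one rare
print(is_allowed(["man5", "man5"]))                    # same glycan twice
print(is_allowed(["M[+16]"] * 3 + ["man5"]))           # four common modifications
```

Under these assumed settings the first candidate passes and the other two are rejected.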
Glycan database size and modification fine control
can have a huge impact on search size!
HIV gp120 sequence:
…GK.LICTTAVPWNASWSNK.SLEDIWDNMTWMQWER.EIDNYTNT…
• 300 N-glycans, all rare1, total rare max 1, no missed cleavages
  → LICT…NK peptide gives ~600 glycoforms
• 300 N-glycans, common1, total common max 3, no missed cleavages
  → LICT…NK peptide gives 300 × 300 = 90,000 glycoforms
• 300 N-glycans, common1, total common max 3, one missed cleavage
  → LICT…ER peptide gives 300 × 300 × 300 = 2.7 × 10^7 glycoforms
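The growth in search size can be checked with a short count. The sketch below is a simplification under stated assumptions: the site counts (2 for LICT…NK, 3 for LICT…ER) are inferred from the slide's arithmetic, unoccupied sites are counted as well, and the "at most one of each glycan" limit is ignored, so the totals come out slightly above the slide's round figures. `count_glycoforms` is a hypothetical helper.

```python
from math import comb


def count_glycoforms(n_sites, n_glycans, max_glycans):
    """Count ways to place 0..max_glycans glycans on n_sites sites.

    Each occupied site takes any of n_glycans compositions; per-glycan
    limits such as "common1" are deliberately ignored in this sketch.
    """
    top = min(n_sites, max_glycans)
    return sum(comb(n_sites, k) * n_glycans**k for k in range(top + 1))


print(count_glycoforms(n_sites=2, n_glycans=300, max_glycans=1))  # rare1, total rare max 1
print(count_glycoforms(n_sites=2, n_glycans=300, max_glycans=3))  # total common max 3
print(count_glycoforms(n_sites=3, n_glycans=300, max_glycans=3))  # one missed cleavage
```

The dominant term in each case is the fully occupied one (600, 300 × 300, and 300 × 300 × 300), which is the figure quoted on the slide.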
Glycan database size and modification fine control
can have a huge impact on search size!
Human Ig Alpha-1 Sequence:
…HVK.HYTNPSQDVTVPCPVPSTPPTPSPSTPPTPSPSCCHPR.LSH…
• 10 O-glycans, common5, Total common max = 5 → HYT…PR peptide gives (12 choose 5) × 5^10 = 7,734,375,000 glycoforms
• 100 different sums of O-glycan masses, rare1 → HYT…PR peptide gives 1200 glycoforms (but no site localization)
• Can search anything in between – e.g., try to resolve the O-glycan on only the first and last S/T
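The "12" in these counts is the number of S/T residues in the HYT…PR peptide. A minimal sketch that counts the candidate sites and reproduces the site-choice and rare1 figures, assuming every S and T is a potential O-glycosylation site:

```python
from math import comb

peptide = "HYTNPSQDVTVPCPVPSTPPTPSPSTPPTPSPSCCHPR"

# 1-based positions of every Ser/Thr, treated here as candidate O-glycosylation sites.
sites = [position for position, residue in enumerate(peptide, start=1) if residue in "ST"]

n_sites = len(sites)
ways_to_choose_five = comb(n_sites, 5)  # the "(12 choose 5)" factor
rare1_forms = n_sites * 100             # one of 100 mass sums on one site

print(n_sites, ways_to_choose_five, rare1_forms)
```

Restricting the search to the first and last S/T, as the last bullet suggests, would replace `sites` with `[sites[0], sites[-1]]` in a count like this.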
Glycan databases
• Use any text editor to write/edit
• Use literature, prior knowledge, glycomics experiments, etc.
• Setting the database to common1 is the same as setting each glycan in it to common1.
• Big impact on performance!
• Byonic keywords: Hex, HexNAc, NeuAc, NeuGc, Fuc, dHex, GlcNAc, GalNAc, Man, Glc, Gal, Pent, Xyl, GlcA, IdoA, Sodium, Na, Sulfo, Methyl, Acetyl, Phospho, DiNAcBac, Pseudaminic, Legionaminic, Etn, pEtn, …
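When editing a glycan list by hand, it can help to check each composition against the keyword list before searching. The sketch below parses compositions written in the Name(count) notation used on these slides; the exact Byonic database file format is not shown here, so the one-composition-per-string input and the function name `parse_composition` are assumptions.

```python
import re

# Keywords copied from the slide; the slide's list ends with "…", so it is incomplete.
KEYWORDS = {
    "Hex", "HexNAc", "NeuAc", "NeuGc", "Fuc", "dHex", "GlcNAc", "GalNAc",
    "Man", "Glc", "Gal", "Pent", "Xyl", "GlcA", "IdoA", "Sodium", "Na",
    "Sulfo", "Methyl", "Acetyl", "Phospho", "DiNAcBac", "Pseudaminic",
    "Legionaminic", "Etn", "pEtn",
}
TOKEN = re.compile(r"\s*([A-Za-z]+)\((\d+)\)")


def parse_composition(text):
    """Parse e.g. 'HexNAc(5)Hex(4)Fuc(2)' into {'HexNAc': 5, 'Hex': 4, 'Fuc': 2}."""
    text = text.strip()
    composition = {}
    position = 0
    while position < len(text):
        match = TOKEN.match(text, position)
        if match is None:
            raise ValueError(f"cannot parse {text!r} at offset {position}")
        name, count = match.group(1), int(match.group(2))
        if name not in KEYWORDS:
            raise ValueError(f"unknown keyword {name!r} in {text!r}")
        composition[name] = composition.get(name, 0) + count
        position = match.end()
    return composition


print(parse_composition("HexNAc(5)Hex(4)Fuc(2)"))
print(parse_composition("HexNAc(2) Hex(2) NeuAc(2)"))
```

A misspelled or unsupported name raises a `ValueError` instead of silently producing a wrong composition.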
One more option, meant for “one-protein” samples
Uncheck this box
Glycopeptide assignments will be shown
regardless of Byonic score or FDR
Practical tips for Glycoproteomics
• “Right size” the protein database
  • Include all the proteins in the sample, but not too many more.
• “Right size” the glycan database by iterative search
  • Start from a small glycan database and work up.
  • Start large and work down by removing glycans with little support.
• Be careful with “Total common max” – start small (like 1 or 2) and work up only if you need to
• Manually check all glycopeptide assignments prone to error
  • Glycans containing 2 or more Fuc’s or 1 Fuc and 1 NeuGc
  • Glycopeptides with close-together oxidations and glycosylations
  • Glycopeptides with 2 or more potential glycosylation sites
3. Interesting examples
Bacterial Glycosylation
Campylobacter jejuni, run on Orbitrap Velos with HCD fragmentation, HCD NCE = 45%
Data from Nick Scott, UBC
Nothaft et al, Mol Cell Proteomics 11 (11), 2012
HexNAc(5)Hex(1) diNAcBacillosamine(1)
Y1_1+
Glycopep_2+
Glycan contains diNAcBac → Byonic looks for Y1 = Peptide + diNAcBac (228 Da)
O-glycosylation (and N-terminal alkylation!) on MHC class I peptides
Human B cell line, run on Orbitrap Elite with EThcD fragmentation, HCD NCE = 32%
Data from Albert Heck, Utrecht
Mommen et al, PNAS 111 (12), 2014
from RNA-binding protein 27
(cytoplasm and nucleus!)
Pep_1+
[+28]IPRPPIT[+365]QSSL
[+28]IPRPPIT[+656]QSSL
Pep_1+
Note: [+14/+28]RPPIT[+203]QSSL (called O-GlcNAc)
found in leukemia samples [Hunt lab, ASMS 2014]
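The nominal mass shifts in these peptides can be related to glycan compositions by brute-force search over a few residue types. This is a sketch under stated assumptions: the slide does not name the compositions behind [+365] and [+656], only four residue types are considered, and the masses are standard monoisotopic values. `candidate_compositions` is a hypothetical helper.

```python
from itertools import product

# Monoisotopic residue masses in Da (standard values, not taken from the slides).
RESIDUE_MASS = {
    "HexNAc": 203.07937,
    "Hex": 162.05282,
    "Fuc": 146.05791,
    "NeuAc": 291.09542,
}


def candidate_compositions(nominal_delta, max_each=3, tolerance=0.5):
    """List compositions whose mass lies within tolerance of a nominal mass shift."""
    names = list(RESIDUE_MASS)
    hits = []
    for counts in product(range(max_each + 1), repeat=len(names)):
        if not any(counts):
            continue  # skip the empty composition
        mass = sum(RESIDUE_MASS[name] * count for name, count in zip(names, counts))
        if abs(mass - nominal_delta) <= tolerance:
            hits.append({name: count for name, count in zip(names, counts) if count})
    return hits


for delta in (203, 365, 656):
    print(delta, candidate_compositions(delta))
```

Within this restricted residue set, each of the three shifts has a single candidate: HexNAc for +203 (consistent with the O-GlcNAc note), HexNAc + Hex for +365, and HexNAc + Hex + NeuAc for +656.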
O-glycosylation (without N-terminal alkylation) on MHC class I peptides
Human B cell line, run on Orbitrap Elite with EThcD fragmentation, HCD NCE = 32%
Data from Albert Heck, Utrecht
Mommen et al, PNAS 111 (12), 2014
IPRPPIT[+365]QSSL
c10
Pep_1+
from RNA-binding protein 27
(cytoplasm and nucleus!)
Summary
• Byonic offers:
  • Glycopeptide search in a full-service proteomics search engine
  • Support for all types of fragmentation and mass analysis
  • Special features for one-protein samples
• Byonic is keeping up with MS hardware.
  • If you can fragment it, we can identify it!