In-Depth Proteome Coverage by Iterative Data-Dependent Acquisition on a Benchtop Orbitrap Mass Spectrometer

Mathias Müller, Tabiwang N. Arrey, Florian Große-Coosmann, Thomas Rietpietsch, Andreas Kühn, Catharina Crone, Frank Ciesinski, Torsten Ueckert, Markus Kellmann
Thermo Fisher Scientific, Bremen, Germany

Overview
Purpose: Improved identification of low-copy-number proteins.
Methods: Iterative data-dependent MS/MS bottom-up proteomics on a Thermo Scientific™ Q Exactive™ HF benchtop Orbitrap mass spectrometer.
Results: An iterative acquisition strategy significantly increases the dynamic range for protein identification in complex proteome samples.

Introduction
For large-scale bottom-up tandem MS (MS/MS) protein sequencing, data-dependent TopN methods are widely established. Running numerous replicates increases the number of identified proteins only up to a certain point, because the method prefers high-intensity peptides (belonging to highly concentrated proteins) for precursor triggering. The sequence coverage of higher-abundance proteins increases, but the probability of identifying more low-intensity peptides does not. It has been shown that an iterative approach, in which previously triggered precursors are excluded from the subsequent run, allows deeper sequencing of low-copy-number proteins [1]. Here, we describe a methodology that uses extremely rich inclusion and exclusion lists to significantly increase the identification of low-abundance proteins.

Methods

Sample Preparation
Thermo Scientific™ Pierce™ HeLa Protein Digest Standard was diluted in HPLC-grade H2O (Fisher Scientific) to a final concentration of 0.5 µg/µL. For each run, 1 µg HeLa was injected.

TABLE 1. Liquid Chromatography.
Chromatography Settings
LC Stack: Thermo Scientific™ Dionex™ UltiMate™ 3000 RSLCnano system equipped with nano pump NCS-3500 and autosampler WPS-3000TPL
Mobile Phases: A: 0.1% FA in water; B: 0.1% FA in acetonitrile (Fisher Chemicals)
Gradients: 15 min at 5–10% B, then 10–40% B added up to a final gradient length of 30, 60, or 90 min
Flow Rate: 250 nL/min
Trapping Column: Thermo Scientific™ Acclaim™ PepMap™100 µCartridge Column C18, 300 μm × 0.5 cm, 5 μm, 100 Å (backflush mode)
Separation Column: Acclaim PepMap C18, 75 μm × 50 cm, 2 μm, 100 Å

TABLE 2. Mass Spectrometry: Q Exactive HF.
TopN Properties       Settings Master Run    Settings Iterative Run
Exclusion             –                      on (mass tolerance ±10 ppm)
Resolution Full MS    60,000                 60,000
AGC Target Full MS    3e6                    3e6
Loop Count            20                     20
dd Resolution         15,000                 15,000
dd Target             1e5                    1e5
dd-MS2 max IT         50 ms (“parallel”)     100 ms (“sensitive”)
Isolation Window      1.4 m/z                1.4 m/z
NCE                   28                     28
dd Underfill Ratio    5%                     5%
Peptide Match         preferred              preferred
Dynamic Exclusion     30 s                   30 s
Data Analysis
A triplicate master run was analyzed with the Thermo Scientific™ Proteome Discoverer™ 2.0 search engine SEQUEST® HT against the human Uniprot fasta database (A9609). From the master run with the highest number of protein group IDs, a PSM exclusion list was exported as a *.csv file. This exclusion list was imported into the Exactive Series 2.4 instrument TopN method using the “Iterative Run” settings from Table 2. For the final Proteome Discoverer report, the HeLa protein copy numbers were compared against a data sheet [2].

FIGURE 1. Flow chart of iterative approach for data-dependent triggering of low-abundant precursor ions.
[Flow chart elements, in reading order:
Master runs: triplicate, 1 µg HeLa, “parallel” data acquisition
PD database search (fasta Uniprot A9609)
Highest # of protein groups? no: data not used for further iteration; yes: PSM m/z added to exclusion list (Ex Series ME)
“Sensitive” data acquisition: Iteration 1, Iteration 2, … Iteration N
Further iteration? yes: next iteration; no: PD Report node “copy numbers” (MPI)]
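The exclusion step described above can be illustrated with a minimal sketch. It assumes only that an exclusion list is a set of precursor m/z values and that a candidate precursor is skipped when it lies within ±10 ppm of a listed value (the tolerance from Table 2). The function names and the single-column CSV layout with an `mz` header are hypothetical; they are not the Proteome Discoverer export format or the Exactive Series import format.

```python
import csv
import io
from bisect import bisect_left


def build_exclusion_list(psm_mz_values):
    """Return the sorted, de-duplicated precursor m/z values of all PSMs."""
    return sorted(set(psm_mz_values))


def is_excluded(mz, exclusion_mz, tol_ppm=10.0):
    """True if mz lies within tol_ppm of any entry in the sorted list."""
    i = bisect_left(exclusion_mz, mz)
    # Only the nearest entry below and the nearest entry at/above can match.
    for entry in exclusion_mz[max(i - 1, 0):i + 1]:
        if abs(mz - entry) / entry * 1e6 <= tol_ppm:
            return True
    return False


def write_exclusion_csv(exclusion_mz, handle):
    """Write one m/z value per row (hypothetical layout, not a vendor format)."""
    writer = csv.writer(handle)
    writer.writerow(["mz"])  # hypothetical header name
    for mz in exclusion_mz:
        writer.writerow([f"{mz:.4f}"])


# Made-up m/z values, for illustration only.
exclusion = build_exclusion_list([750.5, 500.0, 500.0])
buffer = io.StringIO()
write_exclusion_csv(exclusion, buffer)
```

In this sketch a precursor at m/z 500.004 (8 ppm from 500.0) would be skipped in the next iteration, while one at m/z 500.006 (12 ppm) would remain eligible for triggering.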
Results

Iterative Approaches
1) Classic: All PSMs from a master run are imported as an exclusion list for the next iteration.
2) Alternating bins: The master scan range from method 1) is divided into five m/z scan ranges (bins), each containing the same number of PSMs (see Figure 2). The five bins are acquired alternately in the master scan of one TopN method.
3) Merged bins: Five TopN methods are created, each using one bin of method 2). The five resulting raw files are processed into one Proteome Discoverer *.msf file.

FIGURE 2. Scatter plot of all peptide spectral matches (PSMs), m/z against retention time (RT), from a 30 min gradient HeLa master run. The scan range is divided into five bins containing the same number of PSMs (2131 each) and MS/MS search inputs (3284 each). The distributions of PSM bins and MS/MS bins correlate well.
[Legend: MSMS bin size: 3284 SEs; PSM bin size: 2131 SEs]

Comparison of Iterative Approaches with Triplicate Master Runs: Protein IDs
Reducing the number of potential MS/MS candidates by narrowing the full MS scan range is a common technique for triggering a larger number of low-intensity precursor ions [3]. Here, three iterative approaches using two iterations each were evaluated: classic, alternating bins, and merged bins. The results were compared to a triplicate master run. Gradients of 30 min, 60 min, and 90 min were run in duplicate.
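The equal-count binning behind the alternating and merged bin approaches can be sketched as follows. This is an illustration under stated assumptions, not the procedure used for the poster: it assumes the bin edges are placed midway between neighbouring PSM m/z values at the quantile cut points, and the function names are hypothetical.

```python
def equal_count_bins(mz_values, n_bins=5):
    """Split the m/z range into n_bins ranges holding equal numbers of PSMs.

    Edges are placed midway between the two PSMs on either side of each cut
    (an assumption made for this sketch).
    """
    mzs = sorted(mz_values)
    if len(mzs) < n_bins:
        raise ValueError("need at least one PSM per bin")
    edges = [mzs[0]]
    for k in range(1, n_bins):
        cut = k * len(mzs) // n_bins
        edges.append((mzs[cut - 1] + mzs[cut]) / 2)
    edges.append(mzs[-1])
    return list(zip(edges[:-1], edges[1:]))


def count_per_bin(mz_values, bins):
    """Count PSMs per bin; the last bin also includes its upper edge."""
    counts = []
    for idx, (lo, hi) in enumerate(bins):
        last = idx == len(bins) - 1
        counts.append(sum(lo <= mz < hi or (last and mz == hi) for mz in mz_values))
    return counts


# Made-up PSM m/z values, for illustration only.
psm_mz = [400.0 + i for i in range(10)]
bins = equal_count_bins(psm_mz, n_bins=5)
counts = count_per_bin(psm_mz, bins)
```

With real data, repeated m/z values at a cut point can make the counts differ slightly between bins, so the bins are only approximately equal in that case.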
FIGURE 3. Venn diagrams of SEQUEST HT results comparing identified protein groups (upper panel) and identified peptide groups (lower panel) for triplicate master runs and three different iterative methods using two iterations each. Gradient length: 60 min.
[Diagram labels: Iterative Classic, Iterative Alternating Bins, Iterative Merged Bins, Triplicate Master]

FIGURE 4. Percentage increase or decrease of SEQUEST HT protein groups and peptide groups obtained from two iterations of HeLa runs compared to a triplicate master run of the same gradient length. Two iterations of merged bins give the best results.
[Bar chart “IDs of Iterative Runs compared to Triplicate Master Runs”; series: Benefit Protein Groups ID [%], Benefit Peptide Groups ID [%]; gradients: 30 min, 60 min, 90 min]
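The percentage benefit plotted in Figure 4 can be read as a relative change in the number of identifications. The sketch below assumes, without confirmation from the source, that identifications are pooled as the union over all runs of a set before the comparison; the function names and the protein IDs are made up for illustration.

```python
def union_ids(runs):
    """Pool the identifications of several runs into one set."""
    return set().union(*runs)


def percent_benefit(iterative_runs, master_runs):
    """Relative change (in %) of pooled iterative IDs versus pooled master IDs."""
    n_iterative = len(union_ids(iterative_runs))
    n_master = len(union_ids(master_runs))
    if n_master == 0:
        raise ValueError("master runs contain no identifications")
    return (n_iterative - n_master) / n_master * 100.0


# Made-up protein group IDs, for illustration only.
master = [{"P1", "P2"}, {"P2", "P3"}, {"P1", "P3", "P4"}]
iterative = [{"P1", "P2", "P3"}, {"P4", "P5"}]
benefit = percent_benefit(iterative, master)
```

A positive value means the iterative runs identified more groups than the triplicate master run; a negative value means fewer.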
Deeper Sequencing Using Iterative Approach
Using an iterative approach, low-intensity precursor ions are preferentially triggered in a TopN experiment. Compared to a standard triplicate run, a larger number of proteins with copy numbers ranging from 1e3 to 1e5 is identified (see Figure 5). This effect becomes more pronounced at gradients longer than 60 minutes.

FIGURE 5. Comparison of HeLa protein copy numbers [2] using an iterative approach with five merged scan range bins (yellow bars) compared to triplicate master runs (blue bars) and the reference database results (grey bars).
[Panels: HeLa Protein Copy Numbers 30 min, 60 min, 90 min; x axis: Bin: Copy Number (1/4 Order of Magnitude); y axis: Counts, with zoomed views (Counts (Zoom)); series: Reference, Iterative, Triplicate]
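The x axis of Figure 5 groups proteins into bins of one quarter order of magnitude in copy number. A minimal sketch of such a binning is shown below; the bin index convention (bin k covers copy numbers from 10^(k/4) up to, but not including, 10^((k+1)/4)) and the function names are assumptions made for illustration.

```python
import math
from collections import Counter


def quarter_decade_bin(copy_number):
    """Index k of the bin [10**(k/4), 10**((k+1)/4)) containing copy_number."""
    if copy_number <= 0:
        raise ValueError("copy number must be positive")
    return math.floor(4 * math.log10(copy_number))


def bin_lower_edge(k):
    """Lower copy-number edge of bin k."""
    return 10 ** (k / 4)


def histogram_by_bin(copy_numbers):
    """Count proteins per quarter-decade bin, ordered by bin index."""
    counts = Counter(quarter_decade_bin(c) for c in copy_numbers)
    return dict(sorted(counts.items()))


# Made-up copy numbers, for illustration only.
hist = histogram_by_bin([1100, 1500, 2000, 50000])
```

Comparing such histograms for the reference data, the triplicate master runs, and the iterative runs is one way to see in which copy-number range additional proteins are identified.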
Preferred Triggering of Low-intense Precursor Ions in a TopN Experiment
The histograms in Figure 6 indicate that the iterative approach is effective for precursor ion intensities below 1e6. The higher-intensity precursor ions in the iterative data derive from the first master run, which is needed to define a global exclusion list for the subsequent iterations.

FIGURE 6. Histogram comparing all PSM precursor ion intensities of triplicate master run (blue bars) with iterative runs (red bars). Left panels show the complete distribution for 30, 60, and 90 min gradients. Right panels show zoomed y axes.
[Panels: Iterative vs. Triplicate at 30 min, 60 min, 90 min; x axis: Bin: Intensity; y axis: Counts]
Conclusion
Compared to a standard triplicate TopN experiment on the Q Exactive HF, the iterative approach shows the following benefits:
• Optimal iterative conditions using smaller scan ranges for master scans
• Higher number of protein group IDs
• Deeper sequencing in a protein copy number range between 1e3 and 1e5

References
1. Chen, H.; Rejtar, T.; Andreev, V.; Moskovets, E.; Karger, B. L. Anal. Chem. 2005, 77, 7816–7825.
2. Kulak, N. A.; Pichler, G.; Paron, I.; Nagaraj, N.; Mann, M. Nat. Methods 2014, 11, 319–324.
3. Scherl, A.; Shaffer, S. A.; Taylor, G. K.; Kulasekara, H. D.; Miller, S. I.; Goodlett, D. R. Anal. Chem. 2008, 80(4), 1182–1191.
Thermo Scientific Poster Note • PN-64120-ASMS-EN-0614S

www.thermoscientific.com
©2014 Thermo Fisher Scientific Inc. All rights reserved. ISO is a trademark of the International Standards Organization. SEQUEST is a registered trademark of the University of Washington. All other trademarks are the property of Thermo Fisher Scientific and its subsidiaries. This information is presented as an example of the capabilities of Thermo Fisher Scientific products. It is not intended to encourage use of these products in any manner that might infringe the intellectual property rights of others. Specifications, terms and pricing are subject to change. Not all products are available in all countries. Please consult your local sales representative for details.

Thermo Fisher Scientific, San Jose, CA USA is ISO 9001:2008 Certified.

Africa +43 1 333 50 34 0
Australia +61 3 9757 4300
Austria +43 810 282 206
Belgium +32 53 73 42 41
Canada +1 800 530 8447
China 800 810 5118 (free call domestic), 400 650 5118
Denmark +45 70 23 62 60
Europe-Other +43 1 333 50 34 0
Finland +358 9 3291 0200
France +33 1 60 92 48 00
Germany +49 6103 408 1014
India +91 22 6742 9494
Italy +39 02 950 591
Japan +81 45 453 9100
Latin America +1 561 688 8700
Middle East +43 1 333 50 34 0
Netherlands +31 76 579 55 55
New Zealand +64 9 980 6700
Norway +46 8 556 468 00
Russia/CIS +43 1 333 50 34 0
Singapore +65 6289 1190
Spain +34 914 845 965
Sweden +46 8 556 468 00
Switzerland +41 61 716 77 00
UK +44 1442 233555
USA +1 800 532 4752

PO64120-EN 0614S
PN-64120-EN-0614S