Combining fragmentation strategies for next-generation proteomics

Proteomics – the large-scale study of proteins and how they interact – underpins modern biology and medicine. Mass spectrometry, an analytical technique that identifies protein sequences by breaking peptide chains into smaller segments and measuring their masses, allows scientists to determine which proteins are present in a biological sample and helps them to work out what those proteins do.

Determining the order of amino acids in the primary chain of a protein is now relatively straightforward. To understand the context and function of a protein, however, one must also characterise its post-translational modifications (PTMs) – chemical changes that occur after the primary chain is synthesised. PTMs are nature’s way of modifying protein behaviour, and they span an enormous range of chemical and structural properties. Proteins carrying complex PTMs that regulate biological function remain a major challenge for proteomics, and many are difficult to characterise in full detail using existing methods.

A study published today in Nature Methods reports a new approach that combines advances in instrumentation with machine learning and deep learning to address this challenge.

This study is part of a longstanding project led by Prof Shabaz Mohammed at the University of Oxford in collaboration with the Rosalind Franklin Institute and international partners. The team have built a hybrid mass spectrometry platform that combines a variety of analytical techniques in sequence, alongside new machine-learning and deep-learning algorithms, to analyse a wide range of biological substrates.

In a typical mass-spectrometry experiment, proteins are chopped into smaller pieces called peptides using enzymes, ionised, and analysed based on their mass-to-charge ratios. To determine the amino acid sequence of a peptide, these ions must be further fragmented in controlled ways, producing patterns that can be interpreted computationally. Traditionally, most workflows have relied on a single dominant fragmentation approach: rapid heating through gas-phase collisions. This approach has been successful, but it becomes less useful when PTMs alter the fragmentation behaviour of a peptide so much that little meaningful sequence information can be recovered. Alternative fragmentation approaches have been developed, but they have proved challenging to implement, and uptake has been modest.
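To make the idea of mass-to-charge ratios and fragment patterns concrete, here is a minimal Python sketch. It is not the team's software: the function names and the example sequence PEPTIDE are illustrative assumptions. It uses standard monoisotopic residue masses and computes the b and y ions, the fragment series typically produced by collisional fragmentation of the peptide backbone.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O, added once per intact peptide
PROTON = 1.007276   # mass of the charge-carrying proton


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def mz(neutral_mass, charge):
    """Mass-to-charge ratio of a protonated ion with the given charge."""
    return (neutral_mass + charge * PROTON) / charge


def b_ions(seq):
    """Singly charged b ions (N-terminal fragments), b1 to b(n-1)."""
    ions, running = [], 0.0
    for aa in seq[:-1]:
        running += RESIDUE_MASS[aa]
        ions.append(running + PROTON)
    return ions


def y_ions(seq):
    """Singly charged y ions (C-terminal fragments), y1 to y(n-1)."""
    ions, running = [], WATER
    for aa in reversed(seq[1:]):
        running += RESIDUE_MASS[aa]
        ions.append(running + PROTON)
    return ions


if __name__ == "__main__":
    seq = "PEPTIDE"  # illustrative sequence, not taken from the study
    print("doubly charged precursor m/z:", round(mz(peptide_mass(seq), 2), 4))
    print("b ions:", [round(m, 4) for m in b_ions(seq)])
    print("y ions:", [round(m, 4) for m in y_ions(seq)])
```

A PTM shifts the mass of the residue that carries it, so every fragment containing that residue shifts too. This is how a fragment series can, in principle, localise a modification, and why a poor fragment series makes PTMs hard to pin down.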

This has been a significant limitation: no single fragmentation method captures all aspects of a sequence equally well, so multiple sets of experiments are often required. The team’s new platform integrates multiple complementary methods within one instrument, combining fragmentation driven by infrared light, ultraviolet light and electron sources. By enabling these different modes to be applied selectively and in a controlled manner, the system can generate richer and more informative datasets from the same sample.
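The benefit of complementary methods can be illustrated with a toy calculation. The Python sketch below assumes each method reports the backbone bonds at which it observed a cleavage; the site numbers are invented for illustration and are not results from the study. It compares the sequence coverage of each method alone with the coverage of all three combined.

```python
def coverage(sites, peptide_length):
    """Fraction of backbone bonds with at least one observed cleavage.

    A peptide of n residues has n - 1 backbone bonds, numbered 1 to n - 1.
    """
    bonds = peptide_length - 1
    if any(site < 1 or site > bonds for site in sites):
        raise ValueError("cleavage site outside the peptide backbone")
    return len(sites) / bonds


# Hypothetical cleavage sites for an 8-residue peptide (illustrative only).
peptide_length = 8
observed = {
    "infrared": {1, 2, 3, 4},
    "ultraviolet": {2, 5, 6},
    "electron": {3, 6, 7},
}

# Coverage achieved by each method on its own.
per_method = {name: coverage(sites, peptide_length)
              for name, sites in observed.items()}

# Coverage achieved by pooling the cleavage sites from all methods.
combined_sites = set().union(*observed.values())
combined = coverage(combined_sites, peptide_length)

if __name__ == "__main__":
    for name, value in per_method.items():
        print(f"{name}: {value:.0%}")
    print(f"combined: {combined:.0%}")
```

In this made-up case no single method cleaves every bond, but together they cover the whole backbone, which is the kind of gain that applying several modes to the same sample is meant to deliver.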

The Omnitrap attached to the rear of an Exploris 480 mass spectrometer. Image: Prof Shabaz Mohammed.

Their approach uses the Omnitrap, a segmented ion trap that allows different fragmentation processes to be applied in distinct regions of the instrument. This modular design makes it possible to optimise each technique independently, before coupling the results to a high-resolution Orbitrap analyser. The result is a system that moves beyond both bespoke experimental setups and current commercial instruments, combining flexibility with practical utility.

Analysing the complex experimental data was a key challenge in developing the platform. The researchers developed a deep learning model capable first of annotating, and ultimately of predicting, fragmentation patterns across multiple techniques. Trained on large-scale datasets generated using the instrument, the model improves the identification of proteins and peptides and enables more effective analysis of complex mixtures. The algorithm has been made freely available to the wider scientific community, for use on data from any mass spectrometry instrument.

The implications of the team’s work are wide-ranging. The ability to combine multiple fragmentation strategies improves coverage of protein sequences and enhances the detection of the PTMs that control protein function and are central to many diseases. Their instrumental approach is applicable to a range of biomolecules, including nucleotides, glycosylated proteins, lipids and intact proteins, and the hope is that it will be able to support studies from basic biology to drug development.

https://www.youtube.com/embed/ENikeverAJM

Looking ahead, the team aims to extend the capabilities of the platform further. Ambitions include developing approaches for highly complex PTM contexts such as viral glycoproteins, and improving efficiency and sensitivity towards single-cell levels.

Co-lead author Prof Shabaz Mohammed said:

The development of this platform and data analysis tools now allows us to explore biomolecules that have proven elusive to mass spectrometry. The world of PTMs is multiple times more complex than the humble protein sequence, and it is exciting we can now work towards identifying, characterising and studying proteins closer to their true form in the cell.

Co-lead author Prof Mathias Wilhelm, of the Technical University of Munich, said:

Only by combining deep experimental datasets with modern machine learning could we unlock the full potential of these advanced fragmentation methods. This work shows how powerful wet- and dry-lab collaboration can be. What excites me is that researchers can now explore richer fragmentation methods within familiar analysis pipelines.

You can read more about the team’s new method in Nature Methods.