When the first draft of the human genome was completed in the early 2000s, many hoped the ensuing genomics revolution would usher in a new era in drug development. Researchers would use their understanding of the genome to create personalized “precision medicines” that target diseases at their root genetic causes and have a high likelihood of clinical success. However, this has not been possible for most diseases, and only around 13% of all drugs entering clinical trials make it to market (Wong et al. 2019). Billions of dollars still go into developing drugs that ultimately fail (Wouters et al. 2020). Here we describe how proteomics can radically improve the drug development process.
The proteomics revolution is upon us and has the potential to accelerate the development of precision medicines
The promises of proteomics are many. By knowing exactly what proteins are in a given cell, it will be easier to:
• Understand the mechanisms of disease
• Identify possible drug targets
• Make drugs that effectively treat disease with fewer side effects
Until now, the technologies available for measuring the proteome have fallen behind the promise. These technologies classically rely on mass spectrometry. They have pushed the field forward and made the importance of the proteome clear, but they are not accessible to many researchers and generate data that can be difficult to analyze.
This is perhaps unsurprising given the complexity of the proteome and the challenges involved in quantifying it. High-abundance proteins can easily drown out low-abundance proteins in individual cells, and proteins themselves are built from 20 different amino acids that interact with one another in intricate ways in 3D space.
In addition, single genes can encode many different forms of proteins that can be modified in various ways. We call these many different versions of a single protein “proteoforms.” Thanks to proteoforms, the thousands of protein coding genes in the human genome can generate millions of proteins.
On top of all that, the proteome is in constant flux. Proteins move around the cell and their abundances change over time. While the genome you’re born with is the genome you die with, the proteome changes from day to day and from cell to cell.
At Nautilus, we’re developing a Proteome Analysis Platform designed to demystify this complexity using single-molecule analysis and machine learning. Our platform aims to leverage massive arrays composed of billions of landing pads that isolate single proteins (one protein per landing pad). These are similar to the ever-improving microchips in your computer. We repeatedly flow fluorescent affinity probes over each protein on an array, and these probes selectively bind to motifs found in some fraction of the proteins. Using optical imaging techniques, we can look at each landing pad and determine whether binding has occurred.
Our platform is designed to measure many proteins at once through affinity probes that bind to multiple protein species, but this also makes binding measurements noisy. This is where machine learning comes in. Sometimes an affinity probe will fail to bind a protein even if it has a target motif. Other times affinity probes will bind to proteins that don’t have their target motifs. Machine learning algorithms are designed to assess these stochastic binding events and determine which of a known set of proteins is most likely on a landing pad. In doing so, we expect our technology will effectively extract signal from noise – something researchers would have a hard time doing on their own.
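To make the idea concrete, here is a minimal sketch of this kind of probabilistic decoding. It is not the platform's actual algorithm: it assumes a simple naive Bayes model in which every probe has the same chance of binding a protein that carries its motif (`p_hit`) and the same chance of binding one that does not (`p_off`). The function name, the candidate proteins, their motif patterns, and both probabilities are hypothetical values chosen only for illustration.

```python
import math

def decode_pad(observed, candidates, p_hit=0.9, p_off=0.05):
    """Pick the most likely protein on one landing pad.

    observed:   list of bools, True if binding was seen for that probe.
    candidates: dict mapping protein name -> list of bools, True if the
                protein contains the motif targeted by that probe.
    p_hit:      assumed chance a probe binds when its motif is present.
    p_off:      assumed chance a probe binds when its motif is absent.
    """
    log_like = {}
    for name, has_motif in candidates.items():
        total = 0.0
        for seen, expected in zip(observed, has_motif):
            p_bind = p_hit if expected else p_off
            total += math.log(p_bind if seen else 1.0 - p_bind)
        log_like[name] = total

    # Convert log-likelihoods to posteriors under a uniform prior.
    top = max(log_like.values())
    weights = {n: math.exp(v - top) for n, v in log_like.items()}
    norm = sum(weights.values())
    posterior = {n: w / norm for n, w in weights.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior

# Hypothetical motif patterns for three made-up proteins and four probes.
candidates = {
    "protein_A": [True, True, False, False],
    "protein_B": [False, False, True, True],
    "protein_C": [True, False, True, False],
}
# The last observation is an off-target binding event (noise).
observed = [True, True, False, True]
best, posterior = decode_pad(observed, candidates)
print(best, posterior)
```

Even with one spurious binding event, the evidence from the other probes still points to a single candidate. A real decoder would need per-probe binding rates and a much larger candidate set, but the principle of weighing many noisy observations together is the same.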
We aim to achieve unprecedented sensitivity and dynamic range through the simultaneous, repeated analysis of billions of protein molecules. By tallying up the proteins on each landing pad, we hope to build a full picture of the proteome from data collected at the scale of single protein molecules and deliver the user a simple list of proteins and their abundances. The platform is even designed to look at very low-abundance proteins whose signals would be drowned out by high-abundance proteins on other analysis platforms. This is the biological equivalent of finding a needle in a haystack!
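The tallying step can be sketched in a few lines. This is an illustrative simplification, not a description of the platform's software: it assumes each landing pad has already been assigned a protein name and a confidence score, and the function name, the 0.9 threshold, and the protein names are all hypothetical.

```python
from collections import Counter

def tally_abundances(pad_calls, min_confidence=0.9):
    """Count confident per-pad identifications into an abundance table.

    pad_calls:      iterable of (protein_name, confidence) pairs, one per pad.
    min_confidence: assumed cutoff below which a pad is left uncounted.
    Returns a dict mapping protein name -> (count, fraction of counted pads).
    """
    counts = Counter(
        name for name, confidence in pad_calls if confidence >= min_confidence
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: (n, n / total) for name, n in counts.items()}

# Made-up calls: one abundant protein, one rare protein, one uncertain pad.
pad_calls = (
    [("protein_A", 0.99)] * 6
    + [("protein_B", 0.95)]
    + [("protein_A", 0.50)]
)
abundances = tally_abundances(pad_calls)
print(abundances)
```

Because every pad holds one molecule and is counted on its own, a rare protein still appears in the table with its own count instead of being averaged away by an abundant one.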
Of course, this requires our machine learning tools to build their predictions from known proteins. Toward this end, we are actively working to train our platform. We believe we’ll be able to accurately identify 95% of the human proteome soon.
Democratizing proteomics and fueling precision medicine
With our platform and simple readout, we will democratize proteomics. We aim to make it possible for researchers in biomedicine, regardless of their background, to use our technologies. In fact, we hope to make it possible for anyone who wants a proteome to get one. The platform will store researchers’ proteomic data and give them access to standardized analysis tools.
We expect this accessible proteomics platform will catalyze drug development. Using it, we hope to enable researchers to establish proteomes for any cells they wish. With these in hand, it will be easier to figure out exactly what’s gone awry in diseased cells and create drugs that target diseases at their molecular causes with few side effects.
We believe a wave of precision and personalized medicine is coming soon, fueled by a proteomics revolution. Individuals will feel the benefit of this proteomics revolution in the form of more effective therapies. They’ll think of proteins as drivers of health and not just items listed on cereal boxes.