The Human Proteome Project is an ambitious effort to find and characterize every single protein in the human body. It’s building on the template of the Human Genome Project, which sequenced the human genome and provided invaluable information to scientists and medical researchers. Doing the same thing for the human proteome, the collection of all proteins in the human body, could reveal how nearly all biological processes occur and enable the development of new biomarkers, drug targets, and more.
The Human Proteome Project (HPP), run by the Human Proteome Organization, is a coalition of researchers who dedicate much of their time and effort to expanding our knowledge of the human proteome. They regularly publish updated versions of the human proteome, adding new proteins every time. The most recent HPP release identifies 18,397 human proteins, or about 93% of the human proteome. About 1,400 proteins are predicted to still be missing from the database, meaning proteins that are encoded by genes but that haven’t yet been detected.
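As a quick sanity check, the figures quoted above can be related to one another with a few lines of arithmetic. This is only a sketch: it assumes the predicted proteome is simply the identified proteins plus the estimated missing ones, and the variable names are illustrative rather than taken from any HPP resource.

```python
# Figures quoted in the article
identified = 18_397        # proteins identified so far
missing_estimate = 1_400   # approximate count of predicted-but-undetected proteins

# Assumption: predicted proteome size = identified + missing
predicted_total = identified + missing_estimate
coverage = identified / predicted_total

print(f"Predicted total: {predicted_total:,}")
print(f"Coverage: {coverage:.1%}")
```

Under that assumption, the coverage works out to roughly 93%, in line with the figure reported for the project.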
Researchers within the Human Proteome Project are working to find these missing proteins, quantify them, and determine their functions. These proteomics efforts have moved at an impressive pace since the start of the Human Proteome Project and have been accelerated by steady advances in the capabilities of mass spectrometers and other traditional protein analysis technologies. The successes of the Human Proteome Project are incredibly admirable, but there is much work to be done to fully uncover the human proteome. While researchers have found most proteins somewhere in the body, we still need comprehensive measures of protein quantity and function across all cells of the body.
Thankfully, the work of the Human Proteome Project has highlighted how essential proteomics is, and the project has played an important role in spurring the development of next-generation proteomics technologies as part of the proteomics revolution. The project has revealed complex variations in protein levels, proteoforms, and protein-protein interactions across cells. We need technologies that make it possible to routinely probe these variations to understand the mysteries of the proteome.
Next-generation tools like the Nautilus™ Proteome Analysis Platform are designed to deliver a step-change in researchers’ ability to analyze the proteome. We believe our platform will put the audacious goals of the Human Proteome Project within reach and enable researchers to routinely measure the full proteome. Through platforms like ours, we hope researchers will garner a thorough understanding of the dynamic roles proteins play throughout the body and learn to apply that understanding for novel diagnostics, therapies, biotechnologies, and much more.
Why study the human proteome?
The completion of the Human Genome Project was a giant leap toward understanding how the human body works. Nonetheless, the Human Genome Project and the many genomic studies following it have shown that sequencing genomes alone does not reveal how we function. While our genes lay out the blueprints for our biology, genes aren’t what carry out tasks in our bodies. That’s left to the proteins genes encode. Proteins form key components of cellular membranes, build muscles, catalyze chemical reactions in our bodies, and far more. Understanding the makeup of the human proteome, and what functions proteins have, is a fundamental but still incomplete part of studying human biology and disease.
There are roughly 20,000 protein-coding genes in the genome. Those gene-encoded proteins can be modified in many ways, however, creating variants called proteoforms that can have unique functions. Estimates put the number of human proteoforms in the millions, and each may play a different role in human biology. The Human Proteome Project ultimately hopes to find all of them.
A better understanding of the human proteome and all the proteins it comprises could lead to important breakthroughs for human health. Researchers are already using proteomics, often in combination with genomics, transcriptomics, and other methods to study diseases like Alzheimer’s and gain new insights into various kinds of cancer.
For example, proteomics research in cancer is often focused on finding new protein biomarkers associated with specific types of cancer, which can help doctors identify cancers earlier, implement targeted treatments, and monitor whether those treatments are working. Finding and understanding new proteins and protein variants is key to this work. For instance, a 2021 paper in Cancer Cell linked changes in protein levels and phosphorylation status (a kind of post-translational modification) to different subtypes of head and neck squamous cell carcinoma (HNSCC). HNSCC accounts for about four percent of all U.S. cancer cases, and it comes in different types that can drastically change a prognosis. With new biomarkers for HNSCC subtypes, doctors can devise more effective treatment plans for patients, potentially saving lives.
Goals of the Human Proteome Project
The Human Proteome Project builds on earlier work by the Human Proteome Organization looking at different parts of the human proteome like the liver proteome and blood plasma proteome. The goals of the Human Proteome Project include:
- Identifying every protein made by the human body
  - Identifying proteins that have not been found but are predicted to be encoded by genes
  - Making conclusive identifications of proteins for which there is evidence, but which lack identification by gold-standard mass spectrometry proteomics
- Studying the structure and function of human proteins
  - Discovering how these proteins interact with each other
- Finding major variants and post-translational modifications for each protein encoded in the human genome
In 2010, scientists in the Human Proteome Organization launched a comprehensive project to identify the entire human proteome. They split the Human Proteome Project into two main groups:
- The Chromosome-Centric Human Proteome Project (C-HPP) will “identify at least one representative protein encoded by each protein-encoding gene,” as well as major protein variants for each, including splice variants and post-translational modifications. The project is broken into 25 teams: one for each of the 24 distinct human chromosomes (the 22 autosomes plus X and Y), and one focused on mitochondrial DNA.
- The Biology/Disease-driven Human Proteome Project (B/D-HPP) will focus on proteins most important to disease research, and, importantly, elucidate how those proteins function in the human body, and how they interact with other proteins.
Both sides of the Human Proteome Project rely heavily on mass spectrometry for their work, though in the future they could also use next-generation proteomics technologies that are more accessible and comprehensive.
The future of studying the human proteome
The Human Proteome Project has already led to exciting advances in proteomics and is laying a path for many more discoveries. One ongoing area of research seeks to understand not just the proteins encoded in the human genome, but also all of the variations of those proteins. The neXtProt database, an outgrowth of the Human Proteome Project that compiles information on the human proteome, currently lists over 192,000 human protein post-translational modifications and almost 10 million single amino acid variations. That list will likely grow substantially in the next few years.
The information contained in the Human Proteome Project will grow only more valuable as we learn more about the diversity and function of the human proteome. This data will yield information on disease mechanisms, point toward potential treatments, and bolster our understanding of human biology.
With next-generation proteomics tools like the Nautilus Proteome Analysis Platform, we’ll hopefully make great strides toward fully understanding the human proteome, and see each and every protein that makes us human.