Protein engineering: Lessons from the mother of invention

As a protein engineer, I cannot help but feel an occasional twinge of envy towards some of my colleagues who are working with RNA scaffolds. RNA scaffolds are like the biological equivalent of Lego. They fit together in a coordinated and intuitive way that allows you to build interesting things with them, albeit on a nanoscale level. As a kid, I loved Lego, and the idea of building biological machinery in RNA appeals to my inner nerd in the same way that Lego bricks did when I built cars, houses and machines with them. The pieces snap together in a logical way, they're modular and, if you have enough of them, what you can build would seem to be limited only by your imagination. Alas for the would-be protein engineer, proteins are another story altogether - messy, illogical and unpredictable by comparison. If RNA assembly can be likened to building houses with bricks, protein engineering is like building machinery with marshmallows.

None of this is to say that the actual 3-dimensional structures of RNA are not complex - they are. When considering oligonucleotide sequences, there is a tendency to think about them purely as 1-dimensional sequences that pair by complementarity. In reality, RNA in particular is capable of folding into complex 3-dimensional shapes, the forms of which can be as biologically relevant as their sequences. Fundamentally however, there is a convenient and largely intuitive logic to the molecular interactions that determine how RNA molecules fold and associate with other RNA molecules.

There is a similar kind of natural logic underlying protein structures, but right now at least, it is just too complex for us to understand globally. We understand a lot of the small details: hydrophobic side chains tend to be buried in the interior of the protein, away from exposure to the aqueous environment; hydrogen bond donors and acceptors pair up to form local secondary structures like alpha helices and beta sheets that can often stack against one another; and so on. And while these kinds of basic rules allow us to take a stab at predicting the final structure of a protein from its sequence, the combinatorial complexity of protein interactions and the vast state space of conformations potentially accessible to a polypeptide chain make the true design of anything but the simplest kinds of proteins extremely challenging.

If you need any further convincing of the complexity of protein folding, consider the Levinthal paradox. A polypeptide of 100 residues contains 99 peptide bonds, with 198 different phi and psi bond angles. Even if we assume that each of these bond angles has only three stable conformations (almost certainly an underestimate), this small protein can exist in any one of 3^198 (roughly 3 x 10^94) different conformations. The real problem here comes when you start to consider how even such a simple protein might fold following synthesis on the ribosome. Assuming that each of these potential conformations could be sampled by the polypeptide chain on a timescale of the order of nanoseconds or picoseconds, our simple hypothetical protein would require vastly longer than the known age of the universe to fold correctly if its folding depended upon a random walk through its conformational space.
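For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above (100 residues, three states per bond angle, one conformation sampled per picosecond). The figure of about 13.8 billion years for the age of the universe and the function name levinthal_estimate are my own additions for illustration, and the whole thing is an order-of-magnitude estimate rather than a model of how proteins actually fold.

```python
import math

# Numbers taken from the text above
RESIDUES = 100
STATES_PER_ANGLE = 3        # assumed stable conformations per bond angle
SAMPLING_TIME_S = 1e-12     # one conformation per picosecond (the fast end)

# Assumption for illustration: age of the universe of about 13.8 billion years
AGE_OF_UNIVERSE_S = 13.8e9 * 365.25 * 24 * 3600


def levinthal_estimate(residues, states_per_angle, sampling_time_s):
    """Return (bond angles, conformations, log10 of exhaustive search time in s)."""
    peptide_bonds = residues - 1
    angles = 2 * peptide_bonds                   # one phi and one psi per bond
    conformations = states_per_angle ** angles   # exact Python integer
    # Work in log10 so the result stays readable at this scale
    log10_seconds = math.log10(conformations) + math.log10(sampling_time_s)
    return angles, conformations, log10_seconds


angles, conformations, log10_seconds = levinthal_estimate(
    RESIDUES, STATES_PER_ANGLE, SAMPLING_TIME_S
)
# How many orders of magnitude longer than the age of the universe?
orders_beyond_universe = log10_seconds - math.log10(AGE_OF_UNIVERSE_S)

print(f"bond angles:          {angles}")
print(f"conformations:        about 10^{math.log10(conformations):.1f}")
print(f"search time:          about 10^{log10_seconds:.1f} seconds")
print(f"vs. age of universe:  about 10^{orders_beyond_universe:.1f} times longer")
```

Even at the optimistic picosecond rate, an exhaustive search comes out tens of orders of magnitude longer than the age of the universe, which is the whole point of the paradox: real proteins cannot be finding their folded state by a random walk.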

So what does all of this mean for the would-be protein engineer?

Real protein engineering does get done in spite of this vast complexity, but the process tends to be more one of pragmatic empiricism and emulation than of pure design in the traditional sense. In one project that I worked on for a major pharmaceutical company, for example, we were faced with the challenge of developing a biotherapeutic based upon a human protein that appears to have been designed by nature to be unstable in the bloodstream, readily unfolding and being rapidly degraded in order to naturally limit its half-life and distribution in the body. Indeed, due to the intense inflammatory activity of this protein, it seems designed to act locally in the body close to its site of release, rather than systemically where it could be extremely toxic.

Looking at the structure of this protein revealed the likely mechanism by which this unfolding and degradation occurs, and also pointed us to a region of the polypeptide chain that we might wish to re-engineer in order to prevent these processes from happening. And here's where the empirical approach kicked in - rather than completely redesigning this region of the protein from scratch, we looked to similar regions in other proteins that are known to fold more stably and to be less prone to proteolysis. Borrowing wholesale from nature, we crafted a new region of this protein based upon a consensus design drawn from a number of other proteins. As it turned out, this approach worked beautifully and our re-engineered protein exhibited all of the pharmacological properties that we had hoped for - a much longer serum half-life and resistance to proteolysis, not only in vivo, but also during the manufacturing process in which we had previously lost some of our protein due to proteolysis. This molecule, by the way, is now entering human clinical trials.

This kind of learning from nature is a common theme in the life sciences, and has already spawned new applications far beyond medicine and healthcare. Perhaps the best known of these, Velcro, was inspired by the microscopic hooks on burdock seeds that improve the seeds' distribution by enabling them to stick to animal fur. From new adhesives inspired by molluscs to non-stick surfaces inspired by the leaves of insectivorous plants, there are plenty of scientists who are willing to let nature be their mentor.

Protein engineers then, are in very good company when it comes to borrowing from nature.

Coming back to the original inspiration for this piece - bioengineering with RNA scaffolds - while it would be nice to be able to apply that kind of RNA-based, Lego brick approach in my own work designing biotherapeutics, there are several factors that count against it. The human immune system is extraordinarily vigilant towards naked oligonucleotides, which are typically a sign of a viral infection and are therefore innately immunogenic. For similar reasons, the human bloodstream is a very hostile environment for RNA, whose serum half-life under such conditions is typically very brief. Furthermore, with regard to using RNA molecules as active drugs, it is interesting to note that nature herself never really made the leap to the widespread use of RNA machinery in the cell, beyond its use in the ribosome and the relatively few catalytic RNAs that are known to exist in the cells of higher organisms. While it seems that life might possibly have started out being much more dependent upon RNA-catalyzed reactions, it is proteins that are now responsible for orchestrating the vast majority of the biochemical reactions that make life possible.

As a protein engineer, therefore, copying from nature rather than designing from scratch might be less flattering to the scientific ego, but you can certainly console yourself with the thought that you are, at the very least, learning from the very best.

 © The Digital Biologist | All Rights Reserved