It was a great pleasure to find out yesterday morning that Frances H. Arnold, professor of chemical engineering at Caltech, has won the Nobel Prize in Chemistry, together with two other protein engineers: George P. Smith, an emeritus professor of biological sciences at the University of Missouri, and Gregory P. Winter, a biochemist at the M.R.C. Laboratory of Molecular Biology in England. Doug Axe and Matti Leisola have already commented on this news.
Arnold’s lab at Caltech has engineered proteins useful to industry, medicine, and science, pioneering new techniques and learning a good deal about proteins, particularly enzymes (proteins that carry out chemical reactions), along the way. One of the lessons she learned early in her career is that we cannot predict what effect particular changes to an enzyme’s amino acid sequence will have on its structure and function. Scientists tried to shift an enzyme’s function or properties one way or the other by changing a few amino acids, or even whole segments of the enzyme, without much success. The process, known as rational design, was originally thought to have great promise, but has been replaced to a large extent by a different kind of engineering process known as directed evolution.
“Directed Evolution”: An Oxymoron
The two words, “directed” and “evolution,” are together an oxymoron, but they are coupled for a reason. First, scientists actively design the strategy for mutagenesis: the starting point, the kinds of mutagenesis and selection protocols to be used, and the means to identify improved versions. This is what “directed” refers to. The word “evolution” comes in because, rather than testing one designed mutation at a time, protein engineers use random mutagenesis to generate millions of enzyme variants. They throw a whole barrage of mutagenesis techniques at the enzyme they want to change: altering single bases or groups of bases, shuffling segments of DNA, or both. The resulting variants are then screened for the slightest increase in the desired activity or function, and the improved variants are put through the process again. With a sufficiently high-throughput screen, scientists can examine billions of variants.
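The mutate, screen, and repeat cycle described above can be sketched as a toy loop. This is only an illustration under loud assumptions: the `score` function stands in for a laboratory screen by counting matches to a hidden target sequence, and the names `mutate`, `score`, and `directed_evolution`, along with the mutation rate, library size, and number of rounds, are all hypothetical. A simple additive score like this one also does not capture mutations that help only in combination.

```python
import random

ALPHABET = "ACGT"

def mutate(seq, rng, rate=0.02):
    """Return a copy of seq in which each base is substituted with probability `rate`."""
    return "".join(
        rng.choice([b for b in ALPHABET if b != base]) if rng.random() < rate else base
        for base in seq
    )

def score(seq, target):
    """Hypothetical screen: number of positions that match a hidden target sequence."""
    return sum(a == b for a, b in zip(seq, target))

def directed_evolution(parent, target, rounds=20, library_size=500, keep=5, seed=0):
    """Run repeated rounds of random mutagenesis followed by screening."""
    rng = random.Random(seed)
    parents = [parent]
    history = [score(parent, target)]
    for _ in range(rounds):
        # Random mutagenesis: build a library of variants from the current parents.
        library = [mutate(rng.choice(parents), rng) for _ in range(library_size)]
        # Screening: rank variants (and the current parents) and keep the best few.
        pool = list(dict.fromkeys(library + parents))  # de-duplicate, keep order
        pool.sort(key=lambda s: score(s, target), reverse=True)
        parents = pool[:keep]
        history.append(score(parents[0], target))
    return parents[0], history

if __name__ == "__main__":
    rng = random.Random(1)
    target = "".join(rng.choice(ALPHABET) for _ in range(60))
    start = "".join(rng.choice(ALPHABET) for _ in range(60))
    best, history = directed_evolution(start, target)
    print(history)  # best score after each round
```

Because the current parents stay in the pool at every round, the best score never decreases. The "directed" part is everything the experimenter chose: the starting sequence, the mutation rate, the library size, and the screen itself.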
In effect, protein engineers are using the power of random change plus intelligent design to see what, if anything, will improve function. For enzymes that already have a low level of the desired activity, the process may be relatively simple. For completely new functions, the process may be much more involved. In any case, it is rare to find that a single base change is sufficient. Often multiple mutations are necessary: 7, 8, 10, or even 12 at a time. In some cases, certain mutations are necessary for others to have their effect, so that if they do not co-occur there is no benefit or, worse, there is a decrease in function.
One of the interesting outcomes of this method of engineering proteins is that effective mutations can be far apart, on the surface or buried deep, and in unexpected places. Only in hindsight can a rationale for the design be seen, and so the reason rational design fails as a strategy becomes apparent.
We have to remember that even a moderate-sized protein of 300 amino acids is encoded by 900 base pairs. To have a fair chance of hitting every single-base change, it is necessary to try tens of thousands of mutants. To check every two-base combination requires tens of millions of mutants. We cannot sample the far larger combinatorial space of all three- or four-base combinations, even with directed random mutagenesis and the best strategies, unless the number of mutants that can be rapidly tested is orders of magnitude higher.
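The arithmetic behind these figures can be checked with a short calculation. The sketch below assumes an idealized model that is not from the original article: each mutant carries exactly k base substitutions, and mutants are drawn uniformly at random, so the expected number needed to see every variant at least once follows the standard coupon-collector estimate. Real mutagenesis is biased, so treat the outputs as rough orders of magnitude. The function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant

def substitution_variants(n_bases: int, k: int) -> int:
    """Count sequences that differ from the parent at exactly k positions.

    Choose k of the n_bases positions, then one of 3 alternative bases at each.
    """
    return math.comb(n_bases, k) * 3 ** k

def expected_draws_to_cover(n_variants: int) -> float:
    """Coupon-collector estimate of uniform random draws needed to see every variant."""
    return n_variants * (math.log(n_variants) + EULER_GAMMA)

N_BASES = 900  # 300 amino acids x 3 bases per codon

for k in range(1, 5):
    n = substitution_variants(N_BASES, k)
    draws = expected_draws_to_cover(n)
    print(f"{k}-base changes: {n:.2e} variants, about {draws:.1e} random mutants to cover")
```

Under these assumptions there are 2,700 single-base variants and about 3.6 million two-base variants, which is consistent with needing tens of thousands and tens of millions of random mutants respectively. The three- and four-base cases run to billions and trillions of variants.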
Intelligently Designed Engineering
But there is a further problem. Because multiple mutations may be required together, Romero and Arnold acknowledge in a 2009 review, “Some functions…simply cannot be reached through a series of small uphill steps and instead require longer jumps that include mutations that would be neutral or even deleterious when made individually.” [1] You can wait a very long time for four or five specific mutations to come along that give the desired function. For a bacterial population such as the worldwide population of E. coli, the waiting time for four mutations can be 10^15 years. [2] (The universe is only about 10^10 years old.)
Lastly, as Dr. Axe pointed out in his comments, modified enzymes are poor, weak things compared to natural enzymes, even with the best of protein engineers’ efforts. We are forcing enzymes to perform in ways they were not designed to perform.
Dr. Arnold is to be congratulated. She and her lab have found a way to accomplish what nature cannot: the generation of enzymes with new capabilities or characteristics, by leapfrogging over the combinatorial barriers of time and chance, using a combination of brute-force random sampling of sequence space, intelligently designed experiments, ingenuity, and perseverance. This is not evolution. It’s intelligently designed engineering.
1. Romero PA, Arnold FH (2009) Exploring protein fitness landscapes by directed evolution. Nature Reviews Molecular Cell Biology 10:866-875.
2. Reeves MA, Gauger AK, Axe DD (2014) Enzyme families: Shared evolutionary history or shared design? A study of the GABA-aminotransferase family. BIO-Complexity 2014 (4):1-16.