ENCODE, Evolution, and the Percentage of the Genome That’s Functional

Casey Luskin


ENCODE researchers got together in Potomac, MD, from June 29-July 1 for the ENCODE 2015: Research Applications and Users Meeting. For those who don’t recall, ENCODE (Encyclopedia of DNA Elements) is a years-long research project involving a consortium of hundreds of international scientists studying functionality in noncoding DNA in the human genome. Apparently some of those researchers are backing away from ENCODE’s initial claims that 80 percent of the human genome is functional, and now claim that it’s more like 50 percent functional.

Of course they don’t have direct evidence for this claim, and it’s my sense that vocal ENCODE critics like University of Houston molecular evolutionary biologist Dan Graur probably shook up the ENCODE proponents. Graur and others vocally attacked ENCODE proponents in both the scientific literature and on blogs for their “hype” and claimed that only 8-20 percent of our genome is functional. Their reasoning followed from evolution-based considerations like the percent conservation. What would ENCODE proponents say now if it weren’t for Graur and his militant colleagues? It seems to me that they may be backpedaling on the numbers to avoid incurring further wrath from evolution-defenders.

Here’s what I wonder: Do any of them appreciate that even the 50 percent of our genome that they already believe is functional refutes unguided evolution according to Graur’s infamous quip, “If ENCODE is right, then Evolution is wrong”? So whether it’s 50 percent functionality or 80 percent functionality, that’s bad for unguided evolutionary models! Why? Because if those ENCODE people are right that about 50 percent of the genome is functional, that’s still a lot higher than Graur et al.’s percentage. It also seems that many are already prepared to reject at least some of Graur’s evolution-based arguments.

Of course these ENCODE proponents still hold to some form of an evolutionary model. Some are evidently content to live with ambiguity until evolutionary models are developed to explain ENCODE’s data. For example, lead ENCODE researcher John Stamatoyannopoulos admits that “new models of evolutionary conservation are needed” to explain why so much human DNA is functional.1 Similarly, in a Nature article titled “Celebrate the Unknowns,” Philip Ball writes about ENCODE:

“[T]he current picture of how and where evolution operates, and how this shapes genomes, is something of a mess. … But we are grown-up enough to be told about the doubts, debates and discussions that are leaving the putative ‘age of the genome’ with more questions than answers. Tidying up the story bowdlerizes the science and creates straw men for its detractors. Simplistic portrayals of evolution encourage equally simplistic demolitions.”2

Aside from Ball’s admission that ENCODE leaves evolutionary genomics in “a mess,” don’t miss his last two sentences. In referring to “detractors” who demolish “simplistic portrayals of evolution,” he has advocates of intelligent design in mind. ID proponents point out that ENCODE refutes evolutionary models that predicted a junk-filled genome. Now that those predictions have failed, the best way to save evolution from ENCODE is to disavow the old models by calling them “simplistic” or “straw men.”

But Ball is being honest when he acknowledges that ENCODE’s data wreak havoc upon old evolutionary models, and that evolutionists cannot, presently, explain those results. What he doesn’t say is that the models that predicted junky genomes were not “straw men” or fringe hypotheses. They were well-accepted proposals and direct inferences from evolution-based population genetics math. Philip Ball and others now tacitly or openly admit that new evolutionary models are needed, but won’t give up on the overall evolutionary viewpoint.

If ENCODE proponents are right that 50 percent of the genome is functional, do they realize that this means Graur is wrong — and, as Graur says, this means “evolution is wrong”? Someone should ask them.

In my view, almost all of the anti-ENCODE evolutionary arguments for junk DNA end up being circular. If unguided evolution is true then the arguments are valid. But if unguided evolution isn’t how our species arose, then the arguments fail:

  • If species were designed, then we would expect many nucleotide sequences NOT to be the result of mutation and selection. This could explain why large portions of the genomes are NOT conserved but ARE functional.
  • If species were designed, then the C-value paradox could be meaningless. Maybe the reason some organisms have large genomes is because a few species have genomes that have ballooned compared to their aboriginal design — but that doesn’t mean their DNA isn’t functional, and it doesn’t say anything about whether other genomes (including ours) are full of junk.
  • If species were intentionally designed with important and diverse functional genetic elements, but weren’t intended to go on forever, then investigating the “mutational load” won’t reveal how much of the genome is functional.
  • If “transposons” — i.e., repetitive DNA, which we’ve mistakenly identified as selfish junk DNA — are actually designed to be important functional control elements in the genome, then viewing them as parasitic elements which auto-proliferate through our DNA is simply a false assumption that’s blocking us from understanding what TEs are really doing.

My guess is that it will take specific studies of specific elements before some ENCODE proponents have the guts to stand up to critics and once again affirm a figure beyond 50 percent. That will require a lot of research and a lot of time. But as I said, even the 50 percent they do admit as being functional is itself FAR beyond what evolutionary biology predicted. For proponents of unguided evolutionary models, it seems, the game is already up.

I’ll have more to say on ENCODE next week. But for the moment, I should note that for my part, I think that the percentage of our genome that is functional is probably very high, even higher than 80 percent. The ENCODE proponents who backed down to 50 percent did so supposedly (peer pressure aside) because of mysterious repetitive DNA in the genome. But we already know that there are many possible functions for repetitive DNA. Indeed, numerous papers have shown function for non-coding “junk” DNA, including repetitive DNA, and the trendline of the research overwhelmingly shows that when we look for function in non-coding DNA, we find it.

ENCODE critics who say the genome is junky rely primarily on theory; ENCODE proponents who say the genome is functional rely primarily on data. I think it’s clear how this debate is going to end up. Stay tuned for more.

