Peer-Reviewed Paper Reviews Ten “Anomalies” that Contradict the Junk DNA Paradigm

Image credit: lisichik, via Pixabay.

As I noted last week, there’s a new paper in BioEssays by Australian molecular biologist John Mattick, who uses the language of historian of science Thomas Kuhn to predict that we are witnessing a “paradigm shift” away from the concept of junk DNA. After recounting how evolutionary thinking played a major role in establishing junk DNA’s prominence in biology, Mattick reviews various “anomalies” that have challenged this paradigm. He cites ten lines of evidence (“anomalies” in the junk DNA paradigm) that are incompatible with the idea that only protein-coding DNA matters in organisms.

Ten Failures of the Junk DNA Paradigm

  1. The “C-value paradox” is an observation that some types of organisms have unexpectedly high amounts of DNA. Some interpreted this as evidence that much of their genomes were junk, but others suspected that what it really showed is that genomes are more than just protein-coding DNA, and this non-protein-coding DNA could be there for a reason.
  2. In the late 1960s, thanks to the work of Nobel Prize-winning biologist Barbara McClintock, it was discovered that “animal and plant genomes harbour large and variable numbers of repetitive sequences,” often called transposable elements (TEs). These too were viewed as just a type of genetic garbage that accumulated over long periods of time. As Mattick puts it, repetitive elements were “interpreted as the non-functional remnants of parasitic (‘selfish’) transposons and retroviruses, despite McClintock’s demonstrations and protestations that they are ‘controlling elements’.” Mattick notes that it’s now known that TEs are vital parts of gene regulatory circuits, although he points out that “they are still commonly and erroneously invoked as indices of neutral evolution.” An article in Scientific American observes that the view of repetitive DNA as junk has actually held back our knowledge of its functions: “Although very catchy, the term ‘junk DNA’ repelled mainstream researchers from studying noncoding genetic material for many years. After all, who would like to dig through genomic garbage? Thankfully, though, there are some clochards who, at the risk of being ridiculed, explore unpopular territories. And it is because of them that in the early 1990s, the view of junk DNA, especially repetitive elements, began to change. In fact, more and more biologists now regard repetitive elements as genomic treasures. It appears that these transposable elements are not useless DNA. Instead, they interact with the surrounding genomic environment and increase the ability of the organism to evolve by serving as hot spots for genetic recombination and by providing new and important signals for regulating gene expression. … These and countless other examples demonstrate that repetitive elements are hardly ‘junk’ but rather are important, integral components of eukaryotic genomes.”
  3. Another unexpected discovery was that protein-coding sections of genes, called exons, are often broken up by non-coding sections called “introns.” Mattick observes that “Introns were dismissed as the leftovers of early evolution colonised by transposable elements, and proffered as another manifestation of and further evidence for ‘junk DNA’.” This isn’t the first time Mattick has sounded the alarm about dismissing introns. In a 2004 article in Scientific American, “The unseen genome: gems among the junk,” he was quoted as warning that “The failure to recognize” the importance of introns could go down as “one of the biggest mistakes in the history of molecular biology.” That was almost two decades ago. As Mattick’s new paper recounts, it’s now widely known that RNAs produced by introns are vital for splicing exons together to form different variants of proteins. They thus have a major impact upon which mRNAs are translated at the ribosome.
  4. Another crucial discovery is that much non-protein-coding DNA encodes enhancers which serve as transcription factor binding sites (necessary for transcribing protein-coding sections of DNA into mRNA) and also express long non-coding RNAs (lncRNAs). Mattick notes, “There are hundreds of thousands of enhancers in the human genome.”
  5. Mattick cites transvection as another failure of the junk DNA paradigm, because under this process a noncoding regulatory element for a particular allele can influence regulatory elements of other alleles, possibly via the production of lncRNAs. 
  6. Epigenetic processes are often controlled by RNAs produced by non-protein-coding DNA elements. Mattick notes that “transcriptional and post-transcriptional gene silencing” involves the production of small RNAs which can influence epigenetic tagging of genes (e.g., methylation) to turn genes “off.”
  7. The “g-value enigma” refers to the discovery that the number of protein-coding genes in an organism does not always correlate with the “developmental complexity” of the organism. This suggests that molecules other than proteins (i.e., RNAs produced by non-protein-coding DNA) might also be very important to organismal development.
  8. Pervasive transcription of plant and animal genomes, as discovered by ENCODE, is another important line of evidence that much DNA beyond just the protein-coding DNA is functional. Mattick cites “the unexpected finding that animal and plant genomes are pervasively transcribed to produce not only pre-mRNAs but also large numbers of long ‘noncoding’ RNAs (lncRNAs) derived intronically, ‘intergenically’, overlapping and antisense with respect to protein-coding genes.” He notes that these lncRNAs are frequently cell-type specific, and “play major roles in cell biology, developmental biology, brain function and diseases, including cancer.”
  9. Another major discovery is what Mattick calls the “epigenetic code,” which is found in DNA methylation and histone modification. This is relevant because lncRNAs, produced by non-protein-coding DNA, play a major role in the code.
  10. Another important process is “paramutation,” or “transgenerational epigenetic inheritance,” which is often driven by non-protein-coding RNAs. Mattick notes that this is closely associated with short tandem repeats (STRs), a type of repetitive non-protein-coding DNA element whose length can affect many important biological processes including “circadian rhythms, sociosexual interactions, intelligence, hormone sensitivity, cognition, personality, addiction, neuronal differentiation, brain development, and behavioural evolution.” 

A Paradigm in Crisis

Together, these lines of evidence show that the dominant view that only protein-coding DNA matters — and the rest is just genetic “junk” — is a paradigm in “crisis.” It can neither account for nor predict the widespread importance of non-protein-coding DNA. But Mattick rightly explains that even a paradigm that contradicts the evidence may persist unless a new and superior paradigm is offered in its place. In a post tomorrow I’ll discuss the new paradigm that Mattick envisions for molecular biology, junk DNA, and epigenetics.