Roundup of Functions for “Junk DNA” Supports the New RNA Gene Paradigm

Photo credit: THAVIS 3D via Unsplash.

In recent days I have covered a new paper published in BioEssays by Australian molecular biologist John Mattick, who cites the need for a “paradigm shift” in biology away from the concept of junk DNA (see here, here, and here). I recounted how this shift is being driven by the discovery of many “anomalies” in the junk DNA paradigm which show that non-protein-coding DNA is functional. And I explained that a new paradigm which recognizes the existence of “RNA genes” is starting to replace the old junk DNA paradigm. But the BioEssays paper is not the only place where Mattick has explained the importance of RNA genes. In a paper published earlier this year in Nature Reviews Molecular Cell Biology, “Long non-coding RNAs: definitions, functions, challenges and recommendations,” Mattick and a team of over 25 co-authors explain the rich functions of long non-coding RNAs, or “lncRNAs”:

Most lncRNAs evolve more rapidly than protein-coding sequences, are cell type specific and regulate many aspects of cell differentiation and development and other physiological processes. Many lncRNAs associate with chromatin-modifying complexes, are transcribed from enhancers and nucleate phase separation of nuclear condensates and domains, indicating an intimate link between lncRNA expression and the spatial control of gene expression during development. lncRNAs also have important roles in the cytoplasm and beyond, including in the regulation of translation, metabolism and signalling. lncRNAs often have a modular structure and are rich in repeats, which are increasingly being shown to be relevant to their function.

[…]

RNAs participate in virtually all levels of genome organization, cell structure and gene expression, through RNA–RNA, RNA–DNA and RNA–protein interactions, often involving repeat elements, including small interspersed nuclear elements in 3′ untranslated regions. These interactions are involved in the regulation of chromatin architecture and transcription (see later), splicing (especially by antisense lncRNAs), protein translation and localization, and other forms of RNA processing, editing, localization and stability.

Many lncRNAs are involved in the regulation of cell differentiation and development in animals and plants. They also have roles in physiological processes such as (in mammals) the p53-mediated response to DNA damage, V(D)J recombination and class switch recombination in immune cells, cytokine expression, endotoxic shock, inflammation and neuropathic pain, cholesterol biosynthesis and homeostasis, growth hormone and prolactin production, glucose metabolism, cellular signal transduction and transport pathways, synapse function and learning, and have roles in the response to various biotic and abiotic stresses in plants. There is also an emerging association of lncRNAs with the cell membrane and with ribozymes.

A Flawed View of Function

A common argument for non-functionality in “junk DNA” is that it is not conserved. It shows different sequences across different species — which supposedly means that it readily tolerates neutral mutations because there’s no function for mutations to damage and selection to preserve. This is an evolutionary argument, of course. It assumes that the reason a DNA sequence exists in the first place is that it was generated by mutation and selection. Thus, variability across species indicates a lack of purifying selection. But what if DNA did not arise through mutation and selection? Of course, Mattick et al. (2023) don’t challenge evolution, but they do highlight the fact that these “non-conserved” sequences can be highly functional — and may even play crucial roles in specifying the differences among species: 

Most lncRNAs are less conserved among species than the mRNA sequences encoding the proteome. Initially, most of the mammalian genome (which included most lncRNA loci) was thought to be evolving neutrally, using the yardstick of the rate of divergence of common ‘ancient repeats’ (derived from transposons) between the human and mouse genomes, on the assumption that these sequences are nonfunctional and representative of the original distribution in the ancestor. However, there is increasing evidence that transposable elements are widely co-opted as functional elements of gene expression and structure, forming promoters, regulatory networks, exons and splice junctions in protein-coding genes and lncRNAs, and therefore cannot be used as indices of neutral evolution.

Regulatory sequences, including promoters and lncRNAs, are known to evolve rapidly due to more relaxed structure–function constraints than protein-coding sequences and due to positive selection during adaptive radiation. Many lncRNAs are cell lineage specific. Indeed, given their association with developmental enhancers (see later), variation in the complement and sequences of lncRNAs may be a major factor in species diversity.

Don’t miss the last line there: “variation in the complement and sequences of lncRNAs may be a major factor in species diversity.” They are saying that differences between non-coding DNA across species don’t mean that this DNA is “junk.” Quite the contrary, these differences indicate that the “junk” may be responsible for encoding the differences between species. In other words, the junk DNA paradigm may have caused us to miss the precise DNA that helps make a species unique.

A Proliferation of Recently Discovered Functions

Obviously, the functions they cite are very general. But multiple papers published over the past few years have found many specific functions for non-coding “junk” DNA.

Exactly how many functional lncRNAs are known? A 2021 article in Nature by Gates et al. noted that over 130,000 functional “genomic elements, previously called junk DNA” have now been discovered. Mattick et al. review evidence that affirms this number: 

Well in excess of 100,000 human lncRNAs have been recorded, many of which are specific to the primate lineage. This is a vastly incomplete list due to the limited analysis of different cells at different developmental stages (see later). There are now hundreds of thousands of catalogued lncRNAs and dozens of databases (and databases of databases) with curated information. Over the past decade, there have been ~50,000 publications with ‘long non-coding RNA’ as a key term and more than 2,000 publications reporting validated lncRNA functions, although most have yet to be followed up in any detail.

To appreciate the pace at which functional non-protein-coding genetic elements are being discovered, be sure to look at this figure from Gates et al. (2021). The top graph shows an orange curve which depicts the number of known functional non-protein-coding DNA elements. As you can see, the curve is going up at what looks like an exponential rate. 

Biologists Skeptical of Function

How have biologists typically responded to this evidence of mass functionality for non-coding DNA? Mattick et al. (2023) explain they were typically skeptical and maintained their commitment to the junk paradigm:

[T]he common initial reaction of the molecular biology community was to suspect that these unusual RNAs are transcriptional noise, because of their generally low levels of sequence conservation, low levels of expression and low visibility in genetic screens. Since then, however, there has been an explosion in the number of publications reporting the dynamic expression and biological functions of lncRNAs, aided by extensive technology development that has enabled their identification and characterization, although only a minority of lncRNAs have confident annotations and very few have mechanistic information.

The idea of “transcriptional noise” is simply the hope that, although most of our DNA is transcribed into RNA, this RNA is itself junk. So we’ve simply shifted the argument from junk DNA to junk RNA. But the common arguments for this are weak. As we saw, low sequence conservation doesn’t necessarily indicate a lack of functionality. On the contrary, non-conserved sequences could be encoding precisely what makes species different. Mattick et al. cite a variety of other lines of evidence suggesting that these non-protein-coding RNAs are functional:

Loci expressing lncRNAs exhibit many of the characteristics of protein-coding genes, including promoters, multiple exons, alternative splicing, characteristic chromatin signatures, regulation by morphogens and conventional transcription factors, altered expression in cancer and other diseases, and a range of half-lives similar to those of mRNAs.

The restricted expression of lncRNAs in different cells at different stages of development and their generally low copy number (owing to their regulatory nature) accounts for their sparse representation in bulk-tissue RNA sequencing datasets, whereas many lncRNAs are relatively easy to detect in particular cells.

In other words, it can be difficult to detect the functions of these non-protein-coding RNAs because they may only be active or functional in particular cell types and only at particular stages of the life cycle. But their transcription is hardly random. Indeed, it seems carefully controlled and orchestrated, similar to how protein-coding DNA is regulated. The evolution-driven junk DNA paradigm prevented us from recognizing the importance of these RNA genes. Thankfully, however, more and more biologists like John Mattick are willing to follow the evidence where it leads. To my knowledge Mattick is not a proponent of intelligent design. But given the way that evolutionary scientists often oppose those who challenge reigning paradigms, he is to be commended for his courage.