
Disease-Associated “Junk” DNA Is Evidence of Function

Recently I did an interview with a YouTube show called “Current Topics in Science” on the topic of junk DNA. In prepping for the interview, I read about recent studies that have uncovered new functions for junk DNA. I’ll discuss those and more in just a moment, but this raises the question: How do we determine when a segment of non-coding “junk” DNA has function? There are many possible ways, but one potential method is to find genetic diseases that are associated with particular mutations in non-coding sections of DNA. Normally these non-protein-coding sections of DNA perform some important function, but when mutations occur in that DNA and disrupt the function, some disease or illness occurs. 

Sounds simple enough, right? Well, I have heard it argued that finding that junk DNA is related to the cause of some disease is not evidence that it is functional. But this argument is easily shown to be misguided. Junk DNA is often like stagehands at a play or the technical crew at a concert: When they’re working properly and doing their jobs, you don’t even think about the fact that they are there. In fact, there’s a decent chance you won’t notice them until they are screwing up. 

How Might This Apply in Biology?

One of the most common functions of non-coding / junk DNA is to regulate gene expression. When a given non-coding / junk element is working properly, it regulates the gene it’s supposed to be regulating; it does its job, and nobody notices it. But what happens when a mutation arises in that junk DNA element? Well, one common result is that it will affect expression of the gene, and the gene will no longer be regulated in the manner that it’s supposed to be regulated. This can then lead to a disease. A disease associated with a junk DNA element is therefore evidence of a useful function gone wrong, not of the absence of function. Let’s consider a few examples.

Junk DNA Associated with Gut Diseases and Cancer

One study I discussed on the podcast was reported by a recent article from the BBC, “The ‘gene deserts’ unravelling the mysteries of disease.” The article reports that “Mutations in these regions of so-called ‘junk’ DNA are increasingly being linked to a range of diseases, from Crohn’s to cancer.” It explains how a non-coding DNA region called chr21q22 is associated with disease — but that’s only because it contains an important gene enhancer that is sometimes deleteriously affected when mutations arise in this “junk” region:

Recently, Lee and colleagues at the Crick Institute published a new investigation into a particular gene desert known as chr21q22. Geneticists have known about this gene desert for more than a decade, because it is associated with at least five different inflammatory diseases from inflammatory bowel disease (IBD) to a form of spinal arthritis known as ankylosing spondylitis. Yet deciphering its function has always proven elusive.

However, for the first time, the Crick scientists were able to show that chr21q22 contains an enhancer, a segment of DNA which can regulate nearby or distant genes, capable of cranking up the amount of proteins they make. Lee refers to this behaviour as “a volume dial”. Delving deeper, they found that this enhancer is only active in white blood cells called macrophages where it can ramp up the activity of a previously little-known gene called ETS2.

While macrophages play a vital role in clearing dead cells or fighting off harmful micro-organisms, when the body produces too many they can wreak havoc in inflammatory or autoimmune diseases, flooding into affected tissues and secreting damaging chemicals which attack them. The new study demonstrated that when ETS2 is boosted in macrophages, it heightens virtually all their inflammatory functions.

An Enhancer Function

This junk DNA element contains an enhancer that helps regulate the gene ETS2, but mutations in that region deleteriously affect gene regulation, potentially causing various diseases. The technical paper in Nature notes that deletion of this enhancer directly affected expression of the gene ETS2:

Deletion of the chr21q22 enhancer did not affect BRWD1 or PSMG1 expression, but the upregulation of ETS2 was profoundly reduced, confirming that this pleiotropic locus contains a distal ETS2 enhancer.

The BBC article further suggests that another “gene desert” (i.e., non-coding or junk DNA region) might have connections to various forms of cancer:

Scientists also predict that studying gene deserts will yield vital information which will help to improve our understanding of the various pathways involved in tumour development.

As an example, cancer researchers have pinpointed a gene desert called 8q24.21 which is known to contribute to cervical cancer as the human papilloma virus, the main cause of the disease, embeds itself in this part of the genome. In doing so, the virus enhances a gene called Myc which is a well-known driver of cancer. Studies are suggesting that the connection between 8q24.21 and Myc may also play a role in a number of ovarian, breast, prostate and colorectal cancers.

Richard Houlston, of the Institute of Cancer Research in London, says that various genetic variants which have been identified as contributing to the heritable risk of many common cancers have been found in gene deserts. Knowledge of these target genes will provide opportunities for drug discovery as well as for cancer prevention.

Another example cited in the article notes that a “gene desert” normally has a gene regulatory function, but mutations in that region caused genetic disease:

Other research centres around the world such as the University of Basel in Switzerland are also examining how single inherited mutations in gene deserts could lead to some rare genetic diseases. Three years ago, Basel scientists discovered how one of these mutations could lead to babies being born with limb malformation due to its regulatory effects on a nearby gene.

In each of these cases the story is basically the same: Junk DNA regulates gene expression, and mutations in that junk change gene expression in a deleterious manner, causing disease.

Mental Illness and Gut Diseases

Many similar examples can be given. According to a 2023 study from Stanford University, “junk DNA” regulates expression of genes that have known associations with autism and schizophrenia. The article notes:

While the coding genes provide blueprints for building proteins, which direct most of the body’s functions, some of the noncoding sections of the genome, including regions previously dismissed as “junk,” seem to turn up or down the expression of those genes.

But when expression goes wrong, diseases occur, “including autism, schizophrenia, cancer and Crohn’s disease.” A study from King’s College London comes to similar conclusions. A press release about the research starts by noting:

About eight percent of our genome is made up of sequences called Human Endogenous Retroviruses (HERVs), which are products of ancient viral infections that occurred hundreds of thousands of years ago. Until recently, it was assumed that these ‘fossil viruses’ were simply junk DNA, with no important function in the body. However, due to advances in genomics research, scientists have now discovered where in our DNA these fossil viruses are located, enabling us to better understand when they are expressed and what functions they may have.

A senior author on the study is quoted as saying:

Our results suggest that these viral sequences probably play a more important role in the human brain than originally thought, with specific HERV expression profiles being associated with an increased susceptibility for some psychiatric disorders.

A headline in The Telegraph about the research says, “Mental illnesses linked to ‘junk DNA’ embedded with viruses inherited from our ancestors.” As the article reports, changes in expression of the HERV affected brain function:

The study analysed data from large genetic studies involving tens of thousands of people — both with and without mental health conditions — as well as information from autopsy brain samples from 800 individuals.

Researchers found that in people who were genetically susceptible to psychiatric disorders, parts of the ancient virus DNA were being either ramped up or dialled down which may affect brain function.

In other words, it looks like these HERVs have some kind of function in the brain, but when their normal expression is disrupted, psychiatric diseases result. This model is spelled out in the first paragraph of the technical paper:

Psychiatric disorders such as schizophrenia, bipolar disorder, major depressive disorder, attention deficit hyperactivity disorder, and autism spectrum conditions have a substantial genetic component. Genome-wide association studies (GWAS) have highlighted a polygenic architecture underlying susceptibility to these conditions, meaning that many loci across the genome incrementally contribute to risk. As associated variants are mostly non-coding and therefore assumed to impact the regulation of local genes, transcriptome-wide association studies (TWAS) were developed to aid the identification of gene expression signatures associated with susceptibility.

However, the paper notes that studies have “largely overlooked the expression of repetitive elements like human endogenous retroviruses (HERVs), in relation to susceptibility,” as is often the case with junk. Yet the paper reports these HERVs are known to have a variety of gene regulatory functions:

[T]hey have been hypothesised to regulate neighbouring genes, as most HERV sequences comprise of solitary viral promoters known as long terminal repeats (LTRs). However, many sequences additionally contain remnants of viral genes (e.g., gag, pol, env) that may encode additional biological functions, other than just regulating gene expression locally. For example, HERVs from the families W and FRD encoding env play a fundamental role in cellular fusion during the formation of the placenta and are now annotated as the syncytin-1 and syncytin-2 genes, respectively.

Their study also found evidence that HERVs are involved in “co-expression networks linking the expression of canonical genes with HERVs”:

[W]e use a TWAS approach that considers neurological HERV expression estimated to precise genomic locations, to identify expression signatures associated with psychiatric conditions, while circumventing the limitations more prevalent in traditional case-control studies. Due to the inclusion of global HERV expression, or the ‘retrotranscriptome’, in this analysis, we call this approach a ‘retrotranscriptome-wide association study’ (rTWAS). We identify extensive HERV expression and regulation in the adult cortex, including in association with genetic risk for psychiatric disorders. We also detect co-expression networks linking the expression of canonical genes with HERVs, allowing us to broadly infer the function some specific HERVs may play in neurobiology. This work provides a rationale for exploring neurological HERV expression in complex neuropsychiatric traits.

Closely Associated with Schizophrenia

They found that various expression signatures of these HERVs were closely associated with schizophrenia. In other words, the HERVs investigated in this study likely have gene regulatory functions, but when those functions go awry, genetic diseases result. The paper concludes: 

[O]ur main analysis found that 1238 HERVs expressed in the brain are regulated in cis, some of which in association with risk for complex psychiatric traits … The existence of canonical transcripts containing unique HERV sequences that confer increased susceptibility to a psychiatric disorder, however, highlights the importance HERVs played in the diversification and evolution of gene expression in the human genome, as well as their contribution to susceptibility to complex disorders.

A similar result was reported in a recent paper in Nature Genetics which found that LINE-1 (L1) elements “contact and activate genes essential for zygotic genome activation (ZGA)” and that “L1 knockdown impairs ZGA, leading to developmental arrest in mouse embryos.”

In other words, once again, the “junk DNA” is functional. That is reflected by the fact that when you mess with the “junk,” the result is problems. That doesn’t sound like junk DNA to me.