The failure to recognize the importance of introns “may well go down as one of the biggest mistakes in the history of molecular biology.” –John Mattick, Molecular biologist, University of Queensland, quoted in Scientific American
On Friday, May 14, I watched as Steve Meyer faced his critics — two of them anyway, Art Hunt and Steve Matheson — at Biola University in Los Angeles. Matheson had previously claimed that Meyer misrepresented introns in his book, Signature in the Cell. (Introns are non-protein-coding sequences of DNA that occur within protein-coding regions.) In a blog post dated February 14, Matheson had accused Meyer of “some combination of ignorance, sloth, and duplicity” for stating in his book that although introns do not encode proteins they nevertheless “play many important functional roles in the cell.”
Calling Meyer’s statement “ludicrous,” Matheson wrote on his blog that biologists have identified functional roles for only “a handful” of the 190,000 or so introns in the human genome:
How many? Oh, probably a dozen, but let’s be really generous. Let’s say that a hundred introns in the human genome are known to have “important functional roles.” Oh fine, let’s make it a thousand. Well, guys, that leaves at least 189,000 introns without function.
Matheson added that “there are more layers of duplicity in the ‘junk DNA’ fairy tale than Meyer has included in his book,” which (Matheson concluded) uses science to advance an agenda in which “rigorous scientific truth-telling is secondary.”
Naturally, I expected Matheson to bring up this devastating criticism at the Biola event on May 14. But he said nothing about Meyer’s “ludicrous” notions of intron functions that evening, and he was mum about all the other layers of duplicity that he claims to be privy to. This was probably wise, because Matheson is wrong about intron functionality.
The segments of our DNA that are commonly called “genes” consist of protein-coding exons and non-protein-coding introns. Initially, the entire DNA segment is transcribed into RNA, but between ninety and ninety-five percent of the initial RNAs are “alternatively spliced.”
What is alternative splicing? Imagine that the initial RNA derived from its DNA template has the organization A–B–C–D–E–F, where the letters represent blocks that specify amino acid sequences and the dashes in between the letters stand for introns. Alternative splicing enables multiple proteins to be constructed from the same RNA precursor, say, ABCDF, ACDEF, BCDEF, and so forth. In this way, hundreds or thousands of proteins can be derived from a single gene.
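To make the combinatorics concrete, here is a minimal Python sketch of the toy model above. It assumes the simplest case only, in which exons are skipped or kept but never reordered, and it counts every possible combination even though a real cell would not necessarily produce all of them. The function name `exon_skipping_isoforms` is my own invention for illustration.

```python
from itertools import combinations

def exon_skipping_isoforms(exons, min_exons=1):
    """Return every order-preserving subset of the exons as a string.

    Toy assumption: any exon may be skipped, and the exons that remain
    keep their original order. Real splicing is far more constrained.
    """
    isoforms = []
    for k in range(min_exons, len(exons) + 1):
        # combinations() preserves the input order, so "ACDEF" can
        # appear but a shuffled string such as "FEDCA" cannot.
        for combo in combinations(exons, k):
            isoforms.append("".join(combo))
    return isoforms

exons = "ABCDEF"  # the six blocks from the example above
isoforms = exon_skipping_isoforms(exons)

print(len(isoforms))          # number of conceivable messenger RNAs
print("ABCDF" in isoforms)    # one of the versions named above
print("ACDEF" in isoforms)    # another one
```

Six blocks already allow 2^6 - 1 = 63 conceivable combinations in this toy model, and the number doubles with each added block, which is how a single gene with many exons can in principle reach hundreds or thousands of variants.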
There’s more. The messenger RNAs that are produced by this process — and therefore the proteins that are made in a cell — are generated in a way that depends on the stage of development as well as the cell and tissue type. In the above example, a nerve cell may express the ACDEF version of a messenger RNA whereas a pancreatic cell may produce only the BCDE version. The differences are biologically essential.
What does this have to do with introns? Everything. It is the presence of introns that makes this permutative expansion of messenger RNAs possible in the first place.
So let’s do the math. At least ninety percent of gene transcripts undergo alternative splicing, and there are at least 190,000 introns in the human genome. That means we have at least 0.90 × 190,000 = 171,000 introns that participate in the alternative-splicing pathway(s) available to a cell.
Someone could argue that the sequences directing alternative splicing are in the protein-coding regions of the RNA. That is, one could argue that while introns do indeed make splicing possible, they are merely “junk” fillers, with the exons indicating where and how the spliceosome is to do its cutting and pasting. Yet such an argument would be false. In order for alternative splicing to work properly, it is necessary not only that exons be demarcated from introns, but also that the splicing process be correctly modulated. And introns contribute significantly to such modulation.
How do I know this? Take a look at this figure:
This is the tabulated distribution of alternative splicing code motifs within three generalized exons (the white, purple, and orange blocks on the top) and two introns (the thin broken, black/light blue and dark blue/olive brown lines). This figure was taken from a 2010 article in Nature about alternative-splicing regulation in mice. At the top of each column, four letters indicate the tissues from which the data were derived: C = central nervous system; M = muscle; E = embryo; and D = digestive system. It is clear that introns are just as rich in splicing-factor recognition sites as are exons. The authors of the article — titled “Deciphering the splicing code” — conclude that the evidence “predicts regulatory elements that are deeper into introns than previously appreciated.”
Using the mouse as a surrogate, we can infer that the roughly 171,000 human introns involved in alternative splicing probably have a similar distribution of formatting codes, which are necessary to ensure that the proper proteins are made at the correct developmental stage and in the appropriate cells and tissues. Even if we were off by a factor of two, we would still be left with 85,500 introns that function in the process of alternative splicing.
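For anyone who wants to check the back-of-the-envelope arithmetic, here it is as a short Python sketch. The variable names are mine, and the calculation rests on a simplifying assumption: that the ninety-percent figure for gene transcripts carries over evenly to the 190,000 introns.

```python
# Figures quoted in the text.
TOTAL_INTRONS = 190_000        # approximate introns in the human genome
ALT_SPLICED_FRACTION = 0.90    # lower bound on alternatively spliced transcripts

# Assumption: the transcript-level fraction applies uniformly to introns.
spliced_introns = round(TOTAL_INTRONS * ALT_SPLICED_FRACTION)

# "Even if we were off by a factor of two."
conservative_estimate = spliced_introns // 2

print(f"{spliced_introns:,}")        # 171,000
print(f"{conservative_estimate:,}")  # 85,500
```

Changing either constant shows how sensitive the estimate is to the inputs. Even a fraction of 0.01 would leave 1,900 introns, nearly twice Matheson's most generous figure.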
This is not the only evidence that Matheson ignored. For example, non-translated microRNAs regulate the developmental expression of messenger RNAs, and small nucleolar RNAs are essential for the processing of ribosomal RNAs (which in turn are essential for protein production). The human genome contains 1,664 known genes for the former and 717 known genes for the latter, and the majority of these genes occur in introns.
Then there are the regulatory codes associated with such RNA genes, which also occur in introns. There are also RNAs that emanate from introns but are not part of messenger RNAs; 78,147 of them are known to exist in humans. Even if only ten percent of the latter RNAs play some role in cellular organization, we have far more than “a handful” of functional introns in this category alone.
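The ten-percent scenario can be checked the same way. This is only a sketch: the ten-percent fraction is a hypothetical, the variable names are mine, and the comparison assumes each of these RNAs corresponds to a distinct intron, which may overcount.

```python
INTRONIC_RNAS = 78_147              # intron-derived RNAs known in humans
ASSUMED_FUNCTIONAL_FRACTION = 0.10  # hypothetical: only one in ten does anything
MATHESON_GENEROUS_CEILING = 1_000   # "Oh fine, let's make it a thousand."

functional_rnas = round(INTRONIC_RNAS * ASSUMED_FUNCTIONAL_FRACTION)

print(f"{functional_rnas:,}")                       # 7,815
print(functional_rnas > MATHESON_GENEROUS_CEILING)  # True
```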
And there’s still more. RNA is essential for chromatin organization in the nucleus. When chromatin-associated RNA is degraded by experimental means, the geometry of chromosomes and nuclear metabolism are adversely affected. Yet a recent study of this class of RNAs in human cells revealed that over half of the transcripts (52%) are derived from… introns!
I could go on. Various DNA control modules have been mapped to introns, including alternative promoters, enhancers, silencers, and nuclear matrix attachment sites, some of which influence genes that are located over a million base pairs away on the chromosome. But sorting through all the studies that have been published on this subject would be a big job.
A job, obviously, that Matheson has not done — though whether through ignorance, sloth, or duplicity I cannot say.