Does Barcoding DNA Reveal a Single Human Ancestral Pair?
With its publication in the journal Human Evolution a few weeks ago, a scientific paper lit up the science blogosphere. The paper asks, “Why should mitochondria define species?” Why has it caused such debate among scientists and lay people interested in origins questions?
The paper does not seem at first glance to be particularly earthshaking. But the quake hits at the end of the abstract, where authors M.Y. Stoeckle and D.S. Thaler say:
Several convergent lines of evidence show that mitochondrial diversity in modern humans follows from sequence uniformity followed by the accumulation of largely neutral diversity during a population expansion that began approximately 100,000 years ago. A straightforward hypothesis is that the extant populations of almost all animal species have arrived at a similar result consequent to a similar process of expansion from mitochondrial uniformity within the last one to several hundred thousand years.
Here’s the Cliffs Notes version: According to the authors, all our mitochondria came from a very small population about 100,000 to 200,000 years ago, perhaps as small as two individuals, though later in the paper they qualify that number. According to Stoeckle and Thaler, the same timeframe holds for 90 percent of animal species. No wonder so many people in the theistic evolution/creation dispute got irritated or excited. Theistic evolutionists saw it as an occasion for fanning the flames of anti-evolutionary sentiment. Young earth creationists saw it as evidence for the ark.
The study has been headlined by sensationalist news outlets. It has been seized upon by creationists, and it has roused the curiosity of ordinary people.
The Daily Mail announced:
All humans may be descended from just TWO people and a catastrophic event almost wiped out ALL species 100,000 years ago, study suggests.
That’s pretty inflammatory. Michael Marshall at Forbes was more skeptical:
No, Humans Are Probably Not All Descended From A Single Couple Who Lived 200,000 Years Ago
BioLogos and Peaceful Science published pieces, with Peaceful Science more disparaging. Interestingly, the evolutionist blogs I visited had not commented.
What to Make of It?
Well, I am unwilling to dismiss the article flat out, but neither can I endorse it. I don’t think the study can support everything it claims with the evidence it presents. That I take this approach is ironic. I myself am investigating the possibility of our origin from a single human pair, so my hesitation does not come from excluding the idea a priori. Yet I must confess I have reservations. Too many questions remain unanswered.
I am also sympathetic because I have seen tactics used on Stoeckle and Thaler similar to those that have been used on ID proponents. Denigrating the journal the article was published in, and on that basis declaring the work junk, is a poor argument, because controversial papers may never see the light of day except in non-conformist journals. Saying the authors are ignorant or, worse, dishonest, without first examining the work on its own terms, is simply unfair, and ad hominem.
A furor erupted in 1987 with the publication in the journal Nature of a paper by Rebecca Louise Cann, Mark Stoneking, and Allan Charles Wilson called “Mitochondrial DNA and Human Evolution.” The authors described tracing back mitochondrial lineages to a single source, a woman “postulated to have lived about 200,000 years ago, probably in Africa.” The press took the idea and ran with it, calling the paper evidence for Eve, or mitochondrial Eve as she was dubbed. Not long after, studies done with Y chromosomes came up with estimates for the origin of Y chromosomal lineages in “Y-chromosome Adam.” Original estimates traced the Y chromosome lineage back about 90,000 years.
Several things to note:
It is notoriously difficult to date either mitochondrial genes or Y-chromosome genes. They each have different mutation rates from the rest of the genome, and they are both passed on only through one sex (female or male respectively). And when studying multiple species, the population size and generation times can influence dates. As an indication of the difficulties, the date estimates for both mitochondrial Eve and Y-chromosome Adam have varied considerably over the years.
Another major point that must not be ignored is how genetic lineages behave. If you follow lineages backward in time, they tend to coalesce. Tracing a lineage back to one individual does not mean that individual was the only person alive at the time, only that the genetic lineage coalesced, that it can’t be traced back further.
Let’s look at human lineages. When you draw a family tree, you will quickly notice how many people never produce any progeny. They die young or they never marry. Specifically, because only women pass on mitochondria (or nearly always), if a woman has no daughters, her mitochondrial lineage ceases.
Look at this figure. It traces the lineages of women going back eight generations. At the top is the ancestral generation with eleven individuals. I have colored the arrows for one woman’s lineage red. In the final generation, only her lineage persists. Working from the present backwards, you can follow the pattern as it coalesces into one lineage, that one woman’s on the first line. Note also that mitochondria, because they are passed from mother to daughter, never have an admixture of the paternal and maternal DNA. This speeds up the effects of beneficial gene sweeps.
Now, real lineages usually don’t coalesce so quickly, unless something extraordinary happened in the past. This is only a cartoon. But lineages do coalesce, always, due to stochastic processes, and for humans mitochondrial coalescence happens in 100,000-200,000 years, the same time frame that Stoeckle and Thaler propose.
Was that “original” woman in my illustration the only woman alive in her generation? In my illustration, no. Were there generations before her? We can’t tell. I didn’t include them. There could have been many. Thus, although all mitochondria trace back to one mitochondrial genome, one woman, about 200,000 years ago, that does not mean that woman was the only one alive at the time, or the first woman alive. We would need other sources of information to determine those things. For mitochondrial Eve, all we have is the coalescence of mitochondrial lineages to one. We can’t tell, by this method, based solely on this evidence, whether she was the only woman or the first woman or neither.
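The cartoon above can also be played out as a small simulation. What follows is only a sketch under idealized assumptions that are mine, not the paper's: a constant number of women in every generation, each choosing her mother at random from the generation before. The function name and the population size of 50 are hypothetical choices for illustration.

```python
import random

def generations_to_coalescence(num_women, rng):
    """Trace maternal lineages backward until all share one ancestor.

    Assumption: every generation has `num_women` women, and each woman's
    mother is drawn uniformly at random from the previous generation.
    This is an idealized toy model, not a model of any real population.
    """
    lineages = set(range(num_women))  # distinct maternal ancestors so far
    generations = 0
    while len(lineages) > 1:
        # Each remaining lineage picks a mother; lineages that pick the
        # same mother merge into one.
        lineages = {rng.randrange(num_women) for _ in lineages}
        generations += 1
    return generations

rng = random.Random(1)  # fixed seed so the run is repeatable
times = [generations_to_coalescence(50, rng) for _ in range(200)]
mean_time = sum(times) / len(times)
print(f"mean generations back to a single maternal ancestor: {mean_time:.1f}")
```

Note what the sketch does and does not show. Every run ends in a single maternal ancestor, yet by construction that ancestor lived in a generation of 50 women. Coalescence to one lineage happens without the population ever shrinking to one.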
Now to return to Stoeckle and Thaler’s paper. Their paper has some of these same difficulties, since they are working with mitochondrial DNA. In fact, they used a technique called barcoding. A small stretch of DNA from a particular mitochondrial gene is sequenced from many individuals in many species — over five million by now. Using this technique it has been shown that the sequence variation between individuals of a species is small and clustered, whereas the sequence variation is distinct and separate between species, even closely related species. This pattern of distinctive sequence variation is what allows species identification.
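To make the pattern concrete, here is a toy sketch of the comparison barcoding relies on. The sequences below are invented for illustration (real barcodes run to several hundred bases), and the species names and helper function are hypothetical. The measure used is the simple fraction of differing sites, which may not be the measure the authors use.

```python
from itertools import combinations

def p_distance(a, b):
    """Fraction of aligned sites at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Hypothetical 20-base "barcodes", invented for this example.
barcodes = {
    "species_A": ["ACGTACGTACGTACGTACGT",
                  "ACGTACGTACGTACGTACGA",
                  "ACGTACGTACGTACGTACGT"],
    "species_B": ["ACGTTCGAACGTTCGTACCT",
                  "ACGTTCGAACGTTCGTACCT",
                  "ACGTTCGAACGTTCGTACCA"],
}

# Distances between individuals of the same species.
within = [p_distance(a, b)
          for seqs in barcodes.values()
          for a, b in combinations(seqs, 2)]

# Distances between individuals of different species.
between = [p_distance(a, b)
           for seqs1, seqs2 in combinations(barcodes.values(), 2)
           for a in seqs1 for b in seqs2]

max_within = max(within)
min_between = min(between)
print(f"largest within-species distance:   {max_within:.2f}")
print(f"smallest between-species distance: {min_between:.2f}")
```

When the largest within-species distance is smaller than the smallest between-species distance, as in this contrived data, a sequence can be assigned to its species by its nearest neighbors. That gap is what makes identification possible.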
Ironically, the largest part of the paper has received no attention from the media. It deals with the reliability of barcodes to detect species, and then with the species concept itself, not the controversial dates. The authors come down squarely on the side of species as recognizable, distinct biological groupings, not smearing together as Darwin might suggest, but having distinctive sequences that 90 percent of the time can be reliably grouped. This view of species as discrete entities goes against the modern grain. They say:
In a founding document of phylogeography, Avise and colleagues noted the long-standing divide in biology between the intellectual lineages of Linnaeus for whom species are discrete entities and those of Darwin who emphasize incremental change within species leading to new species. They presciently proposed that mitochondrial analysis would provide a way to bridge the intellectual gap. DNA barcoding now provides the most comprehensive database allowing a kingdom-wide and quantitative realization of that vision.
They highlight this idea in several places, by quoting well known but older sources, for example, Dobzhansky in his book Genetics and the Origin of Species (1937):
In other words, the living world is not a single array of individuals in which any two variants are connected by unbroken series of intergrades, but an array of more or less distinctly separate arrays, intermediates between which are absent or at least rare. Each array is a cluster of individuals, usually possessing some common characteristics and gravitating to a definite modal point in their variation.… Therefore the biological classification is simultaneously a man-made system of pigeonholes devised for the pragmatic purpose of recording observations in a convenient manner and an acknowledgement of the fact of organic discontinuity.
And then Ernst Mayr (1942, quoted in Provine, W.B., Ernst Mayr: Genetics and speciation. Genetics, 2004. 167(3): pp. 1041-6):
The reduced variability of small populations is not always due to accidental gene loss, but sometimes to the fact that the entire population was started by a single pair or by a single fertilized female. These “founders” of the population carried with them only a very small proportion of the variability of the parent population. This “founder” principle sometimes explains even the uniformity of rather large populations, particularly if they are well isolated and near the borders of the range of the species.
These ideas have fallen out of favor, but perhaps deserve revisiting. The idea of speciation as the result of isolation of small populations, with fixation of limited diversity as a result, is something barcodes have revivified. The pattern seen in barcodes is evidence that must be accounted for.
Stoeckle and Thaler discuss several models for how speciation might occur (lineage sorting, selective gene sweeps, and bottlenecks), coming squarely down in favor of bottlenecks as the quickest way to get genetic uniformity to occur. Bottlenecks are the best explanation for the barcode patterns they observe, they think.
The clustering of barcode variation does not blur between species, as one would expect based on the assumption of neutral mutation and genetic drift. The authors propose that the reason for the clustering within species and separation between species is a bottleneck event (a population crash) for all species, including humans, that essentially reset the mitochondrial genomes within species to uniformity, and between species to distinctiveness. A bottleneck, or sudden population reduction, can be thought of as a sudden coalescence to one or a few lineages within a species — only the survivors’ genetic lineages persist. This also increases the variability between species, by pruning away any overlap. These populations then grow and accumulate new mutations over time. It is “new” mutations that Stoeckle and Thaler use to estimate the 100,000-200,000 year old date.
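The logic of dating from new mutations can be sketched with a simple molecular-clock calculation. This is a minimal illustration, not necessarily the authors' method. It assumes a population that started out uniform, neutral mutations accumulating at a constant rate, and every pair of lineages tracing back to about the same starting time. The function name and the input values are hypothetical.

```python
def years_since_uniformity(avg_pairwise_diff, subs_per_site_per_year):
    """Estimate the time since a uniform starting population.

    Two lineages that split t years ago each accumulate mutations
    independently, so their expected difference per site is about
    2 * rate * t. Solving for t gives diff / (2 * rate).
    """
    if subs_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return avg_pairwise_diff / (2 * subs_per_site_per_year)

# Illustrative inputs only; these are not values taken from the paper.
estimate = years_since_uniformity(avg_pairwise_diff=0.001,
                                  subs_per_site_per_year=1e-8)
print(f"estimated time since uniformity: {estimate:,.0f} years")
```

The estimate is only as good as the assumed rate. Halve the rate and the date doubles, which is one reason dates of this kind have shifted so much over the years.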
The single most serious problem they face is how to interpret the clustering of barcode sequences within species: as slow coalescence (lineage sorting) or fast coalescence (a bottleneck). Human mitochondrial lineages coalesced about 100,000-200,000 years ago, as shown by mitochondrial Eve. That is the same time frame that Stoeckle and Thaler propose for their population crashes. But as we have seen, coalescence, or lineage sorting as they call it, does not require a catastrophic event or a new beginning. The pattern could reflect slow lineage sorting, what I’ve called coalescence, or fast lineage sorting, what they call a bottleneck.
How to determine this? It turns out the coalescence rate is inversely proportional to the effective population size. Yet 90 percent of all species have the same amount of clustered sequence difference. This is a genuine conundrum that needs to be explained. Either a sudden bottleneck effectively sped up coalescence, or the effective population size for all those species was constant and of similar size. The authors analyzed very different kinds of species, though, making equal effective population size highly unlikely. They had to choose between two improbable situations: universal effective population size or a universal bottleneck — both unlikely equalities. They chose the bottleneck as the less improbable.
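The dependence on population size can be made explicit with the standard coalescent approximation. The sketch below assumes a constant population and counts only females, since mitochondria are inherited maternally. The function name, the sample size of 100, and the population sizes are hypothetical choices for illustration.

```python
def expected_generations_to_mrca(sample_size, effective_females):
    """Expected generations until `sample_size` maternal lineages coalesce.

    Standard coalescent approximation for a constant population: while
    k lineages remain, any of the k*(k-1)/2 pairs can merge, so the next
    merger takes about N / (k*(k-1)/2) generations on average.
    """
    n_f = effective_females
    return sum(n_f / (k * (k - 1) / 2) for k in range(2, sample_size + 1))

# Same sample size, three hypothetical effective population sizes.
for n_f in (1_000, 10_000, 100_000):
    generations = expected_generations_to_mrca(100, n_f)
    print(f"{n_f:>7} females: about {generations:,.0f} generations")
```

Under these assumptions the expected time scales directly with the effective population size: ten times the females, ten times the wait. Species with very different population sizes and generation times should therefore show very different coalescence dates, unless something else intervened.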
Is there independent evidence from fossils or geology to indicate a bottleneck? They briefly discuss whether fossils show sudden massive extinctions in that time period, without reaching a clear conclusion. They also touch on two further questions: have they examined other mitochondrial genes, and do we see the same pattern of clustered variation in nuclear genes? As far as I know there is not enough genomic sequence from any species other than human to tell.
Most of the paper is spent defending the reliability of barcoding and the distinctiveness of species, for reasons that should be clear now. It is from these things they derive support for the concept of speciation by bottleneck. From these two things comes their estimate of an origin for Homo sapiens in a bottleneck 100,000-200,000 years ago. That’s plenty of controversy to take on. They acknowledge this in their last sentence.
This vista of evolution is best seen from the passenger seat.
It seems to me they have elected to drive.
Photo credit: Christiaan Colen, via Flickr (cropped).