Human Genetic Variation — A Tale that Keeps on Telling
It’s always nice when your work gets noticed. Steve Schaffner, a population geneticist from the Broad Institute, published a review of the human origins problem at BioLogos, in which he generously mentioned our work. It’s a good article, cleanly written. You might want to take a look.
We proposed in our paper, “A Single-Couple Human Origin is Possible,” that in the very beginning, in the first couple, there had been a special category of genetic variation we called primordial diversity. The first couple had four distinct sets of chromosomes, each with a unique set of nucleotide differences. These “frontloaded” chromosomes were ready to provide the opportunity for something new. The frontloading was random; that is, each chromosome had a distinct set of mutations.
Once things started running, two things would happen. There are four possible alleles for each position on the four copies of each chromosome. At most positions the alleles are all the same: all A or all C, for example. The primordial alleles differ on only one chromosome out of the four: G instead of A on one chromosome, or T instead of C. One parent would be heterozygous at that position (AG) and the other parent homozygous (AA). For that position, then, 1 out of 4 alleles differs from the others, giving it an allele frequency of 1/4, or 0.25. Most of the other positions along the chromosome are homozygous, with an allele frequency of one. As time goes on, mutations happen and recombination occurs. The primordial alleles become more or less common over time.
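For readers who want to check the arithmetic, here is a minimal Python sketch of the allele frequency at a single position across the four founding chromosomes. The function name and the example site are illustrative assumptions, not anything taken from the paper.

```python
from collections import Counter

def allele_frequencies(alleles):
    """Return {allele: frequency} for the alleles observed at one position."""
    counts = Counter(alleles)
    total = len(alleles)
    return {allele: count / total for allele, count in counts.items()}

# Hypothetical founding site: one parent is AG, the other is AA.
founders = ["A", "G", "A", "A"]
freqs = allele_frequencies(founders)
print(freqs)  # {'A': 0.75, 'G': 0.25}

# A position where all four chromosomes agree has a single allele at frequency 1.
print(allele_frequencies(["C", "C", "C", "C"]))  # {'C': 1.0}
```

The variant that sits on only one of the four chromosomes comes out at 0.25, which is the starting point for every primordial allele in the model described here.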
Mutations accumulate but are always rare in the beginning, 1 out of 10, or 100, etc., depending on the population size. That can be seen in the purple graph above. A mutation starts off very rare, near zero; then if it survives and increases its frequency, the purple curve drifts to the right on the horizontal axis, allele frequency. The vertical axis represents the number of alleles with a given frequency. Mutations may be rare, but they occur often, so the number of new mutations is high.
The green curve shows what would happen if the first generation of mutations drifted to equilibrium without additional mutation, but still allowing for recombination. The alleles will either increase in frequency and drift to the right, or disappear completely in the first few generations. The curve loses the left-hand peak because there are no additional mutations.
The blue curve shows what would happen to the original primordial alleles as they underwent recombination. Since every “designed variant” is unique, but there are only four chromosomes, each variant has a frequency of 0.25. In the beginning there would have been a sharp peak at 0.25 for all the primordial alleles. Over time, the frequency of designed variants would spread out until the distribution became uniform. There is no new input of designed variation just as there is none for mutation, but recombination and random mating can shift the allele frequencies up or down.
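The spreading-out of the 0.25 peak can be illustrated with a toy neutral drift simulation. This is only a sketch under stated assumptions and is not the model used in the paper: each variant is treated as an independent site under random mating (so recombination is not modeled explicitly), the population doubles each generation up to an arbitrary cap, and the names `drift_one_variant`, `simulate`, and `histogram`, along with every parameter value, are hypothetical choices made for illustration.

```python
import random

def drift_one_variant(start_freq, sizes, rng):
    """Follow one neutral variant through successive generations of binomial sampling."""
    freq = start_freq
    for n_chrom in sizes:
        # Each chromosome in the next generation copies the variant with probability freq.
        copies = sum(rng.random() < freq for _ in range(n_chrom))
        freq = copies / n_chrom
    return freq

def simulate(num_variants=300, generations=30, cap=100, seed=1):
    """Return the final frequencies of variants that all start at 0.25."""
    rng = random.Random(seed)
    # Assumed growth: chromosome count doubles each generation until it reaches cap.
    sizes = [min(cap, 4 * 2 ** g) for g in range(1, generations + 1)]
    return [drift_one_variant(0.25, sizes, rng) for _ in range(num_variants)]

def histogram(freqs, bins=10):
    """Count how many variants fall into each equal-width frequency bin."""
    counts = [0] * bins
    for f in freqs:
        counts[min(int(f * bins), bins - 1)] += 1
    return counts

final_freqs = simulate()
print(histogram(final_freqs))
```

Printing the histogram shows the variants no longer sitting in a single bin at 0.25: some have been lost, some have risen, and the rest are scattered in between. How closely the result approaches a uniform distribution depends on the growth schedule and the number of generations assumed.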
We used these graphs to make a proposal for how much primordial diversity there might have been. I will show just one figure to make the point — the logic is the same for all geographic regions.
There is about the same amount of material in the purple band that runs across the base of the graph for each geographic zone, so we took that amount of variation to be the amount of primordial diversity.
Why Bring All This Up?
As I mentioned, last summer Steve Schaffner published his review of the human origins problem at BioLogos. Schaffner has a way, he thinks, to disprove our models. He thinks the genetic signal in our genome is due to common descent, not intentional design. He has noticed that the figure above from our paper indicates where we think the band of primordial diversity lies. By his calculations, roughly 50 percent of all alleles at the .25 frequency will be from that original primordial diversity and should be ancient. All the others should be young. The mutations at the .25 mark and above he calls high frequency. The mutations at the left side of the spectrum are young, recent, and low frequency.
So, he asks this question: If we came from a single couple with primordial diversity at the beginning, why does the mutational profile look the same, whether it comes from low frequency or high frequency mutations?
See here for a graphic from Schaffner’s article. The way he generates these graphs is to look at positions on the chromosome that are biallelic, and then to identify the kind of mutation that must have taken place, assuming the chimp allele at that position is ancestral. He then sums the number of each kind of mutation he found in each region of the allele frequency spectrum.
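The counting procedure described above can be sketched in a few lines of Python. This is an illustration of the general idea, not Schaffner's actual pipeline: the input format, the function names, the merging of strand-complementary changes (G>A counted together with C>T), and the made-up example sites are all assumptions. The 0.25 cutoff follows the high-frequency threshold mentioned earlier.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def mutation_type(ancestral, derived):
    """Label a change such as 'C>T', merging strand-complementary pairs."""
    if ancestral in "AG":
        # Report purine-ancestral changes from the opposite strand (G>A becomes C>T).
        ancestral, derived = COMPLEMENT[ancestral], COMPLEMENT[derived]
    return f"{ancestral}>{derived}"

def spectrum_by_frequency(sites, threshold=0.25):
    """Tally mutation types separately for low- and high-frequency derived alleles."""
    tallies = {"low": Counter(), "high": Counter()}
    for ancestral, derived, derived_freq in sites:
        bin_name = "high" if derived_freq >= threshold else "low"
        tallies[bin_name][mutation_type(ancestral, derived)] += 1
    return tallies

# Made-up biallelic sites: (chimp allele taken as ancestral, derived allele, derived frequency)
sites = [
    ("C", "T", 0.02),
    ("G", "A", 0.40),
    ("A", "G", 0.10),
    ("C", "G", 0.30),
    ("T", "C", 0.01),
]
tallies = spectrum_by_frequency(sites)
print(dict(tallies["low"]))   # {'C>T': 1, 'T>C': 2}
print(dict(tallies["high"]))  # {'C>T': 1, 'C>G': 1}
```

Comparing the relative proportions in the two tallies is the kind of low-versus-high-frequency comparison at issue here. Note that the whole exercise depends on the assumption that the chimp allele is ancestral.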
The pattern he sees is always the same across humans, apes, and all mammals he has reported. There are biochemical reasons for the particular pattern of mutations we observe: mutational biases, error correction biases, chemical transformations of particular nucleotides, and relative stability of base pairing. The main reason he shows this pattern is to indicate that mutations, not primordial variation, drive our genetic history. This analysis was done using the 1000 Genomes Project data, which of course represents a mixture of old and new mutations, and a large population size.
What he did not report was how fast primordial alleles can change their allele frequencies in a small rapidly expanding population — very quickly — and how quickly they will be affected by the kinds of mutational conversions I mentioned above. We did not model mutational bias in our paper, as it was not central to our main point — that it is mathematically possible that we could have come from a first pair. The data he used was for modern humans, which he assumes came by common descent, and did not start from two.
We actually don’t know whether there was any initial heterozygosity in a (proposed) first pair, but if there was, we don’t know how much or whether it had the same distinctive spectrum of mutational bias at the beginning. It could have. We assumed randomness for simplicity’s sake.
It would be interesting to know whether 500,000 years is enough time to produce an imprint of CpG bias like the one the modern genome has.
We also pointed out that our model cannot distinguish between a first pair and a very small bottleneck.
Of course, we cannot go back 500,000 years ago and sample the genomes then. I don’t think we can sequence DNA that old yet. (45,000 years old, yes.) But I do not think a signal of the primordial alleles could be detected from such samples. Too much time has passed, and perhaps too many bottlenecks, and DNA degradation.
C to T transitions, especially at CpG sites, vastly outnumber any other kind of mutation. The fact that the pattern persists he takes as evidence of common descent. I take it as evidence of the strength of the mutational bias. We know that there are functional reasons for CpG islands. It is always a mistake to assume what is ancestral when that is what is being argued!
The modern genome of all mammals has this pattern. I think it more likely there is a strong drive in the mutational bias’s direction, or favored protection, but he could be right — it could be a signal of common descent. And a question for Steve to ponder — how did so many different mammalian body plans manage to arise using mainly CpG and not perturbing this pattern? If the pockmarks on the moon showed this kind of specific array surrounding each crater, we would think someone was using the moon for target practice.