Do Statistics Prove Common Ancestry?

Photo credit: Anna Dudkova via Unsplash.

Over the past year, various Discovery Institute staff, fellows, and Summer Seminar alumni have participated in a journal club reading papers related to systematics and phylogenetics. One of the papers we read was by Baum et al. (2016), “Statistical Evidence for Common Ancestry: Application to Primates.” The idea behind the paper originated in 2014 when David A. Baum, Cécile Ané, and Bret Larget taught a graduate seminar at the University of Wisconsin, Madison, where they are all faculty members. Given the importance of common ancestry to evolutionary theory, their motivation was to curate supporting evidence for common ancestry and convert it into quantified statistics. They focused on primates because they believe this to be a focal point of the evolution debate.

A number of scientists, myself included, analyzed their methods, and had a group discussion regarding their conclusion: “Primate common ancestry is an overwhelmingly well supported hypothesis.”

General Thoughts

Our group selected this paper because its proposition, testing common ancestry against two alternative models, is both uncommon in the literature and fascinating. The authors investigated two versions of separate ancestry, which they defined as “species SA (the separate origin of each named species) and family SA (the separate origin of each family).” We didn’t agree with everything in the paper, but we appreciated that it was easy to follow, insightful, and respectful. Some group members also appreciated that both morphological and molecular characters were considered. While not everyone agreed with the conclusion, most thought that the paper’s statistical methodology for “quantifying” historical science was useful.

Also, before we dig in, I want to note that intelligent design is not necessarily incompatible with common ancestry. In fact, our journal club had ID proponents with a variety of perspectives on common ancestry — some supportive, some skeptical, and some agnostic. 

Separate Ancestry Model Does Not Represent “The Other Side’s” View

The biggest issue we saw with the paper was that the alternatives to common ancestry (species separate ancestry and family separate ancestry) are not accurate representations of “the other side of the debate,” i.e., intelligent design proponents who question common ancestry. Here’s why.

In essence, their comparisons asked whether the similarities between organisms that form the basis for phylogenetic comparisons are better explained by chance or by common ancestry. If common ancestry was a more likely explanation than chance, then they concluded that common ancestry was supported. But no one is suggesting that chance would produce the similarities. For the ID proponent who questions common ancestry, similarities would be produced by design. Even the authors noted that such a test is biased in favor of common ancestry:

Most of the statistical tests we discuss are epistemologically asymmetric. They involve identifying a pattern that is expected under CA [common ancestry] and then quantifying the probability that the observed data could have arisen by chance under SA [separate ancestry].

What emerged from our discussion is that this model of separate ancestry is not endorsed by anyone in the ID community and, to put it bluntly, is wildly unrealistic for all kinds of biological reasons. For example, this separate ancestry model would have come as a great surprise to Carl Linnaeus, Georges Cuvier, or Louis Agassiz, who organized taxonomic groups around shared similarities, without any necessary causal requirement for evolutionary descent.

Their conclusion, “We overwhelmingly rejected both species and family SA with infinitesimal P values,” is unsurprising, because it tests an SA model that no one actually holds. Accordingly, this conclusion presents no challenge to ID proponents who question common ancestry, and the discussion might be best summarized as “talking past each other.” One key takeaway for ID proponents is that they need to clarify what their model of separate ancestry actually is, so others can test it. I am going to try to do a bit of that, conceptually, right now.

What Can Similarity Tell Us About History? Hint: Not Much

The authors make the assumption that similarities between primates entail historical relationships. “[O]rganisms [that] share similarities, particularly similarities that would be very unlikely to arise independently, provide … evidence in favor of CA (Sober and Steel 2015).” This assumption is standard for historical phylogenetics and led the authors to design the species separate ancestry model with all similarities arising independently: “A key feature of the species SA model is that for each character, the state drawn by each species is independent of that drawn by other species.” 
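To make the logic of this kind of test concrete, here is a minimal sketch in Python. It is not the authors' code or their exact statistic. It assumes a toy matrix of binary characters for eight hypothetical species, uses pairwise character compatibility (two binary characters are compatible when fewer than all four state combinations occur) as a simple stand-in for "nested hierarchy," and mimics independence between species by shuffling each character's states across species. All names and data below are illustrative assumptions.

```python
import random
from itertools import combinations

def compatible(col_a, col_b):
    """Two binary characters are compatible if they do not show
    all four state combinations (0,0), (0,1), (1,0), (1,1)."""
    return len(set(zip(col_a, col_b))) < 4

def compatibility_score(columns):
    """Count the pairs of characters that are mutually compatible."""
    return sum(compatible(a, b) for a, b in combinations(columns, 2))

def permutation_p_value(columns, n_perm=2000, seed=0):
    """Estimate how often data with states assigned independently of
    any grouping scores at least as high as the observed data."""
    rng = random.Random(seed)
    observed = compatibility_score(columns)
    hits = 0
    for _ in range(n_perm):
        # Shuffle each character separately: state frequencies are kept,
        # but any shared pattern across characters is destroyed.
        shuffled = [rng.sample(col, len(col)) for col in columns]
        if compatibility_score(shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical data: each list is one character scored in 8 species.
# The characters were constructed to be perfectly nested.
TOY_COLUMNS = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
]

if __name__ == "__main__":
    score, p = permutation_p_value(TOY_COLUMNS)
    print(f"compatible pairs: {score}, permutation p-value: {p:.4f}")
```

Notice what the sketch does and does not show. A small p-value says only that the nested pattern is unlikely if states are assigned independently of one another. The null model being rejected is independence, which is exactly the asymmetry discussed above.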

This model and assumption are highly problematic from the ID perspective. Design may cause striking similarities, without any historical or evolutionary relationship.

Consider a scenario where there are three German Shepherds: a mother, her son, and a third that is a genetically engineered clone of the son, born in a laboratory womb. The genetically engineered German Shepherd in this imaginary scenario is genetically identical to the real son and phenotypically similar, but has no historical relationship with the mother. Instead he is a product of human genetic engineering.

I describe this scenario because if there are mechanisms beyond historical relationships that could account for genetic similarity, e.g., genetic engineering, then it is no longer possible to assume that similarity must imply historical relatedness. Although in this case the mother’s existence is necessary for the clone, it is not sufficient to explain the clone’s existence or its similarities. It would be incorrect to describe the third German Shepherd as the historical descendant of the mother. Likewise, Craig Venter’s Syn3.0 cell, though based on a Mycoplasma strain, would not exist without the careful design of human molecular biologists and geneticists.

Thus, the assumption that ancestry is the only mechanism or best explanation for character similarity is not held by the ID proponent. Instead, ID proponents hold that a designer may produce similarity, much like different Gucci purses exhibit similarities. A more technical explanation of one ID model for separate ancestry can be found here

Cherry Picking?

Before doing statistics, the authors selected which characters (genes and morphological features) they would use for their analysis. How did they pick? For the molecular dataset, that isn’t clearly stated in their paper. But from reading elsewhere (Perelman et al. 2011; Murphy et al. 2001) we discovered that primers from earlier studies were used as well as some new ones. Checking out the first citation, we found the following.

We examined sequence variation in 18 homologous gene segments (including nearly 10,000 base pairs) that were selected for maximal phylogenetic informativeness in resolving the hierarchy of early mammalian divergence.

Perelman et al. 2011; Murphy et al. 2001

Why are these sequences phylogenetically informative? Because they resolve the hierarchy of early mammalian divergence in the way the model anticipates. Several participants raised problems with this approach during the discussion. They argued that selecting or excluding data according to how well it resolves toward an expected model stacks the deck. One participant said that if you are going to use genetic similarity to make a case for a historical relationship, then all DNA must be considered (which raises other problems), with a penalty for sequences that don’t fit the expected framework. In other words, orphan genes should score against the hypothesis of historical relatedness.

Conclusion

To summarize, the methodology and desire to test alternatives to common ancestry raised in this paper are admirable. However, a number of common assumptions were made that proponents of separate ancestry with a design perspective would not endorse. Namely, the paper fails to recognize that design can generate similarities independent of common ancestry. This renders the majority of the paper a case of “talking past one another” while avoiding the real issues about whether a design-based model is actually a better fit for the data. What should happen next? More dialogue between these two camps would be helpful so that appropriate models can be created and then statistical tests applied.

Sources

  • Baum, David A., Cécile Ané, Bret Larget, Claudia Solís-Lemus, Lam Si Tung Ho, Peggy Boone, Chloe P. Drummond, Martin Bontrager, Steven J. Hunter, and William Saucier. 2016. “Statistical Evidence for Common Ancestry: Application to Primates.” Evolution. https://doi.org/10.1111/evo.12934.
  • Murphy, W. J., E. Eizirik, W. E. Johnson, Y. P. Zhang, O. A. Ryder, and S. J. O’Brien. 2001. “Molecular Phylogenetics and the Origins of Placental Mammals.” Nature 409 (6820): 614–18.
  • Perelman, Polina, Warren E. Johnson, Christian Roos, Hector N. Seuánez, Julie E. Horvath, Miguel A. M. Moreira, Bailey Kessing, et al. 2011. “A Molecular Phylogeny of Living Primates.” PLoS Genetics 7 (3): e1001342.