Controversy Arising: Timetrees Unconstrained

Two theorists have caused a stir in evolutionary circles, claiming to have proven that Darwinian phylogeny efforts (tree-building) cannot be constrained to one “best” answer. In fact, any proposed tree is no better than an infinity of other trees. They can’t see the tree for the forest.

The theorists are Stilianos Louca, biologist at the University of Oregon, and Matthew W. Pennell, evolutionary biologist at the University of British Columbia. The paper that started the controversy, published in Nature, is titled “Extant timetrees are consistent with a myriad of diversification histories.” A timetree is a phylogenetic tree supposedly calibrated to time, dating the appearance and disappearance of organisms. An extant timetree is a timetree comprising only living lineages. The news from the University of Oregon, “Researchers find flaws in how scientists build trees of life,” sums up the paper’s thesis that “long-used approaches for reconstructing evolutionary paths are deeply flawed.”

While paleontology provides insights on how and why patterns of biodiversity have changed over geological time, fossils of many organisms are too scant to say anything, said Stilianos Louca, an assistant professor in the Department of Biology and member of the UO’s Institute of Ecology and Evolution. An alternative approach that relies on signals of identifiable changes in an organism’s genetic makeup also can be misleading.
“Our finding casts serious doubts over literally thousands of studies that use phylogenetic trees of extant data to reconstruct the diversification history of taxa, especially for those taxa where fossils are rare, or that found correlations between environmental factors such as changing global temperatures and species extinction rates,” Louca said, using a term for populations of one or more organisms that form a single unit. [Emphasis added.]

A House of Cards

Needless to say, it can be upsetting to find that a method that has been trusted in thousands of studies is a house of cards. Pennell himself knows the feeling.

“I have been working with these traditional types of models for a decade now,” Pennell said. “I am one of the lead developers of a popular software package for estimating diversification rates from phylogenetic trees. And, as such, I thought I had a really good sense of how these models worked. I was wrong.”

Evolutionists have approached timetrees from two directions. First is the paleontological approach. A researcher divides evolutionary history into time periods, puts fossils into those time periods, and tries to estimate the number of species that have appeared and disappeared in each period. The second or “phylogenetic” approach tries to estimate relationships between species from their genomes, “finding a speciation-extinction scenario that most likely generates a phylogenetic tree.” There should be a best match between the two approaches. Unfortunately, too much information is lost to be able to constrain the possible trees to a single best one. The paper says,

Here we clarify the precise information that can be extracted from extant timetrees under the generalized birth–death model, which underlies most existing methods of estimation. We prove that, for any diversification scenario, there exists an infinite number of alternative diversification scenarios that are equally likely to have generated any given extant timetree. These ‘congruent’ scenarios cannot possibly be distinguished using extant timetrees alone, even in the presence of infinite data.
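A small numerical sketch can make “congruent” concrete. It is not the authors’ code. It assumes the standard deterministic birth-death equations with every living species sampled, and it compares two scenarios invented for illustration: scenario A with constant speciation and extinction rates, and scenario B with zero extinction and a speciation rate tuned to match. The quantity compared is what Louca and Pennell call the pulled speciation rate, which in their framework is what the likelihood of an extant timetree depends on. All names and numbers below are hypothetical.

```python
def rk4_step(f, y, h):
    """One classical Runge-Kutta step for an autonomous ODE dy/dtau = f(y)."""
    k1 = f(y)
    k2 = f(y + 0.5 * h * k1)
    k3 = f(y + 0.5 * h * k2)
    k4 = f(y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

# Scenario A (hypothetical): constant speciation and extinction rates.
LAMBDA_A, MU_A = 1.0, 0.5
# Pulled diversification rate of A; with constant lambda it is lambda - mu.
R_P = LAMBDA_A - MU_A
# Scenario B (hypothetical): no extinction at all.
MU_B = 0.0

def e_rhs(e):
    # Scenario A: E is the probability that a lineage alive at age tau
    # leaves no living descendants (standard birth-death equation, assumed).
    return MU_A - (LAMBDA_A + MU_A) * e + LAMBDA_A * e * e

def lambda_b_rhs(lam):
    # Scenario B: speciation rate that keeps the same pulled
    # diversification rate R_P while extinction is fixed at MU_B.
    return lam * (R_P + MU_B - lam)

def pulled_speciation_rates(max_age=5.0, steps=5000):
    """Return (scenario A, scenario B) pulled speciation rates from age 0 to max_age."""
    h = max_age / steps
    e, lam_b = 0.0, LAMBDA_A  # complete sampling: E(0) = 0; same present-day rate
    pairs = []
    for _ in range(steps + 1):
        # A: lambda * (1 - E). B: E stays 0 without extinction, so it is just lambda_B.
        pairs.append((LAMBDA_A * (1.0 - e), lam_b))
        e = rk4_step(e_rhs, e, h)
        lam_b = rk4_step(lambda_b_rhs, lam_b, h)
    return pairs

pairs = pulled_speciation_rates()
max_gap = max(abs(a - b) for a, b in pairs)
print(f"largest difference between the two scenarios: {max_gap:.2e}")
```

Under these assumptions the two curves agree to within integration error, even though one history has substantial extinction and the other has none. Choosing any other extinction curve for scenario B would give yet another match, which is the sense in which the class of congruent scenarios is infinite.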

The thud you just heard is the sound of evolutionary modelers falling off their chairs onto the floor. Picking themselves up, they realize they have a criticism that cannot go unchallenged. They’ve spent too many years building up an “impressive suite of computational methods” to throw it all away at the thought it might be all for nothing.

Save the Sacred Timetrees

Writing at bioRxiv, Hélène Morlon, Florian Hartig, and Stéphane Robin responded, trying to save the sacred timetrees from this forest fire. In their paper, “Prior hypotheses or regularization allow inference of diversification histories from extant timetrees,” they called Louca and Pennell’s finding “certainly both interesting and unfortunate.” But is it devastating?

Phylogenies of extant species are widely used to study past diversification dynamics. The most common approach is to formulate a set of candidate models representing evolutionary hypotheses for how and why speciation and extinction rates in a clade changed over time, and compare those models through their probability to have generated the corresponding empirical tree. Recently, Louca & Pennell reported the existence of an infinite number of ‘congruent’ models with potentially markedly different diversification dynamics, but equal likelihood, for any empirical tree…. Here we explore the implications of these results, and conclude that they neither undermine the hypothesis-driven model selection procedure widely used in the field nor show that speciation and extinction dynamics cannot be investigated from extant timetrees using a data-driven procedure.

These authors accuse Louca and Pennell of using the wrong philosophical method of science. Louca and Pennell tried to figure out speciation and extinction rates from the data; they should have started with a hypothesis, like most evolutionists do. That would have allowed the researcher to tweak parameters for a better fit, such as proposing an “early burst” of speciation to fit the theory of adaptive radiation. “Louca & Pennell’s congruent models M*, on the other hand, do not correspond a priori to any evolutionary hypotheses, and would never be considered in a well-conducted hypothesis model selection procedure in the first place.” How dare they think that science could find an answer to the history of life directly from the evidence! One must start with prior knowledge, i.e., the belief that actual histories of diversification fit the accepted timeline. Then, the proper model appears. Usually it appears, at least. Modelers keep their job security by never claiming to have found the ultimate right answer.

The existence of a large number of congruent models therefore poses no direct challenge to the traditional hypothesis-driven research approach. The only possible concern is the question of model selection consistency: if the true model is not in the set of considered models, do we select the correct hypothesis? This question has not been answered one way or the other and would require thorough investigation in future research.

“Full of Controversy”

Below the paper, one commenter says that “this thread is already full of controversy,” a sign that evolutionists are still buzzing about the paper “that caused such a stir earlier in the year.” Undeterred, Louca and Pennell think such setting of “arbitrary constraints” is “rarely justified biologically.” Their earlier paper pointed out serious implications for macroevolution:

Because any given true diversification history (even a relatively simple one) is unlikely to exactly match the particular functional form considered, fitting the latter may not even approximately yield the true diversification history. The existence of congruent scenarios can thus seriously alter macroevolutionary conclusions.

Now they are back with another paper pointing out more flaws in traditional Darwinian tree-making. Last week at bioRxiv, they explained, “Why extinction estimates from extant phylogenies are so often zero.” They accuse modelers of carelessly setting biologically meaningless parameters and then sweeping uncomfortable truths under the rug.

Time-calibrated phylogenies comprising only extant lineages are widely used to estimate historical speciation and extinction rates. Such extinction rate estimates have long been controversial as many phylogenetic studies report zero extinction in many taxa, a finding in conflict with the fossil record. To date, the causes of this widely observed discrepancy remain unresolved. Here we provide a novel and simple explanation for these “zero-inflated” extinction rate estimates, based on the recent discovery that there exist many alternative “congruent” diversification scenarios that cannot possibly be distinguished on the sole basis of extant timetrees. Consequently, estimation methods tend to converge to some scenario congruent to (i.e., statistically indistinguishable from) the true diversification scenario, but not necessarily to the true diversification scenario itself.

Louca and Pennell point out that the fossil record is full of extinctions. How can one presume “zero extinctions” when the majority of organisms that ever lived are extinct? It gets worse:

This congruent scenario may in principle exhibit negative extinction rates, a biologically meaningless but mathematically feasible situation, in which case estimators will tend to hit and stick to the boundary estimate of zero extinction. To test this explanation, we estimated extinction rates using maximum likelihood for a set of simulated trees and for 121 empirical trees, while either allowing or preventing negative extinction rates. We find that the existence of congruence classes and imposed bounds on extinction rates can explain the zero-inflation of previous extinction rate estimates, even for large trees (1000 tips) and in the absence of any detectable model violations. Not only do our results likely resolve a long-standing mystery in phylogenetics, they demonstrate that model congruencies can have severe consequences in practice.
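The “hit and stick to the boundary” behavior is easy to reproduce in miniature. The sketch below is not the authors’ analysis and does not use a birth-death likelihood. It assumes a toy log-likelihood that is a simple concave curve in the extinction rate, with the peak positions for five imaginary trees invented for illustration. Each curve is maximized twice: once with negative rates allowed, and once with the rate forced to be zero or greater.

```python
def make_log_likelihood(peak):
    """Toy concave surrogate with its maximum at `peak`; NOT a birth-death likelihood."""
    def log_likelihood(mu):
        return -(mu - peak) ** 2
    return log_likelihood

def maximize(f, lo, hi, iters=200):
    """Ternary search for the maximizer of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1   # the maximum lies to the right of m1
        else:
            hi = m2   # the maximum lies to the left of m2
    return (lo + hi) / 2

# Hypothetical unconstrained optima for five imaginary trees (made-up numbers).
peaks = [-0.30, -0.10, 0.05, -0.02, 0.20]

free, bounded = [], []
for peak in peaks:
    f = make_log_likelihood(peak)
    free.append(maximize(f, -1.0, 1.0))     # negative extinction rates allowed
    bounded.append(maximize(f, 0.0, 1.0))   # extinction rate forced to be >= 0

# Count the bounded estimates that ended up pinned at the zero boundary.
zero_count = sum(1 for est in bounded if est < 1e-9)
print(f"{zero_count} of {len(peaks)} bounded estimates are pinned at zero")
```

In this toy setup, every tree whose best-fitting rate is negative reports exactly zero once the bound is imposed, while the others are unaffected. That pile-up at the boundary is the pattern the authors call zero-inflation.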

Another thud is heard, followed by angry shouts. Not all are angry, though; many commenters retweeted this response: “I love the clarity of this paper.”

The Take-Home

Models, which are common in science, are useful but not always realistic. Assumptions and prior beliefs can cloud a hypothesis-driven research project, providing unwarranted confidence in a model that may not have anything to do with true history. That is also a warning to any scientist, including proponents of intelligent design, to beware of letting assumptions cloud one’s conclusions. Look how modelers can say, “Well, we know a negative extinction rate is meaningless, so we’ll just round it up to zero.”

The controversy arising about timetrees is something to think about when hearing confident-sounding presentations about the history and evolution of life. When scientists speak glibly about adaptive radiations, early bursts of diversification, global extinctions and all the rest, what do they really know? They weren’t there. They take bits of bone, molecules from eye of newt and bat wing, and conjure up fantastical scenarios of an evolving world of universal common ancestry driven onward and upward by natural selection alone. But if the model is just one of an infinite number of congruent timetrees held together by unrealistic adjustments, the world picture may never have existed except in the crystal ball of the imagination.