Winston Ewert: The Dependency Graph of Life

That is not surprising to me at all. The key thing is to look at the proportion of the difference to the total. Let’s just take one line as an example:

| Dataset | Dependency Graph | Tree | Difference |
|---|---|---|---|
| UniRef-50 | 6,193,801 | 6,308,988 | 111,823 |

You are saying that 111,823 is large, but it is only about 1.8% of the tree’s total (111,823 / 6,308,988 ≈ 0.0177). That means the dependency graph explains only about 1.8% more of the data’s patterns than a tree. Not very much. And, as @Winston_Ewert correctly notes, this is not even a real model of common descent.
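To make the arithmetic easy to check, here is a minimal sketch of the ratio. The function name `relative_difference` is mine, for illustration only; the numbers are the ones quoted for UniRef-50. Subtracting the two quoted columns directly gives 115,187 rather than 111,823, so the sketch computes both ratios; either way the result is just under 2%.

```python
def relative_difference(difference: int, total: int) -> float:
    """Return the difference as a fraction of the total."""
    return difference / total


# Values quoted above for the UniRef-50 row.
dependency_graph = 6_193_801
tree = 6_308_988
quoted_difference = 111_823

# Ratio using the difference as quoted in the table.
quoted_ratio = relative_difference(quoted_difference, tree)

# Ratio using the difference recomputed from the two quoted columns.
recomputed_ratio = relative_difference(tree - dependency_graph, tree)

print(f"quoted difference:     {quoted_ratio:.2%}")
print(f"recomputed difference: {recomputed_ratio:.2%}")
```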

So why are the numbers so large? Merely because he has a lot of data. Adding more data makes the absolute values of the log probability arbitrarily large, but the relative values should remain fairly stable.
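A toy illustration of that scaling point, using coin flips rather than anything from the paper. Everything here is an assumption made for the example: the 0.6 bias, the two candidate models, and the `compare` helper are hypothetical and have nothing to do with the actual dependency graph or tree models. The sketch only shows that the absolute gap in log probability grows with the amount of data while the relative gap stays roughly the same.

```python
import math
import random


def neg_log_likelihood(flips, p_heads):
    """Negative log2-probability of the flips under a fixed-bias coin model."""
    return -sum(math.log2(p_heads if f else 1.0 - p_heads) for f in flips)


def compare(n, seed=0):
    """Score n simulated flips under two models; return (absolute gap, relative gap)."""
    rng = random.Random(seed)
    # Hypothetical data source: a coin that lands heads 60% of the time.
    flips = [rng.random() < 0.6 for _ in range(n)]
    better = neg_log_likelihood(flips, 0.6)  # model that matches the source
    worse = neg_log_likelihood(flips, 0.5)   # model that assumes a fair coin
    gap = worse - better
    return gap, gap / worse


for n in (10_000, 100_000, 1_000_000):
    gap, relative = compare(n)
    print(f"n={n:>9,}  absolute gap={gap:>10.1f}  relative gap={relative:.2%}")
```

The absolute gap should grow roughly in proportion to `n`, while the relative gap hovers around the same few percent, which is why the ratio is the more informative number.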
