Nested Hierarchy and Bootstrap Values

FWIW, I was interested in related questions, so I generated three sequence data sets using a version of the simulation described here: X-Men Constructive Neutral Evolution

(More details are available in this blog series, which starts here.)

The first data set involved multiple populations, each with a random starting genome.

The second data set involved multiple populations all starting from the same initial genome.

The third data set involved a single starting population sampled at various times.
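For anyone who wants to play with the distinction between these three designs, here is a toy Python sketch. It is not the simulation linked above: there is no selection and no population, only random point mutations on a string, and the four-letter alphabet, the function names, and the parameters are all assumptions made up for illustration. It only shows how the three sampling schemes differ in shared history.

```python
import random

ALPHABET = "ACGT"  # assumed alphabet; the real simulation's genome encoding may differ


def random_genome(rng, length):
    """Draw a genome uniformly at random."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def mutate(rng, genome, n_mutations):
    """Apply n_mutations random point substitutions (a draw may repeat the same letter)."""
    g = list(genome)
    for _ in range(n_mutations):
        g[rng.randrange(len(g))] = rng.choice(ALPHABET)
    return "".join(g)


def independent_starts(rng, n_pops, length, n_mut):
    """Design 1: every lineage begins from its own random genome."""
    return [mutate(rng, random_genome(rng, length), n_mut) for _ in range(n_pops)]


def common_start(rng, n_pops, length, n_mut):
    """Design 2: every lineage diverges independently from one shared genome."""
    ancestor = random_genome(rng, length)
    return [mutate(rng, ancestor, n_mut) for _ in range(n_pops)]


def time_series(rng, n_samples, length, n_mut):
    """Design 3: one lineage, sampled after each successive round of mutation."""
    genome = random_genome(rng, length)
    samples = []
    for _ in range(n_samples):
        genome = mutate(rng, genome, n_mut)
        samples.append(genome)
    return samples


rng = random.Random(1)
design_1 = independent_starts(rng, n_pops=5, length=50, n_mut=10)
design_2 = common_start(rng, n_pops=5, length=50, n_mut=10)
design_3 = time_series(rng, n_samples=5, length=50, n_mut=10)
```

Only the third design accumulates changes in a nested way, with later samples inheriting the mutations of earlier ones, which is the kind of structure a tree-building method can pick up.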

In the third chart, the color of each label represents the generation number. This shows that the tree not only has high bootstrap values but also reproduces the actual relationships between the sequences, since similar colors sit close together on the tree.

Crucially, all sequences in all trees represent full solutions to the same problem. So they are functionally equivalent and satisfy the same design requirements. Yet they do not all possess the same degree of phylogenetic signal.

All analysis was done in R using the NJ, optim.pml and bootstrap.pml functions from the phangorn package.
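To make concrete what a bootstrap value measures, here is a minimal Python sketch of the idea. It is not what phangorn's bootstrap.pml does internally: instead of refitting a likelihood model, it uses plain Hamming distances and the four-point condition on just four sequences. The function names and the two tiny alignments are made up for illustration. The shared principle is resampling alignment columns with replacement and counting how often a grouping reappears.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def quartet_split(seqs):
    """Return the split favoured by the four-point condition, or None on a tie."""
    a, b, c, d = seqs
    sums = {
        "AB|CD": hamming(a, b) + hamming(c, d),
        "AC|BD": hamming(a, c) + hamming(b, d),
        "AD|BC": hamming(a, d) + hamming(b, c),
    }
    best = min(sums.values())
    winners = [name for name, total in sums.items() if total == best]
    return winners[0] if len(winners) == 1 else None


def bootstrap_support(seqs, split, replicates=200, seed=0):
    """Fraction of column-resampled alignments that recover the given split."""
    rng = random.Random(seed)
    n_cols = len(seqs[0])
    hits = 0
    for _ in range(replicates):
        # Resample alignment columns with replacement, keeping rows aligned.
        cols = [rng.randrange(n_cols) for _ in range(n_cols)]
        resampled = ["".join(s[i] for i in cols) for s in seqs]
        if quartet_split(resampled) == split:
            hits += 1
    return hits / replicates


# Nested signal: A and B share every state, as do C and D.
nested = ["AAAAAAAAAA", "AAAAAAAAAA", "CCCCCCCCCC", "CCCCCCCCCC"]
# Star-like divergence: each sequence carries one private change, nothing shared.
star = ["CAAAAAAA", "ACAAAAAA", "AACAAAAA", "AAACAAAA"]

nested_support = bootstrap_support(nested, "AB|CD")  # 1.0
star_support = bootstrap_support(star, "AB|CD")      # 0.0
```

The star-like alignment gets no support for any grouping because every resample is a three-way tie, which is the toy analogue of sequences that are all valid but carry no nested signal.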

This was a learning exercise for me and, I hope, an illustration of the differences between what is expected from convergent evolution, unnested divergence from a common starting point, and common descent. If there are ways to improve it, or if some version of this would be helpful to someone, I'm happy to discuss.
