What Line of Evidence is Strongest for Evolution?

You. Discordant mutations are a way of assessing how tree-like a set of sequences is.

Not sure what method you’re talking about there. In the parsimony criterion, inferred homoplasy increases the length of the tree; since the shortest tree is best, you end up with the tree that requires the least homoplasy from the data. In that sense, inferred homoplasy would be “discordant” mutations. In the likelihood criterion, some homoplasy actually makes the data, given the tree, more likely, as zero homoplasy would be extremely unlikely on a tree of any reasonable length. So it’s unclear what “discordant” would mean in that context. Too little homoplasy or too much?
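To make the parsimony point concrete, here is a minimal sketch of how tree length can be counted with Fitch's small-parsimony algorithm. The toy alignment, the taxon names, and the function names (`fitch_site`, `parsimony_length`) are made up for illustration; real programs also search over many candidate trees and handle much more than this.

```python
def fitch_site(tree, states):
    """Fitch pass for one site on a rooted binary tree.

    tree   : a taxon name (leaf) or a pair (left_subtree, right_subtree)
    states : dict mapping taxon name -> character state at this site
    Returns (possible ancestral states, minimum number of changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch_site(left, states)
    right_set, right_changes = fitch_site(right, states)
    shared = left_set & right_set
    if shared:
        # The children can agree on a state: no extra change needed here.
        return shared, left_changes + right_changes
    # The children disagree: one change is required on this part of the tree.
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_length(tree, alignment):
    """Sum the minimum number of changes over all sites."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        site = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_site(tree, site)[1]
    return total


# Hypothetical four-taxon alignment (illustrative only).
alignment = {"A": "AAGT", "B": "AAGC", "C": "ACTC", "D": "ACTT"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

print(parsimony_length(tree_1, alignment))
print(parsimony_length(tree_2, alignment))
```

In this toy data the last site needs two changes on either tree, although its pattern could be explained by a single change on some other tree. That extra step is the inferred homoplasy, and the parsimony criterion simply prefers whichever tree needs the fewest such steps overall.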

Lineage sorting, horizontal transfer, and concerted homoplasy are whole separate questions. Most phylogenetic algorithms don’t take them into account. There are methods that do, though none that handle all of them together.

2 Likes

And I do also. @mung immerse yourself in the technology. You will learn something grand.

3 Likes

This is excellent. Thanks John.

The various trees that the program creates, I have referred to them as “phylogenetic trees.” Was I mistaken in using that terminology to refer to the generated trees?

Is it only after a tree has been selected which “turns out to be a significantly better fit to the data than other trees” that we have a phylogenetic tree?

How many trees do we have and how many phylogenetic trees do we have and at what point does a tree become not just a tree but a phylogenetic tree?

This may help clear up some matters.

1 Like

I’d just like to express appreciation for everyone who has stuck with it. We’ve had some bumps but we’re actually having what I think is a constructive conversation that can benefit people who may not be familiar with the science and are having it explained by scientists who are.

3 Likes

No, that’s what we call them. But of course they’re actually only estimates of the real phylogenetic tree, the one that actually shows the relationships among species, corresponds to the real history, or however you want to say that. Don’t get hung up on terminology. If a tree is significantly better than others, that gives us more confidence that the estimate is correct, and incidentally that common descent is a real thing.

2 Likes

I’m trying not to, but others seem to be concerned that I have some mistaken notions. To your credit you are helping clarify matters.

So when I say that a phylogenetic tree is not prima facie evidence for common descent, you and I both understand what I am talking about and you agree with me on that point?

1 Like

I was considering starting a topic at TSZ to talk to Joe. I’m interested in the logic that goes into writing such a program. In a sense that’s what I’ve been trying to get at in this thread. It’s not about whether CD is true or false, it’s about what goes into writing a program. Yeah, yeah, I know it doesn’t come off that way. :slight_smile:

1 Like

Books I am using as reference material:

Molecular Evolution: A Phylogenetic Approach

Molecular Evolution and Phylogenetics

Reconstructing the Past: Parsimony, Evolution, and Inference

2 Likes

Yes, that’s right.

4 Likes

I took a short online course in computational molecular evolution a few years ago, and the instructional videos are available online here, where some common tree building algorithms are explained pretty well in my opinion:
Maximum Parsimony
Maximum Likelihood
There are more videos hosted by that YouTube channel on different algorithms and other aspects of molecular evolution, such as genetic drift, detection of selection, etc.

As a layman it’s helped me understand a lot of things much better and correct some misconceptions I’ve had.

3 Likes

When you say “writing such a program”, is it correct to say you do not mean the design, coding, or UI, but rather the requirements, and in particular the biological models and the associated statistics? It’s not the program itself that is helpful, but rather the underlying science and (shudder) math. Is that characterization fair?

I’d be leery of TSZ. The resulting comments would likely just degenerate into something like the Nested Hierarchies thread from July, which had a very low signal-to-noise ratio in its 1000+ comments.

Maybe Joe would post something here, where the moderators try to manage that by shunting such comments to side threads.

2 Likes

Yes. Being able to look at the source code, or write the code, would just be a tool to understanding.

Are the phylogenetic trees created first, and then a p-value is associated with the phylogenetic trees? Is every single phylogenetic tree that is generated considered evidence for common descent, or just some of them, that meet or exceed some other criteria?

I think John has answered this and that it’s the latter that is the case. So I don’t believe I have been at all “working from very basic misconceptions.”

Great idea. Would you like to invite him? We can keep things on track here.

Those questions are not answerable in a direct way because there are multiple methods. They each work in different ways, and might yield different answers to those questions.

I agree. Not every method uses p-values, for example. But surely they all use some means of ranking. We can move to a higher level of abstraction. It would be interesting to see whether a common abstraction can be found.

I do not in fact know of a method that assigns a p value to a tree. Nor do I know of a phylogenetic method that uses a null model to assess any single tree.

Correct. The direct outputs don’t contain p values, they contain scores. But the scores can be translated to p values:

2 Likes

I don’t believe either of those papers is about phylogenetic analysis, which is the supposed subject.

As I understand it, it is most common to establish confidence of specific nodes, not the whole tree.
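One common way to put a confidence value on a specific node is the nonparametric bootstrap: resample alignment columns with replacement, rebuild the tree each time, and report how often the node of interest shows up. Below is a rough sketch for four taxa, where there are only three possible unrooted topologies and each one corresponds to a single split. The alignment, the taxon names, and the function names are invented for illustration, and parsimony is used as the tree-building criterion only because it is short to write; this is not how any particular package does it.

```python
import random

# The three possible groupings of four taxa, written as rooted pairs.
TOPOLOGIES = {
    "AB|CD": (("A", "B"), ("C", "D")),
    "AC|BD": (("A", "C"), ("B", "D")),
    "AD|BC": (("A", "D"), ("B", "C")),
}


def fitch_changes(tree, states):
    """Return (state set, minimum changes) for one site (Fitch's algorithm)."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_n = fitch_changes(tree[0], states)
    right_set, right_n = fitch_changes(tree[1], states)
    shared = left_set & right_set
    if shared:
        return shared, left_n + right_n
    return left_set | right_set, left_n + right_n + 1


def best_topology(columns, taxa):
    """Name of the uniquely shortest topology, or None when there is a tie."""
    scores = {}
    for name, tree in TOPOLOGIES.items():
        scores[name] = sum(
            fitch_changes(tree, dict(zip(taxa, col)))[1] for col in columns
        )
    best = min(scores.values())
    winners = [name for name, score in scores.items() if score == best]
    return winners[0] if len(winners) == 1 else None


def bootstrap_support(alignment, split, replicates=200, seed=0):
    """Fraction of bootstrap replicates in which `split` is the unique best."""
    rng = random.Random(seed)
    taxa = sorted(alignment)
    columns = list(zip(*(alignment[t] for t in taxa)))
    hits = 0
    for _ in range(replicates):
        # Resample sites with replacement, keeping the alignment length.
        sample = [rng.choice(columns) for _ in columns]
        if best_topology(sample, taxa) == split:
            hits += 1
    return hits / replicates


# Hypothetical alignment: most sites group A with B, one site conflicts.
alignment = {"A": "AAGGTCA", "B": "AAGGTCC", "C": "CCTTTAC", "D": "CCTTTAA"}
for name in TOPOLOGIES:
    print(name, bootstrap_support(alignment, name))
```

The number that comes out is a support value for one grouping, not a p value for the whole tree, which fits the point that confidence is usually reported node by node.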