Winston Ewert: The Dependency Graph of Life

Design

(S. Joshua Swamidass) #1

An interesting and significant article was published at the Discovery Institute, by @Winston_Ewert : The Dependency Graph of Life, DOI: 10.5048/BIO-C.2018.3.

[NOTE: This thread has special rules; read and follow them or there will be consequences]

This article is being drummed up by a past interlocutor of @vjtorley and myself (@Cornelius_Hunter) here on this forum thread:

It has always been known that the common descent model fails, both from a long list of particular examples, as well as in systematic studies. But what might a better model be? Winston Ewert’s new paper (http://bio-complexity.org/ojs/index.php/main/article/view/BIO-C.2018.3) presents a dependency graph model, inspired by computer science. The paper uses model selection methods, and shows that the dependency graph model is a far superior model compared to common descent.
https://discourse.biologos.org/t/new-paper-demonstrates-superiority-of-design-model/39018

I thought it would be worth discussing this in more detail here. I know that @T.j_Runyon and @cwhenderson are thinking about this article too. Skimming it, it seems like there will be some good things to discuss here about genetics. I’ll invite Winston to join us here, and hopefully he will.


It is well known that I am a skeptic of ID. If Dr. Ewert joins us here, nonetheless, I insist we treat him with respect. Especially if you disagree with him, be sure to understand him before rebutting his position.

Cite this exchange with DOI: 10.5281/zenodo.1318762.


(S. Joshua Swamidass) #2

I want to point out that this is one of the best articles I’ve read from ID. The article does not have the combativeness I’ve come (rightly or wrongly) to expect from ID. That is good.

The idea here is novel too. They propose an actual design principle. Look at the abstract:

The hierarchical classification of life has been claimed as compelling evidence for universal common ancestry. However, research has uncovered much data which is not congruent with the hierarchical pattern. Nevertheless, biological data resembles a nested hierarchy sufficiently well to require an explanation. While many defenders of intelligent design dispute common descent, no alternative account of the approximate nested hierarchy pattern has been widely adopted. We present the dependency graph hypothesis as an alternative explanation, based on the technique used by software developers to reuse code among different software projects. This hypothesis postulates that different biological species share modules related by a dependency graph. We evaluate several predictions made by this model about both biological and synthetic data, finding them to be fulfilled.

This is the first actual design model I’ve seen arise since Walter ReMine’s Biotic Message. That is important to acknowledge and respect. Without commenting on whether this is a good explanation or not, the mere attempt to do the work of building and testing a mathematical model is to be commended.

A few initial thoughts:

  1. It will take time to process and think about what is being proposed here. It is possible that this gives a reasonable account of part of the data (nested clades).

  2. It is clear that this study has also entirely ignored much stronger evidence for common descent, which is not nested clades at all (see, for example, Common Descent: Humans and Chimps / Mice and Rats).

  3. Superficially, this seems to parallel some developments in population genetics. I’ll explain later, but for now I’ll just say it would be really interesting to see the results of their model on a negative control: human variation data.

Looking forward to seeing how this develops. For the ID advocates listening, I will do my best to be fair. Even if this only solves part of the problem for them (#1), and the evidence for common descent remains very strong (#2), this would still be an important advance for them. As I’ve said, I think #3 is an important control experiment. If the results for this could be shared, I would be very interested to see them. Scientific work is hard, too; you do not expect to solve all the problems in one swoop. So, for that reason, I’m willing to see how this goes.

@pnelson, I saw you were acknowledged in this paper. This is one of the best contributions I’ve seen come out of ID (whether it holds up or not). I’ll be looking at this closely. I hope that Winston and you can engage with us here. Peace.

More later…


(S. Joshua Swamidass) #7

The part he seems simply wrong on is that this is a better fit than common descent. He has used a simplified model of common descent, and ignored large amounts of data and evidence. I also think his negative controls will fail. Even in regards to nested clades, for example, it seems he ignores neutral mutations. If that is true, this does not even solve the nested clade problem fully.

However, this is also addressing head on one (of many) important patterns in the data, at least in part. He has proposed a mathematical model. He is testing this model. Such efforts are almost unheard of in anti-evolution arguments. I might be able to count the number of comparable attempts on one hand. In that sense, it is not merely a negative effort to poke holes, but a positive effort of model building. It is not full of errors of computing the “probability of X by natural processes” either. So this is a legitimate effort to engage the data, whether or not it pans out.

It is also typical of ID to overstate what is happening, which is why Cornelius’s huffing and puffing is not helping Ewert. If they can correctly frame their results and actually engage with critique, they MIGHT be able to refine this into something workable for PART of the data. Then they have to find a way to deal with the REST of the data.

I will be fair to them. It is up to them to decide if they will engage.


(Jon Garvey) #8

It is an interesting article, and I’m glad you found it so too.

Whilst Ewert is clearly opposing his model to common descent, not only does he acknowledge that nested hierarchies are a special case of his “dependency graphs”, but also that many ID supporters hold to limited common descent.

So in this initial work, it seems reasonable to oppose “good fit” to “bad fit”, even if one wishes in time to allow for common descent amongst significant other mechanisms which tend to be hidden by the concentration on nested clades. He makes clear the provisional nature of this paper.

My impression was that, assuming the validity of the results (and I too note the cautions Ewert himself raised), whilst common design is a good explanation for the re-use of gene families across unrelated clades, that could take the form of a teleological process inherent within nature (where are you, @sygarte?). It might even be accounted for by supposedly ateleological processes like HGT on a much larger scale than usually envisaged.

So it would be wrong to sideline it because of its ID source, because at least the possibility of uncovering major new processes in evolution is there, which could account for what are currently, essentially, anomalies like molecular convergence, disparate cladistic phylogenies and so on.


(S. Joshua Swamidass) #9

I’m giving credit where credit is due. This fits into the category of ReMine’s Biotic Message, in that it is an actual model. It surpasses ReMine in that it is tested on data.

It seems to give one alternate and provisional explanation of some of the nested clade patterns, but not all. Nor does it engage the strong evidence for common descent. There are ways for them to empirically test their theory further.

At this point I’m waiting to see if they will engage. Even though it is a limited advance it might be real.


(Winston Ewert) #10

Hello all, I’m here for the moment.

As a general policy, I don’t engage in comment/forum threads. My problem is that once I start posting I very quickly become this guy:

[image: cartoon]

And that’s just not helpful to anybody. If (probably when) I find myself becoming that guy, I’m going to have to stop participating in this thread, both posting and reading it. That will not mean I’m not interested in well-articulated, carefully reasoned critiques. If you write one, feel free to e-mail it to me at evoinfo@winstonewert.com.

I will also note that BioComplexity would love to publish an official critique to my paper if anyone is interested.

I’m also going to have a zero-tolerance policy towards insults. If anyone starts calling me names, attacking me on irrelevant issues, etc, I’m going to leave. Putting up with abuse just isn’t worth my time.

Josh, thanks for your respect and acknowledgment of me actually trying to build a scientific model. Although I’m sure we disagree on a lot, we probably largely agree as to this being a problematic deficit in the current state of intelligent design. In fact, the respect you are showing is the big reason I’m breaking my general rule and engaging in this thread. Let’s hope I don’t regret that!

However, I must make one point of objection to Josh’s statements so far. He uses the word “ignore” repeatedly to describe factors/data I did not take into account. I’m not ignoring them, I’ve just limited the scope of what is being looked at in this first presentation of the model.

For any serious objection to my hypothesis, I probably won’t dispute it in this thread. If it’s a worthwhile objection, I will want to take the time to think about it, come up with a hypothesis to explain it, test that hypothesis, etc. That will take time. I’m less interested in arguing the point than in understanding the objections to it, so that they can guide where work on this hypothesis needs to go.

Josh states that I ignore much stronger evidence for common descent. I find that different people rank different lines of evidence differently. Some do seem to think nested clades is the strongest evidence. Regardless, it doesn’t matter. There are many lines of evidence I have not attempted to touch. If my model is to succeed, it does need to be developed to explain all those lines of evidence. I appreciate that’s a daunting task. But my current paper only deals with nested hierarchy, and I’m going to ignore, for this thread, any bringing up of other lines of evidence.

To clarify this, common design is an alternative to common descent, not an addition to it. The phrase common design is used by intelligent design proponents to refer to the idea that commonalities in different living things can be explained by a common designer instead of common descent. When I talk about common design not being a “critique of common descent”, my point is that I’m not interested in simply pointing out places where common descent has trouble explaining the data. I’m interested in developing common design as a positive model in its own right.

This is a good suggestion; it’s on my list of potential future related projects.

This is certainly true. In my paper I called it a “tree model” rather than a “common descent model” for exactly this reason. What I’ve shown is that the data fits a dependency graph better than a tree. But this could be explained if there is some mechanism operating in common descent that produces data that looks kinda like modules that are being picked up by my analysis.

My question for you is: what mechanisms do you see as good candidates?

I’m not sure here whether you are referring to:

  1. Lines of evidence for common descent besides nested hierarchy
  2. Data besides gene families that fit a nested hierarchy

In either case, it is true. I fully acknowledge there is a lot more data and evidence that needs to be taken account of. If 2, I’d be curious to know which data you think has the best chance of falsifying my hypothesis.

Here I’d like to know what you are referring to since selection of any sort didn’t enter into the analysis. Perhaps something about genes drifting through sequence space in a neutral fashion?


(Curtis Henderson) #11

Hello @Winston_Ewert, I’m only a part-time reader and rarely a contributor here, but I’d like to extend a welcome (for what it’s worth). I think you will be very pleasantly surprised by the unique nature of the dialogue here. “Peaceful Science” is a description, not an aspiration. Although the crowd isn’t huge, participants here are largely quite respectful of differing views. Moderators here subscribe to very different views on God’s creation, with the intent of allowing ALL viewpoints safe voice. Gracious dialogue isn’t displayed 100% of the time, but pretty darn close to it. I hope we can all learn from your contributions at this site.

Edit - As a P. S., I think we’ve all been guilty of being “that guy” in the cartoon!


(S. Joshua Swamidass) #12

Thank you for coming, @Winston_Ewert. I’ve long been asking @pnelson for models just like yours and @Agauger’s, that are recognizable to me as a computational biologist. I meant it not as a taunt, but as a genuine invitation. I’m looking forward to the conversation and expect to learn through it too.

Great. I endorse that fully. Let me lay down some ground rules for everyone.

  1. We also will have a zero-tolerance policy towards insults directed at you.

  2. Everyone watching this thread, be very cautious on entering this thread. If you do not treat @Winston_Ewert with basic respect, your posts will be deleted. Repeated infractions will get you booted.

  3. @Winston_Ewert if you see any posts that are a problem, please do not leave immediately. Instead, flag them as inappropriate, and let me (or the @moderators) lay down the law. You, however, do not need to engage with anyone rude to you at all. We will prevent this thread from being graffitied by trolls.

  4. If this becomes an ongoing issue, I will move this thread to a protected place where others can watch, but only designated people can contribute.

And everyone watching, if you do not have sensible and nice things to say, just stay out of this thread. If you can’t be kind, you are not welcome to interact with us on this thread. If on the other hand you have legitimate questions or critiques, you should feel free to put a note here. This is an open forum, but it only works when we are kind to one another.

Speaking of which, I apologize ahead of time for the mistakes I make here. When I make them, I will apologize and make it right to the best of my ability. My goal, @Winston_Ewert, is to treat you fairly. Thank you for giving us a chance to think about this together with you.


(S. Joshua Swamidass) #13

I see in your comments that is absolutely correct. Consistent with this, and to your credit, you write:

@Winston_Ewert, you are earning some real trust here. I hear this as honest engagement. You are not misrepresenting your results.

It seems that others (Cornelius) are ignoring this, but we’ll ignore that for this thread. We are talking to you, and I agree you are not ignoring it. You are building the first attempt in years to address the nested clades pattern.

If you are to succeed at replacing common descent as an explanatory model for biology, you are going to need to make several advances. It is entirely okay that you’ve limited yourself to a subset of the problem for now. I’ll stop saying “ignore.”

Sure, but I’m not talking subjectively. I’m talking from a mathematical point of view. We can envision models that can explain parts of the nested clade pattern (as you have) without common descent. Walter ReMine did just this, and I’ve privately wondered about this too. For other patterns, it is much more difficult to imagine a solution.

I’d say, in your defense, that a lot of really bad arguments for evolution have been advanced. Nested clades is one of them, because it is often advanced as if there are no homoplasies. There are homoplasies, and they are predicted by evolutionary science too. We do not expect, from an evolutionary point of view, for nested clades in nature to be perfect nested clades. In this, Walter ReMine was correct. I don’t doubt that you are correct that others think that nested clades is strong evidence, but often the precise way that argument is advanced is actually in scientific error, even before an anti-common descent rebuttal arrives on the scene.

It is depressingly bad for dialogue when fallacious arguments like that are allowed to persist. There is a way that nested clades is evidence for common descent, but not in the way that argument is often explained. I’m sorry for that absurdity in the conversation. I wish I could fix it, but I can only really manage what happens here in this little corner of the internet :frowning:.

I want to give you a fair hearing here, and even get other legitimate and honest scientists to engage with you. Let’s give you a shot. My statements earlier about “ignoring” should all be transposed to caveats (with which you agree) that this only handles part of the problem.

Does that sound good?


@Winston_Ewert, as a scientist in the Church and a Christian in science, I want to publicly promise some things to you.

I will treat you fairly.

I will give ground when you are right.

I will publicly make known that of which you have convinced me.

From here, I’m going to work slowly through the specific points you’ve raised. As you have time, please fill in the details. I want you to get credit for what you do well and right here and now on this thread. Peace.


(S. Joshua Swamidass) #14

Same here. This thread will be open indefinitely though. This is a place where you can figure out what type of experiments might convince skeptics. Even if it takes months to get that analysis done, we will still be interested. Coming to agreement on the experiments ahead of time will help us understand the results when they come. And this also gives you a forum to make negative results known too, which also will build trust in your work.

I can see a few:

  1. Incomplete sorting (which seems to be at play in the human data).
  2. Deletion and large scale genome rearrangement.
  3. The Birthday paradox.
  4. Introgression and/or hybridization after speciation.

Now, there are ways to test to what extent these things are affecting your results. The primate lineages are a good place to focus, because we have the most data there, and they are most salient because they deal with human origins.

Once again, there seem to be ways to test your theory on the data versus these mechanisms. However, it seems that, as written, it is not specified clearly enough to do this yet (at least as a third party reviewer). Though we can, for example, start to apportion specific cases into the different classes I just mentioned based on some tests of the data.

This, also, is where the human data becomes important. I think we both agree that humans are monophyletic. So if your non-tree model fits better than a tree model, that is an important failed control. Without getting into the details yet, we already know that tree models fail on human diversity data. There have been several papers put out demonstrating this. We should get into the weeds on this I am sure, but that seems to indicate that common descent in the real world does not produce a tree, so your tests themselves are not demonstrating that your model is better than common descent. Where am I going wrong in that reasoning? And do you want to see some examples of what I am talking about?

Selection is not really the issue I’m referring to here. That has to be considered carefully too.

Instead, I’m referring to, for example, synonymous mutations in proteins. It seems (though I could be wrong) that you’ve restricted this to gene families. If that is the case, you are glossing over mutations more likely to be neutral. That seems to be a real problem, as this is actually where the nested clade evidence is stronger (at least from my assessment).

If I am right (and I may not be), it seems this is a direct falsification of the conceptual argument being put forward. It seems that “design modules” must be defined to be non-neutral, and you may have an explanation for why they almost fit in a nested tree. However, that does not explain why more neutral mutations (it is a relative term) might fit more tightly in a nested tree than design modules. That seems to be a looming problem for your proposal. I don’t think you can make your case without dealing with this head on. It seems to be a direct falsification of your proposal, unless I’m missing something here.

Thanks for the offer. I’m more interested in real dialogue with you. I want you to have the best model possible, and get credit for retracting it if it’s wrong. If it is wrong, maybe the next idea you come up with would work. Dialogue like this, I’ve found, is a much faster way to help you out.

I’m not looking for a publication out of this, but to really help us all figure this out together. In a way, think about this forum like a “micro publication,” or a “public peer-review.” It is a public thread. If desired, I can even get a DOI for it so you can reference it.


(S. Joshua Swamidass) #16

In service of this goal, @Winston_Ewert, can you make available to us the results you computed for this study? In particular, I would like to see the full dependency graph you computed, in all its gory detail. I’m sure you have the results in text files somewhere. I’d like a copy of them, with a reasonable README file.

This data will not be used for “gotcha” argument silliness, but for legitimate scientists outside your camp to understand what you have done and what your data is telling us.

I also want to reiterate that even if there are some problems ultimately with your model, this still represents a major movement in a good direction for ID. We expect that some models will fail. We get credit for retracting them when they fail. Perhaps next time around, with lessons learned, a better model can be put forward. Maybe after some long hard work, you might even win.

Whatever happens with this model, @Winston_Ewert, this is a big win for you. I hope to see more work like this come from your camp, even though it is not my camp.


(S. Joshua Swamidass) #17

2 posts were merged into an existing topic: Cornelius Hunter: Arguments Against Common Descent


(Sy Garte) #22

My question for @Winston_Ewert is whether any other statistical methods were tried to compare the fit of these databases by the DG model and the tree model. If so, what results were obtained?


(S. Joshua Swamidass) #23

@Winston_Ewert, in my experience @sygarte has been an honest and fair scientist. He is convincible, and is not aggressively opposed to you. It is worth your time to engage with his questions. It also looks like @glipsnort is joining the conversation. He has been honest too, and really should be engaged. Right now there are three legitimate scientists from secular institutions engaging your work with an open mind. Congratulations. You have our attention.

@sygarte, in this specific case can you elaborate why you are asking this question? What will this tell you? What information will it give you? I’m not sure where you are going with this (though I have one guess).


(Sy Garte) #24

I find the very large differences in data fitting probability given in Table 4 surprising, and I was wondering if other methods would show similar large differences between the two models.


(S. Joshua Swamidass) #25

That is not surprising to me at all. The key thing is to look at the proportion of the difference to the total. Let’s just take one line as an example:

Dataset      Dependency Graph    Tree         Difference
UniRef-50    6,193,801           6,308,988    111,823

You are saying the 111,823 is large, but that is only about 1.8% of the unexplained fit (111,823 / 6,308,988). That means the dependency graph only explains about 1.8% more of the data’s patterns than a tree. Not very much. And, as @Winston_Ewert correctly notes, this is not even a real model of common descent.

So why are the numbers so large? Merely because he has a lot of data. Increasing the data will arbitrarily increase the absolute values of the log probability, but the relative values should remain somewhat stable.
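
The arithmetic above can be checked with a short sketch. The two figures are the ones quoted in this post; the per-item values in `total_scores` are hypothetical and only illustrate the scaling argument, under the simplifying assumption that every observation contributes the same average amount to each model's total. This is not the paper's actual calculation.

```python
# Figures as quoted in the post above (UniRef-50 row).
tree_score = 6_308_988
quoted_difference = 111_823


def relative_difference(difference: float, total: float) -> float:
    """Return the difference expressed as a fraction of the total."""
    return difference / total


ratio = relative_difference(quoted_difference, tree_score)
print(f"relative difference: {ratio:.2%}")


def total_scores(n: int, per_item_a: float = 0.98, per_item_b: float = 1.00):
    """Return (absolute gap, relative gap) between two models over n items.

    Assumption for illustration only: each item adds a fixed average
    amount (hypothetical values) to each model's total score.
    """
    score_a = n * per_item_a
    score_b = n * per_item_b
    gap = score_b - score_a
    return gap, gap / score_b


# The absolute gap grows with the amount of data; the relative gap does not.
for n in (1_000, 1_000_000):
    gap, rel = total_scores(n)
    print(n, round(gap, 3), round(rel, 4))
```

Under that assumption, a thousand times more data gives a thousand times larger absolute difference while the proportion stays the same, which is the point being made about Table 4.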


(Sy Garte) #26

Thanks Josh. As a matter of fact, I find the module idea very interesting. And I was pleased to see that Dr. Ewert acknowledged what I see as the main objection to the conclusions. He writes:

“…the dependency graph model has an advantage over common descent in fitting the data because it can postulate modules to explain otherwise inexplicably distributed gene families…This is why we must also take into account the parsimony or complexity of the model.”

I find that to be a very honest statement of the problem in comparing the two models. But I am philosophically not in agreement with the solution using parsimony. My general attitude toward parsimony in biology is negative, since Occam’s Razor is violated at almost every turn in biochemistry and physiology.

What I think would be fascinating would be an incorporation of the gene module approach in an explanation of convergence within the evolutionary framework.


(S. Joshua Swamidass) #27

True, however, @Winston_Ewert penalizes more complex models. We can debate whether he did that correctly or not, but this objection is not really valid without a careful review of his penalization. I think a better question concerns the fit we would get from an undirected graph model versus a dependency graph (which is a directed graph), using the same penalization.

That is an important control to do. If the dependency graph model does not do better than the undirected graph, it would undermine his argument significantly. That, it seems, would be a speed bump for his proposal. As a matter of fact, we already know that human diversity data better fits an undirected graph than a tree. So the real question is whether the dependency graph does better than an undirected graph, not just better than a tree.

It might be resolvable even if it initially fails this control. Perhaps the penalization function would need to be improved. It is, nonetheless, an important control study to run.

(side question to @Winston_Ewert, do the numbers in Table 4 include the penalization factor too?)


(Winston Ewert) #28

Firstly, I see that I need to clarify the nature of the argument I made in the paper.

If my hypothesis is correct, this predicts that a dependency graph ought to be a better fit to the biological data than a tree. This prediction is fulfilled, thus providing some level of evidence that my hypothesis was correct. My argument is not that because the dependency graph model beats the tree model that the dependency graph model is correct. Such an argument would not be valid. Instead, I’m merely arguing that this fulfills a prediction.

The challenge this leaves to common descent is explaining why this prediction worked.

I expected 1 and 4 as obvious candidates (and mentioned them in my paper).

As for 2, I do have deletions in my model. But I’m curious about how you see large scale genome rearrangement playing into this. Since I’m just looking at the presence or absence of gene families, I’d think a rearrangement wouldn’t do anything interesting there. But presumably you know something about that which I don’t.

As for 3, it seems to me that this should be taken care of by the probabilistic analysis. I assume what you are thinking here is that some genes could end up in a similar set of species and thus look a lot like a module, but by pure coincidence. But the Bayesian analysis, and in particular the penalty for the dependency graph, should prevent that from happening.
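
The coincidence at issue can be made concrete with the standard birthday-problem formula. This is a minimal sketch, assuming each gene family independently lands on one of a fixed number of equally likely presence/absence patterns; that uniformity is an unrealistic simplification, and the function name and example numbers are hypothetical rather than taken from the paper.

```python
def collision_probability(n_items: int, n_patterns: int) -> float:
    """Probability that at least two of n_items share a pattern.

    Assumes each item independently picks one of n_patterns uniformly
    at random (the classic birthday-problem setting).
    """
    if n_items > n_patterns:
        return 1.0  # pigeonhole: a shared pattern is guaranteed
    p_all_distinct = 1.0
    for i in range(n_items):
        p_all_distinct *= 1.0 - i / n_patterns
    return 1.0 - p_all_distinct


# Hypothetical example: 10 species give 2**10 possible presence/absence
# patterns; ask how likely it is that some pair among 50 gene families
# matches exactly by chance.
print(collision_probability(50, 2**10))
```

Whether such chance matches would survive the model's complexity penalty is exactly the open question in the exchange above; the sketch only shows how quickly exact coincidences become likely as the number of gene families grows.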

What I’m surprised by is you not bringing up horizontal gene transfer. Do you not think it is a good candidate?

My thinking is that none of these mechanisms seem like good candidates to explain my successful prediction. Obviously, my intuition on this point is worth diddly squat. It has to be backed up by cold hard evidence which I don’t have (yet).

So, yes, dealing with the exact sequence (instead of just gene family) and in particular the more neutral elements of that sequence is really key. If that can’t be done my proposal fails. It remains to be seen whether a model can be developed here.

It should be emphasized that the fact that human variability deviates from the expectations of a tree does not automatically mean that it will fit a dependency graph better. So it’s very much an open question as to what the results will look like.

But more critically, I’m not sure what prediction I would make about the results of the test. Whether or not common descent produces a tree depends on various assumptions you make about the evolutionary process. I suspect that neither a tree nor a dependency graph is the right model in this case.


(Steve Schaffner) #29

Without going into the details of the models, here’s my understanding of the situation: given a set of N species and a set of gene families, any gene family that appears in more than one species and in less than (N-1) species can contribute to the comparison of the two models. If the gene family appears only in a subclade of the full set of species, or is missing only in a subclade, then it is consistent with both models. If its presence or absence is not consistent with a single subclade, then it is improbable under the simple tree model but still probable under the dependency model. (And the dependency model is penalized for its extra degrees of freedom.)
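
The consistency check in that summary can be sketched in a few lines. This is only an illustration of the stated criterion (present only in one subclade, or missing only in one subclade), not the likelihood calculation in the paper; the nested-tuple tree format, the function names, and the toy species tree are all assumptions made for the example.

```python
from typing import FrozenSet, Set, Tuple, Union

# A tree is either a species name or a tuple of subtrees (hypothetical format).
Tree = Union[str, Tuple["Tree", ...]]


def clades(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the set of leaf names under every node of the tree."""
    found: Set[FrozenSet[str]] = set()

    def walk(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def fits_single_subclade(present: Set[str], tree: Tree) -> bool:
    """True if the family is present only in one clade or absent only in one."""
    all_clades = clades(tree)
    all_species = max(all_clades, key=len)  # the root clade holds every leaf
    present_set = frozenset(present)
    absent_set = all_species - present_set
    return present_set in all_clades or absent_set in all_clades


# Toy tree for illustration only.
toy_tree: Tree = ((("human", "chimp"), ("mouse", "rat")), "zebrafish")

print(fits_single_subclade({"human", "chimp"}, toy_tree))  # one subclade
print(fits_single_subclade({"human", "mouse"}, toy_tree))  # scattered
```

A family for which this returns False is the kind that is improbable under the simple tree model and so carries the signal in the comparison.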

Is my summary accurate? (If not, you can probably ignore the rest of my comments.)

If so, then it seems to me that this kind of comparison is critically dependent on the completeness and consistency of the dataset, since missing data in more than one species appears as a signal for one of the models. Comparative genome sequence data typically comes from independent sequencing projects with different degrees of completeness and accuracy, so the issue is particularly severe for this test.

If it were my study, the first thing I would want to do is understand the completeness of the data. For the case of the closely related fish, for example, how many gene families are there in total? How many are missing from a single species? How consistent is this number from species to species? Is there a correlation between the number of singleton missing gene families and the number of shared missing gene families, when assessed across species? (These would be good numbers to post here, by the way.)
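
The counts asked for above could be tabulated along these lines. This is a minimal sketch, assuming the data is available as a mapping from species to the set of gene families found in it; the function names and the toy data are hypothetical and do not come from the paper's datasets.

```python
from math import sqrt
from typing import Dict, List, Set, Tuple


def missing_counts(presence: Dict[str, Set[str]]) -> Dict[str, Tuple[int, int]]:
    """Per species, count (singleton, shared) missing gene families.

    singleton: families missing in this species only.
    shared: families missing here and in at least one other species.
    """
    all_families = set().union(*presence.values())
    counts: Dict[str, Tuple[int, int]] = {}
    for species, families in presence.items():
        singleton = shared = 0
        for family in all_families - families:
            n_missing = sum(family not in other for other in presence.values())
            if n_missing == 1:
                singleton += 1
            else:
                shared += 1
        counts[species] = (singleton, shared)
    return counts


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation; returns NaN if either input has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


# Toy presence/absence data for illustration only.
toy_presence = {
    "fish_a": {"g1", "g2", "g3", "g4"},
    "fish_b": {"g1", "g2", "g3"},
    "fish_c": {"g1", "g4"},
    "fish_d": {"g1", "g2", "g4"},
}

counts = missing_counts(toy_presence)
singletons = [float(c[0]) for c in counts.values()]
shared = [float(c[1]) for c in counts.values()]
print(counts)
print(pearson(singletons, shared))
```

On real data, a strong positive correlation between the two counts across species would suggest that assembly or annotation quality, rather than biology, is driving the shared absences.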

The second thing I would absolutely, positively do – and would insist on an author doing if I were reviewing a study like this – is look at some of the genes that are supposedly missing (in a way not consistent with common descent) and confirm that they really are missing genes and not missing (or different) annotations. What’s in the genome where they should be, based on related species that have them?

Steve Schaffner

