Gpuccio: Functional Information Methodology

Before stars and tornadoes, there are other important questions I would recommend addressing. The question of negative controls is central here.

For example, I'm still not clear whether cancer will work as a negative control for you. If we showed that cancer has an increase in FI, would that demonstrate that FI is not a good way to determine design? Or would you just conclude this is evidence that cancer is designed? If the former, we have something to talk about. If the latter, we may have reached epistemic closure.

1 Like

@gpuccio there are several dimensions to this conversation. Would you like us to start a few threads dedicated to each dimension? That might enable a more orderly conversation, so key points are not dropped. Each thread could proceed at its own pace too?

What do you think?

I am not equating them. In most cases, the correct model is a random walk. However, probabilities for a random search and a random walk from some unrelated state, given a high number of attempts, are similar. Many people are more familiar with the concept of a random search. I suppose that, for frameshift mutations, it’s more a random search. However, I had no intention to equate the two things.

With new FI, the model is a random walk from an unrelated state. FI expresses well the probability of finding the target. It is true that a minimum number of steps is necessary to make the target possible, but here we are discussing billions of steps, and extremely improbable targets. The initial difference is not really relevant.
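To make that concrete, here is a small toy simulation (an illustrative sketch with an arbitrary 12-bit target, not gpuccio's own calculation). For an improbable target, the long-run hit rate of independent random sampling and of a one-bit-flip random walk started from an unrelated state both settle near the uniform probability of the target, which is the sense in which the two models behave similarly at these scales.

```python
import random

L = 12                      # toy string length; real functional targets are vastly larger
TARGET = tuple(random.randrange(2) for _ in range(L))
TRIALS = 200_000            # number of attempts / walk steps

# Random search: each attempt is an independent uniform draw.
search_hits = sum(
    tuple(random.randrange(2) for _ in range(L)) == TARGET
    for _ in range(TRIALS)
)

# Random walk: start from an unrelated random state, flip one random bit per step.
state = [random.randrange(2) for _ in range(L)]
walk_hits = 0
for _ in range(TRIALS):
    state[random.randrange(L)] ^= 1
    if tuple(state) == TARGET:
        walk_hits += 1

print(f"uniform target probability: {1 / 2**L:.2e}")
print(f"random search hit rate:     {search_hits / TRIALS:.2e}")
print(f"random walk hit rate:       {walk_hits / TRIALS:.2e}")
```

Both empirical rates hover around 2^-12 (the walk's hits come in clusters); the point of the toy is only that, for very small targets and very many attempts, the walk offers no systematic advantage over blind sampling.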

Yes. Definitely.

We have something to talk about. Definitely.

That would be too complicated for me. Thank you for the proposal, but I think we should continue this way. If you have patience, I will try to deal with all the relevant issues.

1 Like

Thanks again, @gpuccio. I would like to commend you for broaching this subject, as it stands in contrast to the approaches used by the ID vanguard. I have long been of the opinion that the relevant metric that ID proponents should be measuring is something akin to informational work, which may be like what you describe here. I suspect that there are serious issues with the approaches one may take to estimate this property, but the concept seems to make sense to me.

1 Like

But evolution is not a random walk!! It is guided by many things, including natural selection. If you neglect selection, you are not even modeling the most fundamental basics.

There are other deviations from the random walk model too. Evolution is also not all or none, but demonstrably can accumulate FI gradually in a steady process. I could go on, but you need a better model of evolution.

2 Likes

I will start with you, because at least I do not have to show meteorological abilities that I do not possess! Art's tornadoes will be more of a challenge. :slight_smile:

I am not sure whether the problem here is a big misunderstanding of what FI is. Maybe; let's see.

According to my definition, FI can be measured for any possible function. Any observer is free to define a function as he likes, but the definition must be explicit and include a level to assess the function as present. Then, FI can be measured for the function, and objects can be categorized as expressing that function or not.

An important point is that FI can be generated in non design systems, but only at very low levels. The 500 bit threshold is indeed very high, and it is appropriate to really exclude any possible false positive in the design inference.

I think that I must also mention a couple of criteria that could be important in the following discussion. I understand that I have not clarified them before, but believe me, it's only because the discussion has been too rushed. Those ideas are an integral part of all ID thinking, and you can find long discussions I wrote at UD in the past trying to explain them to other interlocutors.

The first idea should be familiar to you, if you have considered Dembski's explanatory filter. The idea is that, before making a design inference, we should always ascertain that the configurations we observe are not the simple result of known necessity laws. For the moment, I will not go deeper on this point.

The second point is about specification, not only functional specification, but any kind of specification. IOWs, any type of rule that generates a binary partition in the search space, defining the target space.

The rule is simple enough. If we are dealing with pre-specifications, everything can work. IOWs, let's take the simple example of a deck of cards. If I declare a specific sequence of them in advance, and then I shuffle the cards and get that sequence, something strange is happening. A design inference (some trick) is certainly allowed.

But if we are dealing with post-specifications, IOWs we give the rule after the object has come into existence and after we have observed it, then the rule must be independent of the specific configuration of bits observed in the object. Another way to say that is that I cannot use my knowledge of the individual bits observed in the object to build the rule. In that case, I am only using already existing generic information to build a function.

So, going back to our deck of cards, observing a sequence that shows the cards in perfect order is always a strange result, but I cannot say: well, my function is that the cards must have the following order, and then just read the order of a sequence that has already been obtained and observed.
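To put numbers on the card example (an illustrative aside using the usual -log2(target/search space) measure, not part of the original post): a single pre-specified ordering of a 52-card deck is one outcome out of 52!, so hitting it by a fair shuffle carries roughly 226 bits of improbability.

```python
import math

DECK = 52
orderings = math.factorial(DECK)   # size of the search space
target = 1                         # one pre-specified ordering

fi_bits = -math.log2(target / orderings)
print(f"possible orderings: {orderings:.3e}")                  # ~8.07e67
print(f"bits for one pre-specified ordering: {fi_bits:.1f}")   # ~225.6 bits
```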

This seems very trivial, but I want to make it clear because a lot of people are confused about these things.

So, I can take a random sequence of 100 bits and then set it as the electronic key to a safe. Of course, there is nothing surprising in that: the random sequence was just a random sequence, perhaps obtained by tossing a fair coin, and it had no special FI. But when I set it as a key, the functional information in that sequence becomes 100 bits. Of course, it will be almost impossible to get that sequence by a new series of coin tosses.
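A minimal sketch of that arithmetic, assuming the usual FI definition of -log2(target fraction): once a single 100-bit string has been designated as the key, the target space is one sequence out of 2^100.

```python
import math

KEY_BITS = 100
target_sequences = 1              # only the designated key opens the safe
search_space = 2 ** KEY_BITS      # all possible 100-bit strings

fi = -math.log2(target_sequences / search_space)
print(f"FI of the key: {fi:.0f} bits")                       # 100 bits
print(f"chance of re-tossing it: 1 in {search_space:.2e}")   # ~1 in 1.27e30
```

The same sequence had effectively zero FI before it was assigned a function; assigning the function is what creates the binary partition that the measure counts.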

Another way to say these things is that FI is about configurations of configurable switches, each of which can in principle exist in at least two different states, so that the specific configuration is the one that can implement a function. This concept is due to Abel.

OK, let’s go back to your examples. Let’s take the first one, the other will probably be solved automatically.

The configuration of stars in the sky.

OK, it is a complex configuration. So is the configuration of grains of sand on a beach.

So, what is the function?

You have to define a function, and a level at which the function counts as present or absent in the object we are observing.

What is the object? The starry sky? You mean our galaxy, or at least the part we can observe from our planet?

What is the function?

You have to specify all these things.

Frankly, I cannot see any relevant FI in the configuration of stars. Maybe we can define some function for which a few bits could be computed, but no more than that.

So, as it is your example, please clarify it better.

For me, it is rather obvious that none of your examples shows any big value of FI for any possible function. And that includes Art's tornado, which of course I will discuss separately with him.

Looking forward to your input about that.

Thank you! :slight_smile:

I will come to your tornado as soon as possible. In the meantime, the discussion with Swamidass can maybe help clarify some points. I will come back to the discussion later.

You are anticipating too much. Have patience. I am only saying that the correct model for the RV part of the neo-darwinian model is a random walk. For the moment, I have not considered NS or other aspects.

By the way, the random walk model is also valid for neutral drift because, as said, it is part of the RV aspect.

As said, my estimate is a good lower threshold. For design inference, that is fine.

I have discussed that. Why do you doubt that it is a reasonable approximation? It is not confused by neutral evolution; why should it be? The measurement itself is based on the existence of neutral evolution. Why should that generate any confusion?

I have said that my procedure cannot evaluate functional divergence as separate from neutral divergence. Therefore, what I get is a lower threshold. And so? What is the problem? As a lower threshold I declare it, and as a lower threshold I use it in my reasoning. Where is the problem?

Of course, as said, I am not considering NS. Yet. I will. But I have already pointed to two big OPs of mine, one for RV and one for NS. You can find a lot of material there, if you have the time.

However, I will come to that. And to the role of NS in generating FI. Just give me time.

But RV is a random system of events. It must be treated and analyzed as such.

I said the information is the positions of visible stars in the sky. The function of this information, for many thousands of years, was navigation (latitude and direction), time-telling (seasons), and storytelling (constellations). Any change that would impact navigation, time-telling, or storytelling, or create a visual difference, would affect one or all of these things.

There are about 9,000 visible stars in the sky (low estimate). Keeping things like visual acuity in mind (Naked eye - Wikipedia), we can compute the information. However, even if there are just two possible locations in the sky for every star (absurd) and only half the stars are important (absurd), we are still at 4,500 bits of information in the position of stars in the sky. That does not even tell us the region of sky we are looking at (determined by season and latitude), but we can neglect this for now.
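For reference, the arithmetic behind that lower bound is just the number of stars treated as independent switches (a restatement of the numbers quoted above; the two "absurd" simplifications are deliberate underestimates):

```python
import math

visible_stars = 9_000          # low-end count of naked-eye stars
positions_per_star = 2         # absurdly coarse: each star either "here" or "there"
important_fraction = 0.5       # absurdly generous: ignore half the stars

bits = visible_stars * important_fraction * math.log2(positions_per_star)
print(f"lower-bound information in star positions: {bits:.0f} bits")   # 4500
```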

1 Like

I will briefly answer this, and then for the moment I must go.

What makes the current configuration of the stars specifically suited to help navigation, time-telling, or storytelling? If the configuration were a different random configuration, wouldn't it be equally precious for navigation, time-telling, and storytelling?

There is no specific functional information in the configuration we observe. Most other configurations generated by cosmic events would satisfy the same functions you have defined.

Yes, you did say this. We dispute this claim. @sfmatheson, @glipsnort, and I have all explained our objections.

This is the crux (or at least one crux) of the issue. We are convinced that neutral evolution will be mistaken for FI gains. You have not put forward any negative controls to quell our objections. See what has already been said:

That last paragraph is key. Your estimate of FI seems to be, actually, FI + NE (neutral evolution), where NE is expected to be a very large number. So the real FI is some number much lower than what you calculated.
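A toy simulation may make the objection concrete (an illustrative sketch only, not anyone's published procedure; the per-site scoring of ~log2(20) bits per identical residue is an assumption chosen for illustration). Two sequences that diverge from a common ancestor with no selection at all still share many identical sites simply because not enough time has passed, so a method that reads conserved identity as functional constraint will report hundreds of bits of "FI" where the true FI is zero.

```python
import math
import random

AA = "ACDEFGHIKLMNPQRSTVWY"    # the 20 amino acids
random.seed(0)

def neutral_copy(seq, n_subs):
    """Apply n_subs substitutions at random positions, with no selection at all."""
    s = list(seq)
    for _ in range(n_subs):
        s[random.randrange(len(s))] = random.choice(AA)
    return "".join(s)

L = 300                                    # toy protein length
ancestor = "".join(random.choice(AA) for _ in range(L))
branch_a = neutral_copy(ancestor, 150)     # modest divergence on one branch
branch_b = neutral_copy(ancestor, 150)     # and on the other

identical = sum(a == b for a, b in zip(branch_a, branch_b))
# Naive conservation-based score: treat every identical site as if it were
# constrained to a single residue (~log2(20) bits per site).
naive_bits = identical * math.log2(20)

print(f"identical sites: {identical}/{L}")
print(f"naive 'FI' read off conservation: {naive_bits:.0f} bits (true FI here: 0)")
```

With these settings the naive score lands in the hundreds of bits even though no functional constraint was ever applied, which is the sense in which a conservation-based estimate can be FI + NE rather than FI.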

2 Likes

I really don’t understand.

Can you please explain why neutral evolution would be part of the FI I measure? This is a complete mystery to me.

Neutral evolution explains the conservation of sequences? Why? I really don’t understand.

1 Like

A new configuration would not be equally precious for telling the stories we have now. We would have different constellations, and therefore different myths about those constellations. My function is to tell these specific (specified!) stories, not any old stories you might want to come up with in place of them. So no, a new configuration would break the storytelling function.

Remember also, that some configurations (e.g. a regular grid or a repeating pattern) are useless for navigation or time-telling. Very quickly, we would get over 500 bits with a careful treatment, well into the thousands if not millions of bits.

1 Like

But you are doing exactly what I cautioned about. You are defining the function as a consequence of an already observed configuration.

If the configuration were different, we would be telling different stories.

Are you really so confused about the meaning of FI?

The function must be defined independently. You can define the function as “telling stories about the stars”. You cannot define the function as “telling stories about the stars in this specific configuration”.

How can you not understand that this is conceptually wrong?

1 Like

No, I’m just using a particular definition of function, which parallels yours in biology. If you don’t want me to use my definition, I am not sure you can use yours.

It seems you are defining function by the already observed configuration of proteins in extant biology. This does not take into account the configurations that would produce the same high-level functions but that we just don't see, because they are not what happened.

If these are the rules, you are breaking them. Right?

It is subjective how we define function. I chose a definition of function that parallels yours in biology, so I am not sure how you can object to me “breaking the rule” while breaking it yourself with your own definition!

Yes, this highlights the problems with using FI as a way of determining if something is designed or not.

2 Likes

When using conservation it’s possible, even likely, to underestimate the total FI present while overestimating the change in FI. When the mouse genome was sequenced, for example, one of the immediate outcomes was a lower bound on the fraction of the genome that is functional (not quite the same thing as FI, but in the same conceptual neighborhood), a bound of 6% based on the fraction of the genome that is conserved. That was a valid conclusion (with various caveats). If we were to repeat the same analysis across primates to humans, we would get a larger fraction, say 8%. That would also be a valid conclusion. What is not a valid conclusion is that the functional fraction of the genome increased by 2% in primates. Some functional sequence is likely to have changed on the branches between rodents and primates without losing function, while other functional sequence has been lost and gained in each branch.
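A toy set of fractions (purely illustrative numbers, not real genome estimates) makes the turnover point explicit: the two conservation-based lower bounds can differ even when the net functional fraction has not changed at all.

```python
# Toy fractions, chosen only to illustrate the logic above.
ancestral_functional = 0.08   # functional in the rodent/primate ancestor
lost_in_primates     = 0.01   # ancestral function lost on the primate branch
gained_in_primates   = 0.01   # new function gained on the primate branch
turned_over          = 0.01   # still functional, but diverged beyond recognition vs mouse

conserved_vs_mouse    = ancestral_functional - lost_in_primates - turned_over
conserved_in_primates = ancestral_functional - lost_in_primates + gained_in_primates

print(f"conserved vs mouse:        {conserved_vs_mouse:.0%}")      # 6%
print(f"conserved within primates: {conserved_in_primates:.0%}")   # 8%
print(f"net change in functional fraction: "
      f"{gained_in_primates - lost_in_primates:+.0%}")             # +0%
```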

5 Likes

Which is, once again, why it is important to do this analysis with a phylogeny. As @John_Harshman, an expert in this area, comments:

Note that the paper he links to does a great job at the analysis you are attempting to do. It would be valuable to look it over to determine where you agree, disagree, or could learn from it, @gpuccio.

Furthermore, I reiterate my question from earlier:

And:

2 Likes

Gentlemen, I see my name has been mentioned here. I’m a bit overwhelmed by work, so cannot participate in this discussion, but I do want to make a few clarifying points regarding my own thinking on this problem.

First, as I have stated previously, the quality of any estimation depends upon sufficient sampling, and I greatly doubt that we have sufficient data to estimate the FI required for any protein in a specific species. The same probably goes for genus, and maybe even for family and order. My interest has always been, and continues to be, the FI required for the origin of novel protein families, rather than for a protein in a single species or genus.

For this reason, I have focussed, and continue to focus, on protein families that have had the benefit of sufficient sampling produced by thousands of independently evolving populations across a wide range of taxa (preferably across many phyla). In discussions with various people in the field, there seem to be two research questions to answer (which appear to have come up here, though I’ve not read anything other than gpuccio’s one reply included in the email sent to me):

  1. How can we test for sufficient sampling, given common descent and,

  2. How can we test for sufficient sampling, given the possibility of a global maximum fitness in sequence space, clustering our data in only a subsection of sequence space?

For the past 8 months or so I have been working on a method to provide answers to both questions. The method itself was not difficult, but testing that method has been time consuming. I cannot discuss anything related to this here, as I am submitting my findings to a journal for peer review and publication. I will say this, however … I wouldn’t even think about estimating the FI for a protein from the data for an individual species (e.g., human), for obvious reasons when one scans the sampling available at present. But my initial assumptions from several years ago, regarding sampling broadly across phyla or kingdoms, are being verified to produce reasonably accurate estimates of the FI required for the origin of many protein families.

I can’t say any more until the paper passes review and is published. The input and critiques I’ve had thus far from a few non-ID scientists sceptical of ID have been especially valuable, but I cannot widen the circle of discussion any further until after the paper is out. As my former supervisor urged me … “stop leaking your research and focus on submitting more papers for publication.”

As I indicated at the outset, I cannot participate in this discussion, although it does look to be interesting.

4 Likes

Thanks for chiming in @Kirk. Great to see you, even if it is for a moment.

3 Likes