Don’t know if this is what you had in mind, but this is what I’ve got for you.
What are you looking at? I’ve simulated the X-Men Danger Room, where the X-Men train to learn how to work together as a team. They have to defeat all the Sentinels (robots firing blue lasers) using a sequence of actions encoded in an X-Gene. They practice over and over, making random changes to their X-Gene each time.
In the upper left room, Cyclops (shoots red optic blasts) is training by himself. His starting X-Gene already encodes a full solution to this scenario, but Cyclops is a dedicated team leader and so he practices over and over to make sure he can repeat the process. A new X-Gene only replaces the old X-Gene if it is still a full solution. Over time, you may observe some drift in his X-Gene - there is some redundancy in the genetic code, and some phenotype changes are also neutral.
In the upper right room, Cyclops is joined by three teammates: Jean Grey (telekinesis that deactivates Sentinels), Iceman (makes ice blocks that melt), and Jubilee (shoots short-range fireworks). This Cyclops also starts with the same X-Gene encoding a full solution to this room. That’s good, because while his teammates start with the same X-Gene (think gene duplication), they start in locations that make them totally useless; they are bystanders. Collectively, they are subject to the same selection; they collectively update their X-Genes only if the new versions also represent a full solution. Initially, that means Cyclops’ X-Gene is under strong selection while the others’ just drift.
However, it is possible that over time, the teammates will start to chip in. At that point, you may observe more changes in Cyclops’ X-Gene getting fixed. And some of those changes may mean his X-Gene no longer encodes a full solution all on his own. That’s where the bottom (smaller) four rooms come in. They show what happens if each of the four teammates had to use their current X-Genes (from the upper right room) in solo exercises. No selection is applied based on the performance in these rooms; they are just there to monitor whether Cyclops’ X-Gene still encodes a full solution to this scenario.
That page will loop for 250 generations, then replay the last functional solution. It takes 30-40 minutes on my computer. You can:
1. Watch it and see what happens. You may see Cyclops become dependent on his teammates or you may not; this is a random walk that does not select for that outcome.
2. Let it run and check it at the end. The animation will “freeze” on the last frame when it is done. If the little Cyclops in the small, lower left room still has Sentinels in there with him, then the team X-Genes from the upper right have evolved such that they are collectively functional but Cyclops’ X-Gene is not functional by itself.
3. Go here and watch one such result I already found for you: Danger Room
If you choose option 3, you’ll see the end result of a lineage where first Jubilee’s X-Gene mutated so that she started to attack one of the Sentinels, and then Cyclops’ X-Gene mutated so that some of his shots go sideways instead of down at that Sentinel. Neither of those outcomes was specifically selected for; they were obtained with option 2.
If anyone tries 1 or 2 and gets an interesting result, you can copy the Best Genome text and send it to me, and I can make a page that will animate that result.
Beyond my wildest dreams @AndyWalsh. This needs to be developed a bit more, so that it is easier to understand, runs faster, and saves runs.
Absolutely, all of this and more if it is to be at all useful. At this point I consider it a proof of concept, that the system can exhibit the behavior of interest.
And clearly the idiosyncrasies of the X-Men conceit will make it more appealing for some, while introducing unnecessary complexity that needs to be explained for others.
I sent it to my geneticist colleague (who is very much into comics and sci-fi) and he said:
Ok that is awesome!
Yes Jordan thanks to this post I’ve spent the past several hours on this forum. I can already tell this is going to eat up a huge amount of time, but it’s going to be fun.
With some necessary documentation, improvements to the code, and some non-X-Men variants, I suggest those interested develop this into a publication for this journal:
I’m game, although I’ll need to spend more time familiarizing myself with the types of papers they are looking for. Already based on the “Aims and scope,” I gather actual classroom implementation is encouraged/required.
Fair enough. At the core is just a game where one or more players have to clear a room of enemies using projectile attacks, no more complicated than, say, Space Invaders. Players can move up/down or left/right, and attack in any of those directions. Player actions are deterministically encoded as a sequence of these moves and attacks.
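For concreteness, here is a minimal sketch of how a deterministic genome-to-actions encoding like this could work. The 2-bases-per-codon scheme and the action names are my own illustrative assumptions (they happen to match the 128bp-to-64-action ratio mentioned elsewhere in the thread, but the actual mapping may differ):

```python
# Hypothetical decoding of a genome string into game actions.
# Assumes 2 bases per action (so a 128 bp genome yields 64 actions);
# the codon-to-action table below is illustrative only.
ACTIONS = [
    "move_up", "move_down", "move_left", "move_right",
    "attack_up", "attack_down", "attack_left", "attack_right",
]

BASES = "ACGT"

def decode(genome: str) -> list[str]:
    """Map each 2-base codon to one of 8 actions, deterministically."""
    actions = []
    for i in range(0, len(genome) - 1, 2):
        codon = genome[i:i + 2]
        idx = BASES.index(codon[0]) * 4 + BASES.index(codon[1])
        # 16 possible codons map onto 8 actions, so the code is redundant
        actions.append(ACTIONS[idx % len(ACTIONS)])
    return actions

print(decode("ACGTTT"))  # -> ['move_down', 'move_right', 'attack_right']
```

Because twice as many codons exist as actions, some substitutions are synonymous, which is one way the neutral drift described above can arise.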
Apart from the complexity, there are also obviously intellectual property considerations. I am not at all confident that fair use allowances for educational use would cover this scenario, and Disney is pretty notorious for vigorously defending their IP. If this is to go beyond a toy I tinker with for a little while, a rebranding is clearly in order. Something like “Doctor W’s Risk Salon” perhaps.
Here are some results. For all of these experiments, the same enemy configuration is used, the one seen in the web animations. This means scores range from 0 to 12. It takes 3 hits to destroy each of 4 enemies; each hit is worth a point.
First question - before we get to the constructive neutral evolution, is it possible for a one-player solution to adaptively evolve from a random starting point? I ran 64 trials; for each trial, a random starting position in the room is chosen and used throughout the trial, and a single player is given a random sequence of actions. The player then reproduces with random substitutions. If the offspring gets the same score or better than its parent, the offspring gets to reproduce; otherwise the parent reproduces again. So, simple hill-climbing optimization; no crossover or recombination.
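The hill-climbing loop described above can be sketched in a few lines of Python. The per-base mutation probability and the ACGT alphabet are assumptions on my part; the `score` function stands in for running the genome through the game:

```python
import random

def hill_climb(genome, score, mutate_rate=0.01, generations=2048):
    """Simple hill-climbing as described: an offspring with random
    substitutions replaces its parent only if it scores the same or
    better. `score` is assumed to map a genome to an int (0..12 here).
    No crossover or recombination."""
    best = genome
    best_score = score(best)
    history = [best_score]
    for _ in range(generations):
        child = "".join(
            random.choice("ACGT") if random.random() < mutate_rate else b
            for b in best
        )
        child_score = score(child)
        if child_score >= best_score:  # ties accepted -> neutral drift
            best, best_score = child, child_score
        history.append(best_score)
    return best, history
```

Note that accepting ties is what lets neutral substitutions fix, which becomes important in the constructive neutral evolution experiments.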
The chart below shows the distribution of scores from the random starting conditions, and then the max scores observed within each trial over groups of 64 generations. The mean +/- 1sd over all trials is shown with a red dot & bars; the median is the blue dot. Within 256 generations, the max score of 12 can be achieved, but most trials seem “stuck” at 9 even out to 2048 generations.
Is this the sort of problem that is difficult to solve with hill-climbing? No, as it turns out I just needed to give the player more actions. If I double the genome length from 128bp to 256bp (64 actions to 128), then more than half the trials get to the max score by 512 generations.
I also looked to see if the random starting conditions had an effect on the eventual highest score; in other words, did a player have to start with a sequence that scored some points to get to the max of 12? I did not see a clear relationship between starting score and maximum observed score, even with the shorter sequence where presumably the starting conditions matter more.
(Need to take a break. I’ll be back with some results on constructive neutral evolution.)
It seems like you should make everyone Cyclops. The other characters are not as strong. You should also score by avoidance of taking damage. You could also score by time to completion.
The observation that making the genomes bigger helps is important.
Doing a trial that adds a new Sentinel, showing that the evolved system adapts quicker than the initial solution, would be cool.
Right, so constructive neutral evolution. Now we take the same game and the same enemy configuration and we introduce a multiplayer component. We also eliminate the random starting conditions; each trial begins with 4 players in the same positions and with the same starting set of moves. This is to guarantee that we start from the situation where 1 player (‘Cyclops’) can get a perfect score (12) by himself and the other 3 players contribute nothing to the score.
We follow the same hill-climbing algorithm as before, only we are starting at the top of the hill. Offspring with mutations that result in less-than-perfect scores cannot reproduce. To stretch the metaphor a bit, we aren’t so much climbing the hill as exploring the hilltop to see if it is broad and flat like a mesa or pointy like a mountain. (Or maybe jumping from one hilltop to another of equal height, across a crevice. Again, our 3D geography analogy does not map perfectly to a high dimensional discrete space.) Offspring are scored based on 4-player performance in the game, but we also separately run each individual player through the game to see what kind of score their current action sequence gets by themselves. Those individual scores are what we are looking at, but the hill-climbing/selection criteria does not consider them.
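A sketch of that neutral hilltop walk, with the scoring and mutation functions left abstract (the callables and the structure here are my simplification of the setup described, not the actual simulation code):

```python
PERFECT = 12  # 4 enemies x 3 hits each

def neutral_walk(team, team_score, solo_score, mutate, generations=2048):
    """Constructive-neutral-evolution walk as described: mutations are
    kept only if the 4-player team still achieves a perfect score.
    Individual (solo) scores are recorded every generation but never
    selected on. `team` is a list of genome strings; `team_score`,
    `solo_score`, and `mutate` are assumed callables."""
    solo_history = []
    for _ in range(generations):
        child_team = [mutate(g) for g in team]
        if team_score(child_team) == PERFECT:  # selection only on the team
            team = child_team
        solo_history.append([solo_score(g) for g in team])  # monitored, not selected
    return team, solo_history
```

By construction the team never leaves the hilltop, but nothing stops individual contributions from shifting among players, which is exactly the signal being monitored.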
The chart below shows the distribution of individual player scores for the starting conditions (12 for Cyclops, 0 for everyone else), and the distribution of individual player scores in subsequent 64-generation buckets, restricted to generations where the team score was 12. That’s because we know Cyclops’ score can go down if the whole team score goes down, but we want to know what happens when we require the team score to stay at 12.
As you can see, over time some of the other players (Iceman and Jubilee) start to be able to score points, while Cyclops loses the ability to get a perfect score on his own. Jean Grey does not have an ‘attack’ that allows her to score points by herself, but only to help keep her teammates alive so they can score points. We could track the inverse of what we track here (the team score when we remove just one player) to see if Jean Grey ever becomes essential.
Notice that we don’t see any of the other players get to a point where they can get a perfect score alone. This could be due to having only a 48bp (24-action) genome, but we see something similar if we expand to 256bp (128 actions). That expansion does provide an opportunity for a higher maximum, but the central tendency remains pretty balanced over the long run. I don’t know how generalizable that finding is, but I think it’s interesting.
You might also notice that Iceman doesn’t tend to do as well as Jubilee. This is likely because the range of his attack is lower than hers, so he has to spend more actions getting closer to each enemy. We can eliminate the variability in attacks by giving everyone the Cyclops attack; the figure below shows that the overall tendency towards distributed contributions is similar.
Very nice. Shows that irreducible complexity is virtually guaranteed by CNE.
If you start them all with the same genome in the same location, you can simulate duplication then divergence. You should see different copies specializing on different Sentinels.
Can you get this to run fast in a jupyter notebook, without saving to disk?
His attack has unlimited range, which means he doesn’t have to walk as far. Variability in attacks is a wrinkle of trying to simulate the X-Men specifically that is probably unhelpful for most questions. But down the road there might be something interesting to test there. Or something like the range could be a characteristic of a player that also evolves.
So actually, in the solo-player scenario, the fitness was a combination of score (defeating enemies) and health, to see if it was possible to evolve a full-health solution (answer: yes). But then I was trying to simplify my explanations and analysis, so I focused on score.
Yes. Ultimately there might need to be a tradeoff here. Wandering around shooting randomly for long enough will ultimately yield a perfect score. Games generally have time limits, and in biology there are obviously costs to replicating longer genomes.
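One simple way to implement such a tradeoff, purely as an illustration (the linear penalty and its coefficient are arbitrary choices of mine, not part of the simulation):

```python
def fitness_with_cost(score: int, genome_len: int, cost_per_bp: float = 0.01) -> float:
    """Penalize longer genomes so 'wander and shoot forever' stops
    being free. The linear per-base cost is an illustrative assumption,
    not the sim's actual rule."""
    return score - cost_per_bp * genome_len

# A perfect 12-point solution on a 256 bp genome...
print(fitness_with_cost(12, 256))  # 9.44
# ...can lose to an 11-point solution on a 64 bp genome.
print(fitness_with_cost(11, 64))   # 10.36
```

With a cost like this, genome length itself becomes an evolvable tradeoff rather than a free parameter.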
Yes, this is one of my next goals, to see if I can characterize whether multiplayer solutions adapt more quickly to changing conditions. My concern is how to appropriately introduce a cost of additional players. Similar to lengthening the genome, stuff the room with enough players and they’ll complete the room in one step by sheer volume.
The point is that as long as they aren’t shooting each other there is no cost to add more. Friendly fire takes care of the trade-off.
Oh God, please no! I hate that guy.
Actually, thinking about this more, I think you need a real Python library that allows students to construct scenarios, run them, and display results.
I can see this being a really good educational tool, which allows them to ask and answer questions prompted by the teacher, and also (better yet) asked by themselves.
You haven’t explained much about the population parameters. How many individuals per generation? What types of mutation do you allow? Recombination?
@mung see this.
Definitely. If my biology teacher taught us evolution through X-Men and video games I’d listen a lot more attentively.
I can certainly do that. In that scenario, each player would start being able to independently get a perfect score, which as you say is comparable to duplication and divergence. I was originally thinking about recruitment of initially nonfunctional components.
Yes, if everyone has to maintain a certain level of health. If only some players have to be left standing at the end, then you can spam the room with lots of players firing wildly.