Saturday, December 13, 2008

Genetic programming and stuff

So a few days ago, I was shown this page. It takes an image as input. Basic, but probably somewhat flawed explanation: the output starts out as a number of invisible polygons, and at each step of the algorithm a random attribute of a random polygon is randomly changed. If this looks more like the source image than does the last mutant image, this result is kept and more changes are made to it. This goes on and on and on. The results can be pretty amazing if you let it run for an hour or two.

Now, I suck at Python, and this implementation was written in Javascript, which I have little understanding of, so I can't learn all that much from the actual code. In a book I have on Ruby, Ruby by Example: Concepts and Code, though, I recalled a section on a somewhat similar evolutionary program that is meant to operate on strings. The concept is similar: a target string is given, and a candidate string of the same length is randomly generated.
In my implementation (which I'm currently working on), a mutation is made to a random character in the candidate string, and if that mutant's fitness (mathematically, 1 - (current difference between target and candidate) / (maximum difference)) the target is higher than that of the current candidate, we make that mutant the candidate. Eventually, depending on the length of the string and how my random number generator's feeling today, it will reach the target string. For observation's sake, we'll output the current candidate every X number of steps.

Now, I am not a computer science major nor have I read that much on it, so I'm not that up on terminology and concepts. This is an interesting concept to me, though. It's a decent analog of biological evolution, but it's a bit oversimplified. For one thing, in biological systems detrimental mutations do sometimes propogate. This model instead throws out every mutation that is not beneficial. Another thing is that this model forces mutations to occur at every step, whereas they're far less common in nature.
Also, this scheme only simulates asexual reproduction, in which only one parent is needed for reproduction. Most animals we're familiar with, though, use sexual reproduction, which involves two parents. In the former, you only get one set of DNA to mess around with. With the latter, though, genetic diversity (and thus chances of a better match) increases as you have two separate sets of genes being mixed and matched. This could be implemented in a way similar to that of the first string-evolution program, except we would start with two distinct random strings, and make mutations to each of those. At each stage of reproduction, however, instead of making changes, we force the two candidates to "breed": take the DNA of each, and randomly assign a weight of 1 to an attribute (in this case, letter) of one parrent and a weight of 0 to the corresponding attribute of the other parent. X number of offspring are produced, and a random mutation is made to each one. The two best mutant offspring are then mated and we continue the process.

Now, my idea of simulating sexual reproduction in this way might be a bit faulty, but if it is I'll find that out soon on my own. Also, it's a bit gross (and, if were we dealing with biological organisms with some recessive detrimental mutations, error-prone): since the offspring that mate are from the same parents, it's pretty much cyber-inbreeding.

Anyways, those are some of my ideas. Feel free to think they're stupid or to not understand them, but in the meantime, I'll try and finish up my implementation of the asexual string evolution.

Toodles for now.

3 comments:

  1. Are punctuation and digits going to be included?

    ReplyDelete
  2. No, the strings will be stripped of all non-alphabetic characters and then made lowercase. The current fitness algorithm is based on each letter in the candidate string's distance from each letter in the target string.
    Could probably expand the algorithm to compare ASCII values of all types of characters rather than just lower case letters.

    ReplyDelete

Please be civil. Or if you're going to be uncivil, don't hold back. It's more entertaining that way.