A Simple Genetic Algorithm Written in Ruby

I recently started making my way through An Introduction to Genetic Algorithms by Melanie Mitchell. I’m not too far into it yet but I’ve been pleasantly surprised by how clearly the author explains each concept.

In the first chapter she outlines a simple algorithm that is the basis for most applications of genetic algorithms. There isn’t any code in the book so I decided to implement it on my own in Ruby to understand it better:

POPULATION_SIZE = 24
NUM_BITS = 64
NUM_GENERATIONS = 1000
CROSSOVER_RATE = 0.7
MUTATION_RATE = 0.001

class Chromosome
  attr_accessor :genes

  # With no argument, generate a random bit string of NUM_BITS bits.
  def initialize(genes = "")
    if genes == ""
      self.genes = (1..NUM_BITS).map { rand(2) }.join
    else
      self.genes = genes
    end
  end

  def to_s
    genes.to_s
  end

  def count
    genes.length
  end

  # Fitness is the number of 1 bits in the chromosome.
  def fitness
    genes.count("1")
  end

  # Flip each bit independently with probability MUTATION_RATE.
  def mutate!
    mutated = ""
    0.upto(genes.length - 1).each do |i|
      allele = genes[i, 1]
      if rand <= MUTATION_RATE
        mutated += (allele == "0") ? "1" : "0"
      else
        mutated += allele
      end
    end
    self.genes = mutated
  end

  # Single-point crossover at a random locus; returns two children.
  def &(other)
    locus = rand(genes.length) + 1
    child1 = genes[0, locus] + other.genes[locus, other.genes.length]
    child2 = other.genes[0, locus] + genes[locus, other.genes.length]
    return [
      Chromosome.new(child1),
      Chromosome.new(child2),
    ]
  end
end

class Population
  attr_accessor :chromosomes

  def initialize
    self.chromosomes = Array.new
  end

  def inspect
    chromosomes.join(" ")
  end

  def seed!
    chromosomes = Array.new
    1.upto(POPULATION_SIZE).each do
      chromosomes << Chromosome.new
    end
    self.chromosomes = chromosomes
  end

  def count
    chromosomes.count
  end

  def fitness_values
    chromosomes.collect(&:fitness)
  end

  def total_fitness
    fitness_values.inject { |total, value| total + value }
  end

  def max_fitness
    fitness_values.max
  end

  def average_fitness
    total_fitness.to_f / chromosomes.length.to_f
  end

  # Roulette-wheel selection: the chance of being picked is proportional to fitness.
  def select
    rand_selection = rand(total_fitness)
    total = 0
    chromosomes.each_with_index do |chromosome, index|
      total += chromosome.fitness
      return chromosome if total > rand_selection || index == chromosomes.count - 1
    end
  end
end

population = Population.new
population.seed!

1.upto(NUM_GENERATIONS).each do |generation|
  offspring = Population.new

  while offspring.count < population.count
    parent1 = population.select
    parent2 = population.select

    if rand <= CROSSOVER_RATE
      child1, child2 = parent1 & parent2
    else
      # Copy the parents so mutating a child never alters the current population.
      child1 = Chromosome.new(parent1.genes)
      child2 = Chromosome.new(parent2.genes)
    end

    child1.mutate!
    child2.mutate!

    if POPULATION_SIZE.even?
      offspring.chromosomes << child1 << child2
    else
      offspring.chromosomes << [child1, child2].sample
    end
  end

  puts "Generation #{generation} - Average: #{population.average_fitness.round(2)} - Max: #{population.max_fitness}"
  population = offspring
end

puts "Final population: " + population.inspect

The algorithm as described in the book is quoted below with my comments following each section.

1. Start with a randomly generated population of n l-bit chromosomes (candidate solutions to a problem).

In this code, n is represented by the POPULATION_SIZE constant and l by NUM_BITS. The initial randomly generated population is created using the Population’s seed! method.

2. Calculate the fitness f(x) of each chromosome x in the population.

The fitness of a Chromosome can be calculated by its fitness method. In this example, the fitness is simply the number of 1’s that the chromosome contains.

3. Repeat the following steps until n offspring have been created:

a. Select a pair of parent chromosomes from the current population, the probability of selection being an increasing function of fitness. Selection is done “with replacement,” meaning that the same chromosome can be selected more than once to become a parent.

A chromosome is chosen from the population using the Population’s select method. This method implements fitness-proportionate selection using roulette-wheel sampling, which is conceptually equivalent to giving each individual a slice of a circular roulette wheel equal in area to the individual’s fitness. The roulette wheel is spun, the ball comes to rest on one wedge-shaped slice, and the corresponding individual is selected.
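
To make the roulette wheel a bit more concrete, here is a rough standalone sketch, separate from the Population class. The helper name roulette_select, its optional spin argument, and the fitness values are all made up for illustration. Each individual owns a slice of the running fitness total, and the spin picks whichever slice it lands in:

```ruby
# Hypothetical helper, not part of the gist above.
# fitnesses: array of non-negative numbers, one per individual.
# spin: a number in [0, total); defaults to a random spin.
# Returns the index of the selected individual.
def roulette_select(fitnesses, spin = nil)
  total = fitnesses.sum
  spin ||= rand * total
  running = 0
  fitnesses.each_with_index do |fitness, index|
    running += fitness
    return index if spin < running
  end
  fitnesses.length - 1 # fallback, e.g. when every fitness is zero
end

# Made-up fitness values: slices of 10%, 30%, and 60% of the wheel.
fitnesses = [10, 30, 60]
counts = Hash.new(0)
10_000.times { counts[roulette_select(fitnesses)] += 1 }
puts counts.sort.to_h
```

Over many spins, the counts should land roughly in a 1:3:6 ratio, which is what "the probability of selection being an increasing function of fitness" looks like in practice.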

b. With probability Pc (the “crossover probability” or “crossover rate”), cross over the pair at a randomly chosen point (chosen with uniform probability) to form two offspring. If no crossover takes place, form two offspring that are exact copies of their respective parents.

The crossover probability is defined by CROSSOVER_RATE. The crossover itself is performed by the Chromosome’s & operator (normally bitwise AND), which I overloaded for this class.
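
Single-point crossover is easier to see with a fixed crossover point and two very different parents. This is only a sketch: the crossover helper below is hypothetical and takes the locus as an argument, whereas the & operator above picks it at random.

```ruby
# Hypothetical standalone helper, not part of the gist above.
# locus is the number of leading bits each child keeps from its first parent;
# the remaining bits come from the other parent.
def crossover(a, b, locus)
  [a[0, locus] + b[locus..-1], b[0, locus] + a[locus..-1]]
end

child1, child2 = crossover("00000000", "11111111", 3)
puts child1 # => 00011111
puts child2 # => 11100000
```

When the locus equals the chromosome length, the children come out as plain copies of their parents, which is why the random locus in the real code can occasionally produce no visible crossover.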

c. Mutate the two offspring at each locus with probability Pm (the mutation probability or mutation rate), and place the resulting chromosomes in the new population. If n is odd, one new population member can be discarded at random.

The mutation rate is defined by the MUTATION_RATE constant. The mutation is performed by the Chromosome’s mutate! method.
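
Here is a minimal sketch of per-bit mutation as a pure function, in case it helps to see it outside the class. The mutate helper and its injectable rng parameter are assumptions made for this example (the gist uses the mutate! method and the MUTATION_RATE constant instead):

```ruby
# Hypothetical helper, not part of the gist above.
# Flips each bit of a bit string independently with probability `rate`.
# `rng` is injectable so that a run can be reproduced with a fixed seed.
def mutate(genes, rate, rng = Random.new)
  genes.each_char.map { |bit|
    rng.rand < rate ? (bit == "0" ? "1" : "0") : bit
  }.join
end

puts mutate("0000000000", 0.0) # rate 0: nothing flips
puts mutate("0000000000", 1.0) # rate 1: every bit flips
```

For a sense of scale: with 24 chromosomes of 64 bits each there are 1,536 bits per generation, so a rate of 0.001 works out to about 1.5 flipped bits per generation on average, and a rate of 0.01 to about 15.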

4. Replace the current population with the new population.

5. Go to step 2.

Here are the results for 1,000 generations where the population is 24, the number of bits per chromosome is 64, the mutation rate is 0.001, and the crossover rate is 0.7. Pay attention to the average value over time:

Generation 1 - Average: 32 - Max: 40
Generation 2 - Average: 33 - Max: 38
Generation 3 - Average: 34 - Max: 41
Generation 4 - Average: 34 - Max: 37
Generation 5 - Average: 34 - Max: 41
Generation 6 - Average: 35 - Max: 39
Generation 7 - Average: 34 - Max: 43
Generation 8 - Average: 35 - Max: 45
Generation 9 - Average: 36 - Max: 44
Generation 10 - Average: 37 - Max: 44
...
Generation 990 - Average: 49 - Max: 50
Generation 991 - Average: 49 - Max: 50
Generation 992 - Average: 49 - Max: 51
Generation 993 - Average: 49 - Max: 51
Generation 994 - Average: 49 - Max: 51
Generation 995 - Average: 49 - Max: 50
Generation 996 - Average: 49 - Max: 50
Generation 997 - Average: 48 - Max: 50
Generation 998 - Average: 48 - Max: 50
Generation 999 - Average: 48 - Max: 50
Generation 1000 - Average: 48 - Max: 50

The average fitness of the population increases from 32 initially to 48 by the 1,000th generation, just as you’d expect for a population slowly becoming more fit over time.

If this interests you, I encourage you to try it yourself and experiment with different values for MUTATION_RATE, POPULATION_SIZE, and NUM_BITS.

For example, increasing the mutation rate from 0.001 (1 in 1,000) to 0.01 (1 in 100) has the following effect on the average fitness over time:

Given the impact that the mutation rate has on long term fitness of the population, how can we determine what the optimal mutation rate is? Can the chromosome’s mutation rate change over time? Can different sections of the chromosome evolve to have different mutation rates? What about the number of bits or the population size? These are a few of the things I hope to find out :)

Beware of The Good Idea Fairy

When I was in the Air Force I had one assignment where I worked closely with a lieutenant colonel who was a self-proclaimed “good idea fairy”.

Every time we had a meeting he would interrupt to tell us about some idea he had just thought of that might be able to help with what we were working on.

The conversations would go something like this:

Us: The bus was half an hour late to pick up the technicians today because it had a flat tire and the driver had to go pick up another vehicle.

Him: Every technician should have his own vehicle so that we never have to deal with this again.

Each time he suggested a half thought-out idea, we would have to take time to consider it and the consequences of implementing it.

Us: Does every technician really need his own vehicle?

Us: Do the technicians have a government driver’s license so that they can drive the vehicle?

Us: Does transportation have enough vehicles to lend out to one or more of the technicians?

Us: If we give the technicians a vehicle, other teams may want a vehicle of their own as well. Do we really want to go down this route?

And so on and so on. It’s not that his solution was terrible, but that a little bit of extra consideration would have led him to realize that there were a lot of practical problems with it. Instead, we were constantly being side-tracked by discussions about whether or not his ideas had merit. Most of the time, because they were not well thought-out, they didn’t.

He took pride in being a “good idea fairy” and every now and then he did have a good solution, but most of the time his poorly considered solutions caused us to waste more time than they saved.

When you are holding a meeting, you want to encourage participation from the attendees and if you are constantly shooting down someone’s ideas, it might discourage others from speaking up in the future. Because of his rank and position and because we didn’t want to discourage people from speaking up, I don’t think anyone ever spoke to him about it, but I’ve always thought back on that when an idea pops into my head in the middle of a discussion.

The question then is how do you encourage creative solutions but also cut down on the number of Good Idea Fairy ideas? I don’t know — maybe you can’t. Maybe in order to discover gems you have to sift through a lot of dirt. Maybe that’s just part of the collaborative problem solving process.

What do you think? Is there a way to cut down on the number of Good Idea Fairy ideas while still encouraging people to speak up when they have a potential solution to a problem?

How to Switch Your Cedar Heroku App from Webrick to Thin

I received an email today from Cody K about my use of Webrick in production for Lean Domain Search:

Hey Matt,

Lean Domain Search is really neat – thanks for making it available.

Just wanted to let you know that it’s a really bad idea to run Ruby web apps with Webrick in production – you should be using something like nginx and either Unicorn or Passenger. Those are battle-hardened and production-ready, whereas Webrick’s sole purpose is for development environments, and, as such, is not heavily tested (if at all) for security issues and whatnot.

Just a friendly heads up. :) Thanks again.

Heroku’s Rails 3 docs are also pretty clear on this and I remember reading it during my initial upgrade, but never got around to actually doing it.

Thin is a recommended app server over Webrick (the default for rails).

Switching to Thin is pretty easy:

1. Add gem 'thin' to your Gemfile:

I added it to the production group because I want to keep using Webrick in development for now.

2. Make sure that you run bundle install to update your Gemfile.lock, otherwise you’ll receive a message like this when you push to Heroku:

You are trying to install in deployment mode after changing
your Gemfile. Run `bundle install` elsewhere and add the
updated Gemfile.lock to version control.
You have added to the Gemfile:
* thin

3. Add a Procfile to your root directory to instruct Heroku to use Thin instead of Webrick:

4. Commit and push your updated app to Heroku.
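
The snippets for steps 1 and 3 aren't reproduced above, so here is a rough sketch of what they would look like. The Gemfile fragment below is my assumption based on the description in step 1 (Thin in the production group only), not a copy of the original file:

```ruby
# Gemfile (fragment): an assumed layout, not the original file.
# Thin is only installed in production, so development keeps using Webrick.
group :production do
  gem 'thin'
end
```

For step 3, a Procfile is a single line that tells Heroku how to start the web process. Heroku's documentation from that period suggested something along the lines of `web: bundle exec thin start -R config.ru -e $RACK_ENV -p $PORT`. Treat the exact flags as an assumption and check them against the current Heroku and Thin docs for your app.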

If all went well, you should see something like this in your logs:

2012-01-18T12:44:33+00:00 app[web.1]: >> Using rack adapter
2012-01-18T12:44:33+00:00 app[web.1]: >> Thin web server (v1.3.1 codename Triple Espresso)
2012-01-18T12:44:33+00:00 app[web.1]: >> Listening on 0.0.0.0:38951, CTRL+C to stop
2012-01-18T12:44:33+00:00 app[web.1]: >> Maximum connections set to 1024
2012-01-18T12:44:34+00:00 heroku[web.1]: State changed from starting to up

Too easy.

One thing confused me though: How did Cody know that I was using Webrick? I emailed him and asked.

His response:

I mistakenly typed some bogus characters in at the end of my address bar while I was on your site…something like this: http://www.leandomainsearch.com/search?q=up^jf

Sure enough, Webrick fails loudly when the URL includes a caret:

Touché.