The basic purpose of genetic algorithms (GAs) is optimization. Since optimization problems arise
frequently, this makes GAs quite useful for a great variety of tasks. As in all optimization problem, we are faced with the problem
of maximizing/minimizing an objective function f(x) over a given space X of arbitrary dimension.
A brute force which would consist in examining every possible x in X in order to determine the
element for which f is optimal is clearly infeasible. GAs give a heuristic way of searching
the input space for optimal x that approximates brute force without enumerating all the elements
and therefore bypasses performance issues specific to exhaustive search.
Underlying idea:
To each chromosome xi, we assign a fitness value which is nothing but f(xi). Stronger individuals, that is those chromosomes with fitness values closer to the colony optimal will have greater chance to survive across epochs and to reproduce than weaker individuals which will tend to perish. In other words, the algorithm will tend to keep inputs that are close to the optimal in the set of inputs being considered (the colony) and discard those that under-perform the rest.
The crucial step in the algorithm is reproduction or breeding that occurs once per epoch.
The content of the two chromosomes participating in reproduction are literally merged together
to form a new chromosome that we call (guess what?) a child. This heuristic allows us to possibly combine the best of both individuals to yield a better one (evolution).
Moreover during each epoch, a given fraction of the organisms is allowed to mutate (yes, we also have parthenogenesis). This provides a degree of randomness which allows us to span the whole input
space by generating individuals with partly random genes.
As mentioned earlier, each epoch ends with the deaths of inapt organisms. We eliminate inputs
exhibiting bad performance compared to the overall group. This is based on the assumption that
they're less inclined to give birth to strong individuals since they have poor quality genes and
that therefore we can safely disregard them (selection).
The algorithm:
Every input x in X is an integer vector x=(x1,x2,...,xn). For the sake of simplicity,
assume 0<=xi<=k for i=1...n. In order to implement our genetic algorithm for optimizing
f, we first need to encode each input into a chromosome. We can do it by having log(k)
bits per component and directly encoding the value xi (figure 1). Each bit will be termed gene. Of course, we may choose any other encoding based on our requirements and the problem at hand.
At epoch 0, we generate (possibly randomly) an initial set of inputs in X. Then at each epoch i, we perform fitness evaluation, reproduction, mutation and selection. The algorithm stops when a specified criterion providing an estimate of convergence is reached.
Remarks:
In order to have fine grained control over the computation, we have to adjust
parameters such as colony size, rate of reproduction, rate of death... Obviously
these must be set empirically in order to fine tune the performance of the GA.
We can do even better by incorporating such parameters in each chromosome so that
optimal values may be found by the GA itself. For example, instead of using a global value,
we might have a rate of reproduction or crossover specific to each chromosome,
determining its probability of mating.
A major problem with most optimization technique is hill climbing: The algorithm
gets trapped into a local maximum and stops when the convergence criterion is
reached. Mutation which introduces randomness in the method allows us somehow,
to avoid or at least reduce this undesirable effect.
For further information on genetic algorithms and related issues, see:
We will first select a certain number of inputs, say, x1,x2 ... xn belonging to the input
space X. In the GA terminology, each input is called an organism or chromosome.
The set of chromosomes is designated as a colony or population. Computation is done over epochs. In each epoch the colony will grow and evolve according to specific rules reminiscent of biological evolution.
Now that we've outlined the basic principles, let's examine in further detail how this whole
process is accomplished and how the algorithm works in practise. Let's take the example of optimizing a function f over a space X contained in Nd.
figure 1:
Reproduction:
At each epoch, we choose a set of chromosomes belonging to the
population that will mate. We choose to call such individuals
females. Each female chooses a random set of potential partners
and mates with the fittest of the group (this is another way of
achieving selection). Once two organisms have been chosen
for crossover, we merge their genetic information in order
to create a new organism. The split position is
determined randomly.
Mutation:
A new organism is created by randomly modifying some of its genes.
This can be done right after reproduction on the newly created child
or as a separate process.
Death:
Worst performers among the colony are given a high probability of
dying at the end of each epoch. We may also consider eliminating
old chromosomes (NB: The highest performer is immune from death from
old age).
NovaGenetica
: collection of links and information on GA.