Steepest-Ascent Hill-Climbing
Search algorithms have a tendency to be complicated. Genetic algorithms have a lot of theory behind them. Adversarial algorithms[1] have to account for two conflicting agents. Informed search relies heavily on heuristics. Well, there is one algorithm that is quite easy to grasp right off the bat.
Imagine you are at the bottom of a hill with no idea where to go. A decent place to start would be to go up the hill to survey the landscape. Then, restart and find a higher peak, until you find the highest peak, right? Well, that is the entire algorithm.
Let's dig a bit deeper.
An Introduction
What is Steepest-Ascent Hill-Climbing, formally? It's nothing more than an agent searching a search space, trying to find a local optimum. It does so by starting at a random node and trying to go uphill at all times.
The pseudocode is rather simple:
current ← Generate-Initial-Node()
while true:
    neighbors ← Generate-All-Neighbors(current)
    successor ← Highest-Valued-Node(neighbors)
    if Value-At-Node(successor) <= Value-At-Node(current):
        return current
    current ← successor
What is this Value-At-Node (or $f$-value)? It's nothing more than a heuristic value used as some measure of quality for a given node. Some examples are:
- Function Maximization: Use the value of the function, $f(x, y, \ldots, z)$.
- Function Minimization: Same as before, but the reciprocal: $1 / f(x, y, \ldots, z)$.
- Path-Finding: Use the reciprocal of the Manhattan distance.
- Puzzle-Solving: Use some heuristic to determine how well/close the puzzle is solved.
The best part? If the problem instance can have a heuristic value associated with it, and can generate points within the search space, it is a candidate for Steepest-Ascent Hill-Climbing.
Implementing Steepest-Ascent Hill-Climbing
We are going to solve an intuitive problem: function maximization. Given a function $z = f(x, y)$, for what values of $x, y$ will $z$ be the largest? To start, we are going to use a trivial function to maximize:
$$
z = -x^2 - y^2
$$
We see it is nothing more than a (downward-opening) paraboloid. Furthermore, since its domain is infinite, we are going to restrict it to $\{ x, y \in \mathbb{Z} : -100 \leq x, y \leq 100 \}$; therefore, we only have integer values in $[-100, 100]$.

So, let's begin.
The Representation
Because we will be searching throughout a search space, we will need some representation of a state. For our particular problem instance, it's very easy: a point $(x, y)$. We will also need to represent the function $f$ itself, so we create an auxiliary class as well.
class Node:
    """A node in a search space (similar to a point (x, y))."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class Function:
    """A function and its respective bounds."""

    def __init__(self, function, x_bounds, y_bounds):
        ...

    def __call__(self, node):
        ...

    @property
    def x_bounds(self):
        """Get the x bounds of the function.

        Returns:
            tuple<int, int>: The x bounds of the function in the format (min, max).
        """
        ...

    @property
    def y_bounds(self):
        """Get the y bounds of the function.

        Returns:
            tuple<int, int>: The y bounds of the function in the format (min, max).
        """
        ...
That will be all that we need for our purposes.
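To make this concrete, constructing our paraboloid might look like the following (a minimal sketch; it assumes Function simply stores the callable and bounds it is given):

# Hypothetical construction; Function's internals are elided above.
paraboloid = Function(lambda node: -node.x**2 - node.y**2,
                      x_bounds=(-100, 100), y_bounds=(-100, 100))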
Steepest-Ascent Hill-Climbing
As we saw before, there are only four moving pieces that our hill-climbing algorithm has: a way of determining the value at a node, an initial node generator, a neighbor generator, and a way of determining the highest valued neighbor.
Starting with the way of determining the value at a node, it's very intuitive: calculate the value $z = f(x, y)$.
from random import randint, shuffle


class HillClimber:
    """A steepest-ascent hill-climbing algorithm."""

    def __init__(self, function):
        self.function = function

    def _value_at_node(self, node):
        return self.function(node)
The initial node can simply be taken as a random $(x, y)$ in their respective bounds.
def _initial_node(self):
    x = randint(self.function.x_bounds[0], self.function.x_bounds[1])
    y = randint(self.function.y_bounds[0], self.function.y_bounds[1])
    return Node(x, y)
Generating neighbors is actually quite simple as well: because our domain is limited to integers, we can simply look at the four cardinal directions (making sure we won't break the bounds when we do). We also randomize the neighbors to make things more interesting[2].
def _generate_all_neighbors(self, node):
    x, y = node.x, node.y
    nodes = [Node(x, y)]

    if x < self.function.x_bounds[1]:
        nodes.append(Node(x + 1, y))
    if x > self.function.x_bounds[0]:
        nodes.append(Node(x - 1, y))
    if y < self.function.y_bounds[1]:
        nodes.append(Node(x, y + 1))
    if y > self.function.y_bounds[0]:
        nodes.append(Node(x, y - 1))

    shuffle(nodes)
    return nodes
Finally, to get the highest value node, it's fairly straightforward:
def _highest_valued_node(self, neighbors):
    max_point = neighbors[0]
    for point in neighbors[1:]:
        if self._value_at_node(point) > self._value_at_node(max_point):
            max_point = point
    return max_point
Piecing all this together, we get our Steepest-Ascent Hill-Climber:
def climb(self):
    current_node = self._initial_node()
    while True:
        print("Exploring Node({}, {})".format(current_node.x, current_node.y))
        neighbors = self._generate_all_neighbors(current_node)
        successor = self._highest_valued_node(neighbors)
        if self._value_at_node(successor) <= self._value_at_node(current_node):
            return current_node
        current_node = successor
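Running it on the paraboloid (using the hypothetical paraboloid object sketched earlier) is a one-liner:

climber = HillClimber(paraboloid)
best = climber.climb()
print(best.x, best.y)  # settles at (0, 0), the only peak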
Does it work? Exactly as planned.
Exploring Node(5, -88)
...
Exploring Node(5, -67)
...
Exploring Node(5, -47)
...
Exploring Node(5, -27)
...
Exploring Node(5, -4)
Exploring Node(4, -4)
Exploring Node(3, -4)
Exploring Node(3, -3)
Exploring Node(2, -3)
Exploring Node(2, -2)
Exploring Node(1, -2)
Exploring Node(1, -1)
Exploring Node(1, 0)
Exploring Node(0, 0)
However, this was too easy. We had a function with one local optimum. Let's make things interesting.
Optimizing Steepest-Ascent Hill-Climbing
Suppose we keep our previous domain, but we change our function to the following:
$$z = -(x^2 + y^2) + x\,y\,\cos x \sin y$$
This function isn't quite as intuitive to visualize; please reference the figure. Essentially, it's what we had before, but with thousands of local optima as we get further from the center. Our previous hill-climber would absolutely get destroyed by that function.

To alleviate this, we are going to use two optimizations:
- Instead of taking the steepest uphill move, we are going to simply take a random uphill move (known as Stochastic Hill-Climbing).
- When we get stuck, we are going to restart the search (known as Hill-Climbing With Restarts).
Stochastic Hill-Climbing
Updating the algorithm is fairly simple: all the previous mechanics are inherited; just swap out _highest_valued_node for a stochastic version.
from random import choice


class StochasticHillClimber(HillClimber):
    """A stochastic hill-climbing algorithm."""

    def _get_random_uphill_move(self, current_node, neighbors):
        uphill_nodes = []
        for point in neighbors:
            if self._value_at_node(point) > self._value_at_node(current_node):
                uphill_nodes.append(point)
        return current_node if len(uphill_nodes) == 0 else choice(uphill_nodes)

    def climb(self):
        current_node = self._initial_node()
        while True:
            print("Exploring Node({}, {})".format(current_node.x, current_node.y))
            neighbors = self._generate_all_neighbors(current_node)
            successor = self._get_random_uphill_move(current_node, neighbors)
            if self._value_at_node(successor) <= self._value_at_node(current_node):
                return current_node
            current_node = successor
Running this algorithm, we get better results; but we can do better.
Stochastic Hill-Climbing With Restarts
For this, we simply have to restructure the climb function to handle generational effects (like keeping the max-valued node across generations). Not too difficult.
class StochasticHillClimberWithRestarts(StochasticHillClimber):
    """A stochastic hill-climbing algorithm with restarts."""

    def climb(self, number_of_generations):
        max_node = self._initial_node()

        for generation in range(number_of_generations):
            current_node = self._initial_node()
            while True:
                print("Generation {}, Exploring Node({}, {}), Current Max Node({}, {})".format(
                    generation, current_node.x, current_node.y, max_node.x, max_node.y))
                neighbors = self._generate_all_neighbors(current_node)
                successor = self._get_random_uphill_move(current_node, neighbors)

                if self._value_at_node(max_node) < self._value_at_node(current_node):
                    max_node = current_node
                if self._value_at_node(successor) <= self._value_at_node(current_node):
                    break

                current_node = successor

        return max_node
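Running it against the new function might look like this (a sketch; the Function construction mirrors the hypothetical paraboloid setup from before, and the generation count is arbitrary):

from math import cos, sin

bumpy = Function(lambda node: -(node.x**2 + node.y**2) + node.x * node.y * cos(node.x) * sin(node.y),
                 x_bounds=(-100, 100), y_bounds=(-100, 100))
best = StochasticHillClimberWithRestarts(bumpy).climb(number_of_generations=50)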
How did this one fare? Quite a bit better than the rest. Let's take a look at what the exploration process looked like.

Marvelous: some got to the top, many got caught in local optima, and a global optimum was found. A success.
Source Code
The source code can be found here.
- Algorithms used in games, where a player searches for an optimal move against an opponent. ↩︎
- If the neighbors are always generated deterministically, ties may occur when selecting the highest-valued node. We randomize the neighbors so that a random node wins the tie-breaker. ↩︎
Let's propose an Evolutionary Algorithm experiment; say we already have a framework in place (like the Secret Message framework we previously implemented). How difficult would it be to completely switch problem instances?
First, we need another problem instance. Our previous problem instance was pretty straightforward: it had one local optimum. Let's take on a problem with many local optima, such as the 3-SAT problem.
The premise of 3-SAT is simple. From a global pool of variables ($x_1$, $x_2$, $\ldots$, $x_n$), we have a basic clause of three variables or-ed together (signified by $\vee$):
$$
x_p \vee x_q \vee x_r
$$
Then, we and (signified by a $\wedge$) several clauses together:
$$\left(x_p \vee x_q \vee x_r\right) \wedge \left(x_s \vee x_t \vee x_u\right) \wedge \ldots \wedge \left(x_v \vee x_w \vee x_y\right)$$
The only stipulation is that any variable can be negated (signified by a $\neg$). So, supposing we want to negate $x_p$; $x_s$ and $x_u$; and $x_v$, $x_w$, and $x_y$; we can do the following:
$$
\left(\neg x_p \vee x_q \vee x_r\right) \wedge \left(\neg x_s \vee x_t \vee \neg x_u\right) \wedge \ldots \wedge \left(\neg x_v \vee \neg x_w \vee \neg x_y\right)
$$
Now, we simply have to assign all the variables such that every clause evaluates to true. It may sound simple, but it belongs to one of the hardest classes of problems in computer science (it is NP-complete). There is no known efficient algorithm guaranteed to produce the right answer.
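To make the premise concrete, here is a minimal sketch of checking an assignment against a formula (the representation is my own for illustration, not the SAT class used later):

# Each clause is three (variable, wanted_value) pairs; a negated variable wants False.
# This encodes (x1 ∨ ¬x2 ∨ x3) ∧ (¬x1 ∨ x2 ∨ x3).
clauses = [
    [(1, True), (2, False), (3, True)],
    [(1, False), (2, True), (3, True)],
]
assignment = {1: True, 2: True, 3: False}

def satisfied(clauses, assignment):
    # A formula is satisfied when every clause has at least one matching literal.
    return all(any(assignment[var] == wanted for var, wanted in clause)
               for clause in clauses)

print(satisfied(clauses, assignment))  # True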
For a more visual approach, please reference the figure below. The goal is to make every inner node green by having at least one connected outer node be green. Note the green nodes have to account for negation as well.

This sounds like a good problem candidate for an Evolutionary Algorithm[1].
The SAT Problem
We can skip over the problem-specific parts to focus more on the Evolutionary Algorithm aspect. Suppose we already have a well-defined SAT class that takes care of SAT-specific properties and methods, like so:
class SAT:
    def __init__(self, filename):
        """Create a SAT object that is read in from a CNF file."""
        ...

    @property
    def variables(self):
        """Get *all* the variables."""
        ...

    @property
    def total_clauses(self):
        """Get the total number of clauses."""
        ...

    @property
    def clauses_satisfied(self):
        """Get the number of satisfied clauses."""
        ...

    def __getitem__(self, key):
        """Get a particular variable (key)."""
        ...

    def __setitem__(self, key, value):
        """Set a variable (key) to value (True/False)."""
        ...
From this, we can create a new genotype for our Individual.
The New Genotype
The genotype structure was very similar to what we had before:
- The genotype is the SAT problem we defined above.
- Fitness is defined as the percentage of total satisfied clauses.
- Mutation is uniform: choose a percentage $p$ of alleles and flip their value.
- Recombination is uniform: randomly assemble values from both parents.
Looking at the refactoring, not much has changed.

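As a rough sketch, the refactored Individual might look like the following (the method names and the CNF filename are assumptions based on the Secret Message framework, not the actual refactor; recombination is omitted for brevity):

from copy import deepcopy
from random import choice, sample

class Individual:
    problem = SAT("instance.cnf")  # hypothetical CNF file

    def __init__(self):
        self.genotype = deepcopy(Individual.problem)
        for variable in self.genotype.variables:
            self.genotype[variable] = choice([True, False])

    @property
    def fitness(self):
        # Percentage of clauses satisfied by this assignment.
        return 100 * self.genotype.clauses_satisfied / self.genotype.total_clauses

    @staticmethod
    def mutate(individual, rate):
        # Uniform mutation: flip a percentage of the variables.
        variables = list(individual.genotype.variables)
        for variable in sample(variables, int(rate * len(variables))):
            individual.genotype[variable] = not individual.genotype[variable]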
The New EA Framework
Now that we have updated our Individual, the next thing to update would be the Evolutionary Algorithm framework, including:
- The Population
- The EA Itself
Except, we don't have to.
That's the beauty of Evolutionary Algorithms: they are incredibly adaptable. By swapping out the Individual, the rest of the evolutionary algorithm still works.
For our SAT problem, we updated some parameters to make the algorithm more efficient:
- The mutation rate was reduced to 5%.
- The tournament size was reduced to 15 individuals (out of $\mu = 100$).
The Result
So, let's try our Evolutionary Algorithm. Taking a SAT instance with 75 variables and 150 clauses, the search space is
$$
2^{75} \approx 3.77 \times 10^{22}
$$
Great, so thousands of times the number of grains of sand on Earth. Easy. So, can our EA do it?
After roughly 100 iterations, yes. See the visualization below.

Marvelous, our EA managed to find a solution after only 100 iterations in a giant search space. And all we had to do was swap out one class.
The Source Code
All source code can be found here.
- In reality, it's not a great candidate for an evolutionary algorithm. The gradient is sometimes murky, because flipping one variable's value can drastically increase or decrease the fitness. Also, there are several great heuristics for solving the SAT problem. ↩︎
This is an extension of a blog post I wrote a couple months ago. You can find it here.
One of the big takeaways in my introduction to Evolutionary Algorithms was the sheer number of numerical parameters:
- $\mu$ and $\lambda$
- Mutation Rate
- $k$ in k-Tournament Selection
Not only this, but also the sheer number of design choices:
- The genotype
- The mutator operator
- The survivor selection algorithm
And one might be wondering: what is the best choice for parameter $x$ or operator $y$? Let's look at an example.
Recall the sample problem from the previous discussion, a deciphering program. The premise of the problem is such:
There is a string of characters (without spaces) hidden away that, after set, is inaccessible.
There are two ways to retrieve data about the hidden message:
1. Get the length of the string.
2. Given a string, the problem will output how many characters match between the two strings.
Disregarding the other technical details, let us focus on the survivor selection. We used $k$-tournament selection (with $k = 50$). But let's run a little experiment:
Run the Evolutionary Algorithm with $k$ ranging from $5$ (basically the bare minimum) to $100$ ($\mu$, the population size), and see how fast the algorithm terminates. Do this 1,000 times for each $k$ to get accurate results.
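A hypothetical driver for the experiment (assuming the EA from the previous post is modified to accept the tournament size k, and that search() is changed to return the generation count):

averages = {}
for k in range(5, 101, 5):
    runs = [EA(μ=100, λ=15, k=k).search() for _ in range(1000)]
    averages[k] = sum(runs) / len(runs)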
The result?

This makes sense. Our problem has one local optimum: the actual solution. So we do not need a lot of genetic diversity; we need aggressive selective pressure[1] to reach the top quickly.
As $k$ gets closer to $\mu$, the average termination time decreases. What does this tell us? We picked the wrong survivor selection algorithm.
With $k = \mu$, we no longer have $k$-tournament selection; we have truncation selection (where only the most fit individuals survive). And that's the interesting part about Evolutionary Algorithms: there are no objectively best parameters.
How do we alleviate this? Trial and error. There is no telling when one parameter is going to perform better than another.
After a couple of trial runs, with objectives in mind (average terminating fitness, best terminating fitness, time to termination), the answer might surprise (and delight) you.
- How elitist the survivor selection algorithm is, picking the strongest individuals more often. ↩︎
Get to work. Grab a cup of coffee, some water. Possibly breakfast.
Sit down at my desk. Log into my computer, look at i3wm.
Launch a new terminal session. ranger. Enter. Navigate to the project. nvim. Enter. space f. Use fzf to open the necessary file.
This is a daily workflow for me. Before I start typing code, I use about four tools to get to it. Notice I haven't mentioned touching the mouse once, because I haven't touched the mouse.
Throughout my academic and professional career as a software engineer, I spent a lot of time using tools. Trying new tools. Searching for tools. Customizing tools[1].
Although this might be a commendable feat, it is often met with skepticism. A lot of skepticism.
I’ve had at least a hundred conversations with people about tools, about why they advocate against using such tools. These conversations were with other students, senior software engineers, and tenured professors; and generally, almost all of them fell into three distinct buckets.
It’s a waste of time.
This is by far the worst yet best counterargument against investing so heavily in tools, and the reasoning is in the trailing three words.
Waste. Better tools will objectively help one write more code. If there is a modification that needs to be made at the end of the file, it takes one keystroke to jump to that position in Vim. Picture doing this hundreds of times a day.
Of time. There is a time commitment, a big one. And there is a learning curve, and it is steep. So, one would have to justify the reasons for learning it. And I’ve done so below.
What I have works just fine.
I’m deeply skeptical of the “If it’s not broken, don’t fix it” mentality. There’s always room for improvement, and there’s nothing I love to see more than improvement.
It’s why we buy new phones. It’s why we like nice, new cars. It’s why we, as a civilization, love progress.
Even if it’s a marginal increase, say 5% more code a week: with 100 lines of code written a week, that comes to about 250 additional lines of code a year. A marginal increase, sure, but a welcome one.
It doesn’t really make you a better programmer.
I generally have three core reasons against this, and they are as follows.
It’ll help write more code, faster. It is objectively faster to type more code if your hands never leave the keyboard. It’s faster to navigate a filesystem without having to focus on where a cursor is. It’s ridiculously faster to open a file by typing out parts of the filename as opposed to navigating to it.
It’s healthier. Legitimately. Minimizing hand travel will put significantly less strain on your hands, wrists, and arms.
It’ll make you a better programmer. As programmers, we have to learn things all the time. New programming languages, new paradigms, new company software. Learning a new tool keeps the mind sharp by reinforcing how to learn something. Plus, the skill is fluid. Learning to navigate in vim will help navigate in ranger. Learning regular expressions can be used in fuzzy searches. The skills are transferable.
So, where might one start? Again, there are three areas where I see improving your workflow having the most benefit.
Your operating system. Whether it’s Windows, MacOS, or Linux, you have to use an operating system. Get good at it. Learn the paradigms of the system. Learn how the file explorer works in Windows. Learn advanced features in MacOS, like a Spotlight alternative or simple hot corners. Or learn how to set up your WiFi drivers in Linux.
Your text editor. Choose one that’s portable. I use NeoVim. Many people have success with Emacs. Some people prefer a visual editor like Sublime Text or Atom. Learn their workflows. Learn the shortcuts of Emacs. Learn how to edit text in Normal Mode using Vim. Install packages in Sublime or Atom. You spend most of your time here anyway; might as well get familiar with it.
The intermediary layer between your operating system and your text editor. By this, I usually mean the shell, like Bash or ZSH. Learn how to grep or navigate around. Install packages to make tasks easier. It’s well worth it.
So, yes, I am a strong advocate of using advanced tools. Established tools. Great tools. And I hope I’ve convinced you to do the same.
If you've ever written a shell script, you might top your script with your interpreter of choice (e.g., #!/bin/python or #!/bin/bash). What you might not have known is that rm is also a valid interpreter. Take the following script.
#!/bin/rm
echo "Hello, world"
When executed, the kernel hands the script's path to /bin/rm, so the script simply deletes itself. Useful? No. Hilarious? Yes.
We are well aware of Euler’s famous number,
$$
e = 2.71828182845\ldots
$$
There are also quite a few ways of deriving Euler’s Number. There’s the Taylor expansion method:
$$
e = \sum _{k=0} ^{\infty} \frac{1}{k!}
$$
There is also the classical limit form:
$$
e = \lim_{n \rightarrow \infty} \left( 1 + \frac{1}{n} \right)^n
$$
Then there is another. Let $R$ be a random number generated in $[0, 1]$. Then $e$ is the average number of $R$s it takes for their sum to exceed $1$.
With more rigor, for uniform $(0, 1)$ independent random variables $R_1$, $R_2$, $\ldots$, $R_n$,
$$
N = \min \left\{n: \sum_{k=1}^n R_k > 1 \right\}
$$
where
$$e = \text{Average}(N)$$
The proof can be found here, but it's pretty math-heavy. An easier route is to write a program and verify it for large enough $n$.
n | Sum Solution | Limit Solution | Random Uniform Variable |
---|---|---|---|
1 | 2 | 2 | 2 |
10 | 2.7182818011 | 2.5937424601 | 2.5 |
100 | 2.7182818284 | 2.7048138294 | 2.69 |
1000 | 2.7182818284 | 2.7169239322 | 2.717 |
10000 | | 2.7181459268 | 2.7242 |
100000 | | 2.7182682371 | 2.71643 |
1000000 | | 2.7182804690 | 2.71961 |
10000000 | | 2.7182816941 | 2.7182017 |
100000000 | | 2.7182817983 | 2.71818689 |
1000000000 | | 2.7182820520 | 2.718250315 |

(The Sum Solution column stops at $n = 1000$, presumably because the recursive implementation runs up against Python's default recursion limit beyond that; it has also long since converged.)
Source Code
import math
import random
from decimal import Decimal


def e_sum(upper_bound):
    """Approximate e with the Taylor expansion, summing 1/k! for k = 0..upper_bound."""
    if upper_bound < 1:
        return Decimal(1)  # the k = 0 term: 1/0! = 1
    return Decimal(1.0) / Decimal(math.factorial(upper_bound)) + Decimal(e_sum(upper_bound - 1))


def e_limit(n):
    """Approximate e with the classical limit (1 + 1/n)^n."""
    return Decimal((1 + 1.0 / float(n))**n)


def find_greater_than_one(value=0, attempts=0):
    """Count how many uniform(0, 1) draws it takes for the running sum to exceed 1."""
    if value <= 1:
        return find_greater_than_one(value + random.uniform(0, 1), attempts + 1)
    return attempts
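To turn find_greater_than_one into an estimate of $e$, average it over many trials (a minimal sketch; the trial count is arbitrary):

trials = 100_000
estimate = sum(find_greater_than_one() for _ in range(trials)) / trials
print(estimate)  # ≈ 2.718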
Arguably the first (and most successful) problem solver we know of is Evolution. Humans (along with other species) all share a common problem: becoming the best at surviving our environment.
Just as Darwin's finches evolved their beaks to survive in different parts of the Galápagos Islands, we, too, evolved to survive in different parts of the world. And we can program a computer to do the same.
Evolution inspired a whole generation of problem solving, commonly known as Evolutionary Algorithms (EAs). EAs have been known to solve (or approximate solutions to) borderline-unsolvable problems. And, just as the mechanics of evolution are not that difficult, neither are the mechanics of EAs.
Today, we will build an Evolutionary Algorithm from the ground-up.
An Introduction
Before we proceed with implementation or an in-depth discussion, first we wish to tackle three questions: what is an Evolutionary Algorithm, what does an Evolutionary Algorithm look like, and what problems can Evolutionary Algorithms solve?
What Is An Evolutionary Algorithm?
An Evolutionary Algorithm is a generic, population-based optimization algorithm that generates solutions via biological operators. That is quite a mouthful, so let’s break it up.
Population-based. All Evolutionary Algorithms start by creating a population of random individuals. These individuals are just like an individual in nature: there is a genotype (the genes that make up an individual) and a phenotype (the result of the genotype interacting with the environment). In EAs, they would be defined as follows:
- Genotype: The representation of the solution.
- Phenotype: The solution itself.
Because it’s a little confusing to think of it this way, it’s often better to think about it in terms of a genotype space and a phenotype space.
- Genotype Space: The space of all possible combinations of genes.
- Phenotype Space: The space of all possible solutions.
Don’t worry if this doesn’t make sense, we’ll touch on it later.
Optimization Algorithm. Evolution is an optimization algorithm. Given an environment, it will try to optimize an individual for that environment with some fitness metric. Evolutionary Algorithms operate the same way.
Given an individual, it will try to optimize it. We do not use a literal environment, but still use a fitness metric. The fitness metric is simply a function that takes in the genotype of the individual, and outputs a value that is proportional to how good a solution is.
Because fitness metrics are proportional to how good a solution is, this implies a very important condition for our phenotype space: it must have a gradient.
Biological operators. Evolutionary Algorithms are inspired by biology and evolution. Just as biology has operators to generate new individuals, so do Evolutionary Algorithms. More on that later.
Generic. Evolutionary Algorithms are generic. Once a framework has been introduced, it can be reused with any Individual (provided it has the appropriate crossover and mutator operators).
What Does An Evolutionary Algorithm Look Like?
The pseudocode for an Evolutionary Algorithm is what we might expect of evolution: generate a random population, generate offspring, and let survival of the fittest do its job. And so it does:
BEGIN
    INITIALISE population with random solutions
    WHILE ( TERMINATION CONDITION is not satisfied ) DO
        SELECT parents
        RECOMBINE pairs of parents
        MUTATE the resulting offspring
        EVALUATE new candidates
        SELECT individuals for the next generation
    OD
END
What Problems Can Evolutionary Algorithms Solve?
Evolutionary Algorithms can solve any problem that has a genotype that can fit within our framework:
- The genotype can have a crossover operator.
- The genotype can have a mutator operator.
- The genotype can map to a definite fitness function.
Again, the fitness should be proportional to how good a solution is. If the fitness function $f(x)$ is bounded by $0 \leq f(x) \leq 100$, $0$ should be the worst solution (or no solution), and $100$ should be the best solution (or vice versa, for inverted fitnesses).
Implementing An Evolutionary Algorithm
We will be implementing a special class of Evolutionary Algorithm, referred to as a (μ + λ) Evolution Strategy. The name is not important, but μ and λ will be; we will come back to them shortly.
For the purposes of our discussion, we are going to consider a sample problem: a deciphering program. The premise of the problem is such:
- There is a string of characters (without spaces) hidden away that, after set, is inaccessible.
- There are two ways to retrieve data about the hidden message:
- Get the length of the string.
- Given a string, the problem will output how many characters match within the two strings.
The secret message would look as follows:
class SecretMessage:
def __init__(self, message):
"""Initialize a Secret Message object.
:message (str): The secret message to hide.
"""
self.__message = message
def letters_match(self, message):
"""Determine how many characters match the secret message.
Note:
The message length and the secret message length must be the same length (accessed via the length property).
:message (str): The message to compare the secret message to.
:returns (int): The number of characters matched.
"""
return sum(self.__message[char] == message[char] for char in range(len(message)))
@property
def length(self):
"""Get the length of the secret message.
:returns (int): The length of the secret message.
"""
return len(self.__message)
Not too complicated.
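For example, using the string we will search for later in the post:

secret = SecretMessage("FreneticArray")
print(secret.length)                          # 13
print(secret.letters_match("FreneticArrax"))  # 12 --- only the last character differs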
An Individual
In Evolutionary Algorithms, an individual is simply a candidate solution. An individual has a genotype (the representation) and operators (Crossover, Mutation, and Fitness) that act on the genotype. We will discuss them more extensively below.
The Genotype
As aforementioned, a genotype is the representation of an individual. Just as DNA does for humans, knowing the genotype gives you all the information one might need to determine the characteristics of an individual.
Because a genotype must be acted upon by crossover and mutation operators, there are a few common choices for genotypes:
- Vectors[1]. A vector is common because crossover is trivial: take elements from the two genotypes to create a new individual. Mutation is also trivial: pick random elements in the vector, and mutate them. Often, for a complex enough individual, a vector of bits is used[2].
- Matrices. Same as a vector, but with multiple dimensions.
- Floating-Point or Real Numbers. This one is tricky, but commonly used. There are a plethora of ways to recombine two numbers: the average of the two numbers, bit manipulation, binary-encoding crossover. The same can be said for mutation: adding a random value to the number, bit manipulation with a random value, or bit flipping in a binary encoding. It should be noted that some of these introduce biases, and one should account for them.
- Trees. Some problems can be easily broken down into trees (as an entire programming language can be broken down into a parse tree). Crossover is trivial: swap a random subtree with another. Mutation, however, is often not used; this is because the crossover itself acts as a mutation operator.
Next, our genotype must be initialized to some random values. Our initial population is seeded with said randomly-generated individuals, and with a good distribution, they will cover a large portion of the genotype space.
Keeping all this in mind, let us think about the representation of our problem. A string is nothing more than a vector of characters, so per the first bullet point, we get our operators pretty easily.
Here’s what our genotype would look like:
from random import choice
from string import ascii_letters


class Individual:
    message = SecretMessage("")

    def __init__(self):
        """Initialize an Individual object.

        Note:
            Individual.message should be initialized first.
        """
        length = Individual.message.length
        characters = [choice(ascii_letters) for _ in range(length)]
        self.genotype = "".join(characters)
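Seeding an individual then looks like this (the genotype is random, of course):

Individual.message = SecretMessage("FreneticArray")
individual = Individual()
print(individual.genotype)  # e.g., "qWJvZpLcRmXaT"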
Crossover, Mutation, and Fitness
As aforementioned, to fit within an Evolutionary Algorithm framework, a genotype must be created with crossover, mutator, and fitness operators. Although we have covered said operators, we will formalize them here.
- Crossover. A crossover operator simply takes in two genotypes and produces a genotype that is a mixture of the two. The crossover can be uniform (random elements from both genotypes are taken), 1-point (pick a pivot position; the left half comes from one genotype, the right half from the other), or $N$-point (same as 1-point but with multiple pivot positions).
- Mutator. A mutator operator takes random values within the genotype and changes them to random values. There is a mutation rate associated with all genotypes; we call it $p$. $p$ is bounded such that $0 \leq p \leq 1.0$, where $p$ is the percentage of the genotype that gets mutated. Be careful to limit this value, however; too high a $p$ results in little more than a random search.
- Fitness. The fitness operator simply takes in a genotype and outputs a numerical value proportional to how good the candidate solution is. Fitness has no limits, and can be inverted (i.e., a smaller fitness is better).
For the purposes of our program, we are going to have the following operators: crossover will be uniform, mutation will be a fixed number of mutating characters, and fitness will be a percentage of the characters matched.
class Individual:
    ...

    @property
    def fitness(self):
        """Get the fitness of an individual, as the percentage of characters in the
        genotype that match the actual message.

        :returns (float): The fitness of an individual.
        """
        return 100 * Individual.message.letters_match(self.genotype) / Individual.message.length
    @staticmethod
    def mutate(individual, rate):
        """Mutation operator --- mutate an individual at a specified rate.

        This is done via uniform random mutation: select random genes and swap them.

        Note:
            rate should be a floating point number (0.0 < rate < 1.0).

        :individual (Individual): The individual to mutate.
        :rate (float): The rate at which to mutate the individual's genotype.
        """
        number_of_characters_to_mutate = int(rate * individual.message.length)
        genotype_list = list(individual.genotype)  # strings are immutable, so use a list

        for _ in range(number_of_characters_to_mutate):
            character_to_mutate = choice(range(individual.message.length))
            genotype_list[character_to_mutate] = choice(ascii_letters)

        individual.genotype = "".join(genotype_list)
    @staticmethod
    def recombine(parent_one, parent_two):
        """Recombination operator --- combine two individuals to generate an offspring.

        :parent_one (Individual): The first parent.
        :parent_two (Individual): The second parent.
        :returns (Individual): The combination of the two parents (the offspring).
        """
        new_genotype = ""
        for gene_one, gene_two in zip(parent_one.genotype, parent_two.genotype):
            gene = choice([gene_one, gene_two])
            new_genotype += gene

        individual = Individual()
        individual.genotype = new_genotype
        return individual
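Putting the operators together on two fresh individuals:

parent_one, parent_two = Individual(), Individual()
child = Individual.recombine(parent_one, parent_two)
Individual.mutate(child, rate=0.15)
print(child.genotype, child.fitness)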
A Population
Now that we have an Individual, we must create a Population. The Population holds the candidate solutions, creates new offspring, and determines which individuals propagate into further generations.
μ And λ
Remember when we mentioned that μ and λ would be important in our Evolutionary Algorithm? Here is where they come into play. μ and λ are defined as follows:
- μ: The population size.
- λ: The number of offspring to create.
Although these are simple constants, they can have a drastic impact on an Evolutionary Algorithm. For example, a population of size 1,000 might find a solution in far fewer generations than one of size 100, but each generation will take longer to process. It has been experimentally shown that a good proportion between the two is:
$$
\mu / \lambda \approx 6
$$
However, this is tested for a large class of problems, and a particular Evolutionary Algorithm could benefit from having different proportions.
For our purposes, we will pick $μ = 100$ and $λ = 15$, a proportion just a little over 6.
class Population:
    def __init__(self, μ, λ):
        """Initialize a population of individuals.

        :μ (int): The population size.
        :λ (int): The offspring size.
        """
        self.μ, self.λ = μ, λ
        self.individuals = [Individual() for _ in range(self.μ)]

    @property
    def fittest(self):
        """Get the fittest individual in the population."""
        return max(self.individuals, key=lambda individual: individual.fitness)
Generating Offspring
Generating offspring is trivial with the framework we imposed on an Individual: pick two random parents, perform a crossover between the two to create a child, mutate said child, and introduce the child back into the population pool.
In code, it would look as follows:
class Population:
    ...

    @staticmethod
    def random_parents(population):
        """Get two random parents from a population.

        :returns (Individual, Individual): Two random parents.
        """
        split = choice(range(1, len(population.individuals)))
        return choice(population.individuals[:split]), choice(population.individuals[split:])
    @staticmethod
    def generate_offspring(population):
        """Generate offspring from a Population by picking two random parents, recombining
        them, mutating the child, and adding it to the offspring. The number of offspring
        is determined by λ.

        :population (Population): The population to generate the offspring from.
        :returns (Population): The offspring (of size λ).
        """
        offspring = Population(population.μ, population.λ)
        offspring.individuals = []

        for _ in range(population.λ):
            parent_one, parent_two = Population.random_parents(population)
            child = Individual.recombine(parent_one, parent_two)
            Individual.mutate(child, 0.15)
            offspring.individuals += [child]

        return offspring
Survivor Selection
The last core part of an Evolutionary Algorithm is survivor selection. This puts selective pressure on our candidate solutions, and is what ultimately leads to fitter solutions.
Survivor selection picks the μ Individuals that should propagate into the next generation; however, it's not as easy as picking the fittest μ Individuals. Always picking the μ best Individuals leads to premature convergence, a way of saying we “got a good solution, but not the best solution”: the Evolutionary Algorithm simply did not explore the search space enough to find other, fitter solutions.
There are a number of ways to run survivor selection, one of the most popular being $k$-tournament selection. $k$-tournament selection picks $k$ random Individuals from the pool, and selects the fittest Individual of the tournament. It does this μ times to build the full, new Population. The higher $k$, the higher the selective pressure, but also the higher the chance of premature convergence. The lower $k$, the lower the chance of premature convergence, but the closer the Evolutionary Algorithm gets to a random search.
At the bounds, $k = 1$ will always be just a random search, and $k = μ$ will always be choosing the best μ individuals.
We choose $k = 25$, giving less fit solutions a chance to win, but still focusing on the more fit solutions.
from copy import deepcopy
from random import sample


class Population:
    ...

    @staticmethod
    def survival_selection(population):
        """Determine which individuals from the population survive. This is done via
        k-tournament selection: generate a tournament of k random individuals, pick the
        fittest individual, add it to the survivors, and remove it from the original
        population.

        Note:
            The population should be of size μ + λ. The resultant population will be of
            size μ.

        :population (Population): The population to run survival selection on. Must be
            of size μ + λ.
        :returns (Population): The resultant population after killing off unfit
            individuals. Will be of size μ.
        """
        new_population = Population(population.μ, population.λ)
        new_population.individuals = []
        individuals = deepcopy(population.individuals)

        for _ in range(population.μ):
            # Cap the tournament at the remaining pool size; the pool shrinks
            # below k = 25 near the end of selection.
            tournament = sample(individuals, min(25, len(individuals)))
            victor = max(tournament, key=lambda individual: individual.fitness)
            new_population.individuals += [victor]
            individuals.remove(victor)

        return new_population
The Evolutionary Algorithm
Our Evolutionary Algorithm will exactly resemble the pseudocode in the introduction. Because the Individual and Population framework is established, it is almost copy-paste.
class EA:
    def __init__(self, μ, λ):
        """Initialize an EA.

        :μ (int): The population size.
        :λ (int): The offspring size.
        """
        self.μ, self.λ = μ, λ

    def search(self):
        """Run the genetic algorithm until the fitness reaches 100%.

        :returns: The fittest individual.
        """
        generation = 1
        self.population = Population(self.μ, self.λ)

        while self.population.fittest.fitness < 100.0:
            offspring = Population.generate_offspring(self.population)
            self.population.individuals += offspring.individuals
            self.population = Population.survival_selection(self.population)
            generation += 1

        return self.population.fittest
Running An Evolutionary Algorithm
Below, we have an instance of the evolutionary algorithm searching for a solution:
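In code, kicking it off is just (remember to seed Individual.message first):

Individual.message = SecretMessage("FreneticArray")
fittest = EA(100, 15).search()
print(fittest.genotype)  # FreneticArray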
Now, looking at the string “FreneticArray”: it has 13 characters, and seeing as there are 26 letters in the alphabet (doubled for lowercase and uppercase), our search space was:
$$
\left(2 \times 26\right)^{13} \approx 2.0 \times 10^{22}
$$
Huge.
On average, our EA took 29 generations to finish[3]. As each generation had at most 115 individuals, we can conclude that on average we had to generate
$$
29 \times 115 = 3335\ \text{solutions}
$$
Much smaller than $2.0 \times 10^{22}$. That is what Evolutionary Algorithms are good for: turning a large search space into a much smaller one.
Although there are much more advanced topics in Evolutionary Algorithms, this is enough to start implementing your own. With just the simple operators listed above, a genotype, and a search space that has a gradient, many problems can be solved with an Evolutionary Algorithm.
The Source Code
All source code can be found here.
One of the most valuable skills I have learned in the past year is undoubtedly LaTeX — so much so that I have moved almost all of my notetaking and document writing to LaTeX or Pandoc.
However, as with almost any markup language, I ran into friction: constantly importing headers and defining commands I used on an every-document basis. Some commands had become muscle memory; when I got compiler errors every time I forgot to redefine them, I became quite frustrated. So I caved: I made a preamble that I prepended to all of my LaTeX docs.
\RequirePackage[l2tabu, orthodox]{nag}
\documentclass[12pt]{article}
\usepackage{amssymb,amsmath,verbatim,graphicx,microtype,units,booktabs}
\usepackage[margin=10pt, font=small, labelfont=bf, labelsep=endash]{caption}
\usepackage[colorlinks=true, pdfborder={0 0 0}]{hyperref}
\usepackage[utf8]{inputenc}
\usepackage{xcolor}
\newcommand{\shellcmd}[1]{\texttt{\colorbox{gray!30}{#1}}} % highlight shell or code commands
\newcommand{\todo}[2]{\textbf{\colorbox{red!50}{#1}}{\footnote{#2}}} % placeholder for something to do
\newcommand{\gray}{\rowcolor[gray]{.9}} % format a table row to make it more visible
\usepackage[left=1.15in, right=1.15in]{geometry}
\usepackage{titleps}
\newpagestyle{main}{
\setheadrule{.4pt}
\sethead{\sectiontitle}
{}
{Illya Starikov}
}
\pagestyle{main}
% *-------------------- Modify This As Needed --------------------* %
% \usepackage[top=0.5in, bottom=0.5in, left=0.5in, right=0.5in]{geometry}
% \pagestyle{empty}
\title{\todo{Title}{Insert title here.}}
\date{\today}
\author{Illya Starikov}
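With the preamble saved, a bare document needs only a body; for example (a hypothetical main.tex, using the commands defined above):

\begin{document}
\maketitle

Compile with \shellcmd{pdflatex main.tex}.
\todo{Expand}{Write the remaining sections.}

\end{document}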
Then, I took it one step further. I occasionally make practice tests for students in Calculus and Physics, so I decided to make a command that acts as extensions of my preamble for tests.
\usepackage{lastpage,ifthen}
\newboolean{key}
\setboolean{key}{true} % prints questions and answers
% \setboolean{key}{false} % prints questions only
\newcommand{\question}[2]{\item (#1 Points) #2}
\newcommand{\answer}[1]{\ifthenelse{\boolean{key}}{#1}{}}
% This must be filled out
\newcommand{\points}{0} % total number of points for the exam
\newcommand{\capped}{0\ } % the capping score, to offer a point or two bonus
\newcommand{\difference}{0\ } % the difference of the two
\newcommand{\timeallotted}{0 hour} % duration of the exam
\begin{document}
\section*{\answer{Key}}
Complete the following problems.
In order to receive full credit,
please show all of your work and justify your answers.
You do not need to simplify your answers unless specifically instructed to do so.
You have \timeallotted.
This is a closed-book, closed-notes exam.
No calculators or other electronic aids will be permitted.
If you are caught cheating, you will receive a zero grade for this exam.
Please check that your copy of this exam contains \pageref{LastPage} numbered pages and is correctly stapled.
The sum of the max points for all the questions is \points,
but note that the max exam score will be capped at \capped
(i.e., there is \difference bonus point(s), but you can't score more than 100\%).
Best of luck.
\begin{enumerate}
\question{}{}
\answer{}
\end{enumerate}
\end{document}