Custom Searchers
While EvoTorch's internally provided SearchAlgorithm classes are effective, it's often desirable to define your own search algorithm, with optimisations tailored to the problem at hand. This tutorial will walk you through the steps to do this.
General Structure
A SearchAlgorithm object has quite a simple structure: it requires only an __init__ method that accepts a Problem instance, and a _step method to iteratively update the state of the algorithm.
from evotorch import Problem
from evotorch.algorithms import SearchAlgorithm


class MySearcher(SearchAlgorithm):

    def __init__(self, problem: Problem):
        super().__init__(problem)
        ...
        # any additional desired initialisation
        ...

    def _step(self):
        ...
        # Generate a new population
        new_population = ...
        # Evaluate the new population
        self.problem.evaluate(new_population)
        ...
With only these two methods defined, the SearchAlgorithm instance will work with any Problem instance and any Logger instances.
from evotorch.logging import PandasLogger

# Here, `problem` is any Problem instance
searcher = MySearcher(problem)
pandas_logger = PandasLogger(searcher)

searcher.run(1000)
A Simple GA
To demonstrate this general framework, we will implement the simple GA described in the paper Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning.
This algorithm has the following structure:
- Generate and evaluate an initial population of \(n\) samples.
- While True:
- Select the top \(e\) individuals as the elites.
- Select the top \(k\) individuals as the parents.
- Replace the bottom \(n-e\) individuals in the population with children generated by adding Gaussian noise to randomly chosen parents.
Here, generally, \(k > e\), i.e. there are more parents than elites. Note that in the original paper, each surviving elite was re-evaluated \(10\) times in each iteration and its fitness averaged, although for simplicity we will omit this step.
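Before translating this into EvoTorch, here is a minimal, framework-free sketch of the loop above in plain NumPy. The population size, noise scale, genome dimension, and quadratic fitness function are all illustrative assumptions, not part of the original algorithm's specification:

import numpy as np

n, e, k, sigma, dim = 40, 2, 10, 0.02, 8   # population, elites, parents, noise scale, genome size

def fitness(pop: np.ndarray) -> np.ndarray:
    # Hypothetical fitness to be maximised: negative sphere function
    return -np.sum(pop ** 2, axis=-1)

pop = np.random.uniform(-0.1, 0.1, (n, dim))   # initial population
for _ in range(100):
    pop = pop[np.argsort(-fitness(pop))]       # sort best-first
    parents = pop[:k]                          # top-k individuals become parents
    idx = np.random.randint(0, k, n - e)       # a random parent for each child
    pop[e:] = parents[idx] + sigma * np.random.randn(n - e, dim)  # elites at pop[:e] survive unchanged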
To begin with, let's define the __init__ method of our custom evolutionary algorithm.
from typing import Optional

from evotorch import Problem, Solution
from evotorch.algorithms.searchalgorithm import SearchAlgorithm, SinglePopulationAlgorithmMixin


class SimpleGA(SearchAlgorithm, SinglePopulationAlgorithmMixin):

    def __init__(
        self,
        problem: Problem,
        *,
        popsize: int,           # Total population size n
        num_elites: int,        # Number of elites that survive each generation e
        num_parents: int,       # Number of parents from which to generate children k
        mutation_power: float,  # Scale of the Gaussian noise used to generate children
    ):
        # Call the __init__(...) method of the superclass
        SearchAlgorithm.__init__(
            self,
            # Problem to work on:
            problem,
            # The remaining keyword arguments are for registering
            # the status getter methods.
            # The results of these status getter methods will
            # automatically be shown in the status dictionary.
            pop_best=self._get_pop_best,
            pop_best_eval=self._get_pop_best_eval,
        )
        SinglePopulationAlgorithmMixin.__init__(self)

        # Store the hyperparameters
        self._popsize = int(popsize)
        self._num_elites = int(num_elites)
        self._num_parents = int(num_parents)
        self._mutation_power = float(mutation_power)

        # Generate the initial population -- note that this uses the
        # problem's initial bounds as a uniform hyper-cube.
        self._population = self._problem.generate_batch(self._popsize)

        # The following variable stores a copy of the current population's
        # best solution
        self._pop_best: Optional[Solution] = None
Creating the searcher stores all of the hyper-parameters and creates an initial population by sampling uniformly from the hyper-cube defined by problem.initial_bounds.
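For instance, on a custom problem these bounds are set at construction time. The sphere function and the bounds below are illustrative assumptions, chosen only to show where initial_bounds comes from:

import torch
from evotorch import Problem

def sphere(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical fitness function for illustration
    return torch.sum(x ** 2)

# Initial solutions will be sampled uniformly from the hyper-cube [-1, 1]^10
problem = Problem("min", sphere, solution_length=10, initial_bounds=(-1.0, 1.0))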
It is worth noting the status getter methods that we registered when we called SearchAlgorithm.__init__: registering these methods automatically makes the status values 'pop_best' and 'pop_best_eval' available in the searcher.status dictionary for use by loggers. The former tracks the best individual in the population, whereas the latter tracks the fitness of that individual. Then we simply need to define the two getter functions:
def _get_pop_best(self):
    return self._pop_best

def _get_pop_best_eval(self):
    return self._pop_best.get_evals()
and we can immediately access these values through the searcher.status dictionary.
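For example, assuming a SimpleGA instance named searcher that has taken at least one step:

print(searcher.status["pop_best"])       # best solution in the current population
print(searcher.status["pop_best_eval"])  # fitness of that solution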
To complete the definition of our searcher, we need to define the _step method.
def _step(self):
    # If this is the very first iteration, we have an unevaluated population.
    if "iter" not in self.status:
        # Evaluate the population
        self.problem.evaluate(self._population)
        # Sort the population
        self._population = self._population[self._population.argsort()]

    # Select the parents.
    parents = self._population[:self._num_parents]

    # Pick a random parent for each child
    num_children = self._popsize - self._num_elites
    parent_indices = self.problem.make_randint(num_children, n=self._num_parents)
    parent_values = parents.values[parent_indices]

    # Add Gaussian noise
    child_values = parent_values + self._mutation_power * self.problem.make_gaussian(
        num_children, self.problem.solution_length
    )

    # Overwrite all the non-elite solutions with the new generation
    self._population.access_values()[self._num_elites:] = child_values

    # Evaluate and sort the new population
    self.problem.evaluate(self._population)
    self._population = self._population[self._population.argsort()]

    # Store a copy of the best solution, for reporting
    # and analysis purposes
    self._pop_best = self._population[0].clone()
and finally, we require a property population to be defined for the SinglePopulationAlgorithmMixin to access for logging, as sketched below.
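A minimal definition simply exposes the stored solution batch (this goes inside the SimpleGA class body):

@property
def population(self):
    # Expose the current population for the mixin and loggers
    return self._population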
Demonstration on Neuroevolution
With our custom EA defined, we're ready to run a simple experiment. For this we will use the classic CartPole problem, with a single-hidden-layer MLP policy.
from evotorch.neuroevolution import GymNE

prob = GymNE(
    env_name='CartPole-v1',
    network="""
        Linear(obs_length, 16)
        >> Tanh()
        >> Linear(16, act_length)
        >> Tanh()
    """,
    initial_bounds=(-0.0001, 0.0001),
    num_episodes=10,
    num_actors=1,
)
We can instantiate our searcher as we would any other EvoTorch SearchAlgorithm instance,
from evotorch.logging import PandasLogger, StdOutLogger
ga = SimpleGA(
    prob,
    popsize=40,
    num_elites=2,
    num_parents=10,
    mutation_power=0.02,
)
StdOutLogger(ga)
pandas_logger = PandasLogger(ga)
and run it for as many iterations as we like using the run method. Let's try 30 generations:
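ga.run(30)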
Observing the best_eval status over time using the pandas_logger instance we created earlier shows steady progress.
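One way to view this is via the logger's to_dataframe method; the matplotlib usage here is just one option, assuming the best_eval status referenced above appears as a column in the dataframe:

import matplotlib.pyplot as plt

# Plot the best evaluation recorded at each generation
pandas_logger.to_dataframe()["best_eval"].plot()
plt.show()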
And, unless you've been particularly unlucky in training, you can render the agent
best_solution = ga.status["pop_best"]
policy_net = prob.to_policy(best_solution)

while True:
    result = prob.visualize(policy_net)
    print('Visualised episode has cumulative reward:', result['cumulative_reward'])
to see the pole-balancing behaviour it has learned.