Index
Problem types for neuroevolution
baseneproblem
¶
gymne
¶
This namespace contains the GymNE
class.
GymNE (NEProblem)
¶
Representation of a NEProblem where the goal is to maximize
the total reward obtained in a gym
environment.
Source code in evotorch/neuroevolution/gymne.py
class GymNE(NEProblem):
"""
Representation of a NEProblem where the goal is to maximize
the total reward obtained in a `gym` environment.
"""
def __init__(
self,
env: Optional[Union[str, Callable]] = None,
network: Optional[Union[str, nn.Module, Callable[[], nn.Module]]] = None,
*,
env_name: Optional[Union[str, Callable]] = None,
network_args: Optional[dict] = None,
env_config: Optional[Mapping] = None,
observation_normalization: bool = False,
num_episodes: int = 1,
episode_length: Optional[int] = None,
decrease_rewards_by: Optional[float] = None,
alive_bonus_schedule: Optional[tuple] = None,
action_noise_stdev: Optional[float] = None,
num_actors: Optional[Union[int, str]] = None,
actor_config: Optional[dict] = None,
num_subbatches: Optional[int] = None,
subbatch_size: Optional[int] = None,
initial_bounds: Optional[BoundsPairLike] = (0.00001, 0.00001),
):
"""
`__init__(...)`: Initialize the GymNE.
Args:
env: The gym environment to solve. Expected as a Callable
(maybe a function returning a gym.Env, or maybe a gym.Env
subclass), or as a string referring to a gym environment
ID (e.g. "Antv4", "Humanoidv4", etc.).
network: A network structure string, or a Callable (which can be
a class inheriting from `torch.nn.Module`, or a function
which returns a `torch.nn.Module` instance), or an instance
of `torch.nn.Module`.
The object provided here determines the structure of the
neural network policy whose parameters will be evolved.
A network structure string is a string which can be processed
by `evotorch.neuroevolution.net.str_to_net(...)`.
Please see the documentation of the function
`evotorch.neuroevolution.net.str_to_net(...)` to see how such
a neural network structure string looks like.
Note that this network can be a recurrent network.
When the network's `forward(...)` method can optionally accept
an additional positional argument for the hidden state of the
network and returns an additional value for its next state,
then the policy is treated as a recurrent one.
When the network is given as a callable object (e.g.
a subclass of `nn.Module` or a function) and this callable
object is decorated via `evotorch.decorators.pass_info`,
the following keyword arguments will be passed:
(i) `obs_length` (the length of the observation vector),
(ii) `act_length` (the length of the action vector),
(iii) `obs_shape` (the shape tuple of the observation space),
(iv) `act_shape` (the shape tuple of the action space),
(v) `obs_space` (the Box object specifying the observation
space, and
(vi) `act_space` (the Box object specifying the action
space). Note that `act_space` will always be given as a
`gym.spaces.Box` instance, even when the actual gym
environment has a discrete action space. This because `GymNE`
always expects the neural network to return a tensor of
floatingpoint numbers.
env_name: Deprecated alias for the keyword argument `env`.
It is recommended to use the argument `env` instead.
network_args: Optionally a dictlike object, storing keyword
arguments to be passed to the network while instantiating it.
env_config: Keyword arguments to pass to `gym.make(...)` while
creating the `gym` environment.
observation_normalization: Whether or not to do online observation
normalization.
num_episodes: Number of episodes over which a single solution will
be evaluated.
episode_length: Maximum amount of simulator interactions allowed
in a single episode. If left as None, whether or not an episode
is terminated is determined only by the `gym` environment
itself.
decrease_rewards_by: Some gym env.s are defined in such a way that
the agent gets a constant reward for each timestep
it survives. This constant reward can also be called
"survival bonus". Such a rewarding scheme can lead the
evolution to local optima where the agent does nothing
but does not die either, just to collect the survival
bonuses. To prevent this, it can be desired to
remove the survival bonuses from each reward obtained.
If this is the case with the problem at hand,
the user can set the argument `decrease_rewards_by`
to a positive float number, and that number will
be subtracted from each reward.
alive_bonus_schedule: Use this to add a customized amount of
alive bonus.
If left as None (which is the default), additional alive
bonus will not be added.
If given as a tuple `(t, b)`, an alive bonus `b` will be
added onto all the rewards beyond the timestep `t`.
If given as a tuple `(t0, t1, b)`, a partial (linearly
increasing towards `b`) alive bonus will be added onto
all the rewards between the timesteps `t0` and `t1`,
and a full alive bonus (which equals to `b`) will be added
onto all the rewards beyond the timestep `t1`.
action_noise_stdev: If given as a real number `s`, then, for
each generated action, Gaussian noise with standard
deviation `s` will be sampled, and then this sampled noise
will be added onto the action.
If action noise is not desired, then this argument can be
left as None.
num_actors: Number of actors to create for parallelized
evaluation of the solutions.
One can also set this as "max", which means that
an actor will be created on each available CPU.
When the parallelization is enabled each actor will have its
own instance of the `gym` environment.
actor_config: A dictionary, representing the keyword arguments
to be passed to the options(...) used when creating the
ray actor objects. To be used for explicitly allocating
resources per each actor.
For example, for declaring that each actor is to use a GPU,
one can pass `actor_config=dict(num_gpus=1)`.
Can also be given as None (which is the default),
if no such options are to be passed.
num_subbatches: If `num_subbatches` is None (assuming that
`subbatch_size` is also None), then, when evaluating a
population, the population will be split into n pieces, `n`
being the number of actors, and each actor will evaluate
its assigned piece. If `num_subbatches` is an integer `m`,
then the population will be split into `m` pieces,
and actors will continually accept the next unevaluated
piece as they finish their current tasks.
The arguments `num_subbatches` and `subbatch_size` cannot
be given values other than None at the same time.
subbatch_size: If `subbatch_size` is None (assuming that
`num_subbatches` is also None), then, when evaluating a
population, the population will be split into `n` pieces, `n`
being the number of actors, and each actor will evaluate its
assigned piece. If `subbatch_size` is an integer `m`,
then the population will be split into pieces of size `m`,
and actors will continually accept the next unevaluated
piece as they finish their current tasks.
When there can be significant difference across the solutions
in terms of computational requirements, specifying a
`subbatch_size` can be beneficial, because, while one
actor is busy with a subbatch containing computationally
challenging solutions, other actors can accept more
tasks and save time.
The arguments `num_subbatches` and `subbatch_size` cannot
be given values other than None at the same time.
initial_bounds: Specifies an interval from which the values of the
initial policy parameters will be drawn.
"""
# Store various environment information
if (env is not None) and (env_name is None):
self._env_maker = env
elif (env is None) and (env_name is not None):
self._env_maker = env_name
elif (env is not None) and (env_name is not None):
raise ValueError(
f"Received values for both `env` ({repr(env)}) and `env_name` ({repr(env_name)})."
f" Please specify the environment to solve via only one of these arguments, not both."
)
else:
raise ValueError("Environment name is missing. Please specify it via the argument `env`.")
# Make sure that the network argument is not missing.
if network is None:
raise ValueError(
"Received None via the argument `network`."
"Please provide the network as a string, or as a `Callable`, or as a `torch.nn.Module` instance."
)
# Store various environment information
self._env_config = {} if env_config is None else deepcopy(dict(env_config))
self._decrease_rewards_by = 0.0 if decrease_rewards_by is None else float(decrease_rewards_by)
self._alive_bonus_schedule = alive_bonus_schedule
self._action_noise_stdev = None if action_noise_stdev is None else float(action_noise_stdev)
self._observation_normalization = bool(observation_normalization)
self._num_episodes = int(num_episodes)
self._episode_length = None if episode_length is None else int(episode_length)
self._info_keys = dict(cumulative_reward="avg", interaction_count="sum")
self._env: Optional[gym.Env] = None
self._obs_stats: Optional[RunningStat] = None
self._collected_stats: Optional[RunningStat] = None
# Create a temporary environment to read its dimensions
tmp_env = _make_env(self._env_maker, **(self._env_config))
# Store the temporary environment's dimensions
self._obs_length = len(tmp_env.observation_space.low)
if isinstance(tmp_env.action_space, gym.spaces.Discrete):
self._act_length = tmp_env.action_space.n
self._box_act_space = gym.spaces.Box(low=float("inf"), high=float("inf"), shape=(self._act_length,))
else:
self._act_length = len(tmp_env.action_space.low)
self._box_act_space = tmp_env.action_space
self._act_space = tmp_env.action_space
self._obs_space = tmp_env.observation_space
self._obs_shape = tmp_env.observation_space.low.shape
# Validate the space types of the environment
ensure_space_types(tmp_env)
if self._observation_normalization:
self._obs_stats = RunningStat()
self._collected_stats = RunningStat()
else:
self._obs_stats = None
self._collected_stats = None
self._interaction_count: int = 0
self._episode_count: int = 0
super().__init__(
objective_sense="max", # RL is maximization
network=network, # Using the policy as the network
network_args=network_args,
initial_bounds=initial_bounds,
num_actors=num_actors,
actor_config=actor_config,
subbatch_size=subbatch_size,
device="cpu",
)
self.after_eval_hook.append(self._extra_status)
@property
def _network_constants(self) > dict:
return {
"obs_length": self._obs_length,
"act_length": self._act_length,
"obs_space": self._obs_space,
"act_space": self._box_act_space,
"obs_shape": self._obs_space.shape,
"act_shape": self._box_act_space.shape,
}
@property
def _str_network_constants(self) > dict:
return {
"obs_space": self._obs_space.shape,
"act_space": self._box_act_space.shape,
}
def _instantiate_new_env(self, **kwargs) > gym.Env:
env_config = {**kwargs, **(self._env_config)}
env = _make_env(self._env_maker, **env_config)
if self._alive_bonus_schedule is not None:
env = AliveBonusScheduleWrapper(env, self._alive_bonus_schedule)
return env
def _get_env(self) > gym.Env:
if self._env is None:
self._env = self._instantiate_new_env()
return self._env
def _normalize_observation(self, observation: Iterable, *, update_stats: bool = True) > Iterable:
observation = np.asarray(observation, dtype="float32")
if self.observation_normalization:
if update_stats:
self._obs_stats.update(observation)
self._collected_stats.update(observation)
return self._obs_stats.normalize(observation)
else:
return observation
def _use_policy(self, observation: Iterable, policy: nn.Module) > Iterable:
with torch.no_grad():
result = policy(torch.as_tensor(observation, dtype=torch.float32, device="cpu")).numpy()
if self._action_noise_stdev is not None:
result = (
result
+ self.make_gaussian(len(result), center=0.0, stdev=self._action_noise_stdev, device="cpu").numpy()
)
env = self._get_env()
if isinstance(env.action_space, gym.spaces.Discrete):
result = np.argmax(result)
elif isinstance(env.action_space, gym.spaces.Box):
result = np.clip(result, env.action_space.low, env.action_space.high)
return result
def _prepare(self) > None:
super()._prepare()
self._get_env()
@property
def network_device(self) > Device:
"""The device on which the problem should place data e.g. the network
In the case of GymNE, supported Gym environments return numpy arrays on CPU which are converted to Tensors
Therefore, it is almost always optimal to place the network on CPU
"""
return torch.device("cpu")
def _rollout(
self,
*,
policy: nn.Module,
update_stats: bool = True,
visualize: bool = False,
decrease_rewards_by: Optional[float] = None,
) > dict:
"""Peform a rollout of a network"""
if decrease_rewards_by is None:
decrease_rewards_by = self._decrease_rewards_by
else:
decrease_rewards_by = float(decrease_rewards_by)
policy = ensure_stateful(policy)
policy.reset()
if visualize:
env = self._instantiate_new_env(render_mode="human")
else:
env = self._get_env()
observation = self._normalize_observation(reset_env(env), update_stats=update_stats)
if visualize:
env.render()
t = 0
cumulative_reward = 0.0
while True:
observation, raw_reward, done, info = take_step_in_env(env, self._use_policy(observation, policy))
reward = raw_reward  decrease_rewards_by
t += 1
if update_stats:
self._interaction_count += 1
if visualize:
env.render()
observation = self._normalize_observation(observation, update_stats=update_stats)
cumulative_reward += reward
if done or ((self._episode_length is not None) and (t >= self._episode_length)):
if update_stats:
self._episode_count += 1
final_info = dict(cumulative_reward=cumulative_reward, interaction_count=t)
for k in self._info_keys:
if k not in final_info:
final_info[k] = info[k]
return final_info
@property
def _nonserialized_attribs(self) > List[str]:
return super()._nonserialized_attribs + ["_env"]
def run(
self,
policy: Union[nn.Module, Iterable],
*,
update_stats: bool = False,
visualize: bool = False,
num_episodes: Optional[int] = None,
decrease_rewards_by: Optional[float] = None,
) > dict:
"""
Evaluate the policy on the gym environment.
Args:
policy: The policy to be evaluated. This can be a torch module
or a sequence of real numbers representing the parameters
of a policy network.
update_stats: Whether or not to update the observation
normalization data while running the policy. If observation
normalization is not enabled, then this argument will be
ignored.
visualize: Whether or not to render the environment while running
the policy.
num_episodes: Over how many episodes will the policy be evaluated.
Expected as None (which is the default), or as an integer.
If given as None, then the `num_episodes` value that was given
while initializing this GymNE will be used.
decrease_rewards_by: How much each reward value should be
decreased. If left as None, the `decrease_rewards_by` value
value that was given while initializing this GymNE will be
used.
Returns:
A dictionary containing the score and the timestep count.
"""
if not isinstance(policy, nn.Module):
policy = self.make_net(policy)
if num_episodes is None:
num_episodes = self._num_episodes
try:
policy.eval()
episode_results = [
self._rollout(
policy=policy,
update_stats=update_stats,
visualize=visualize,
decrease_rewards_by=decrease_rewards_by,
)
for _ in range(num_episodes)
]
results = _accumulate_all_across_dicts(episode_results, self._info_keys)
return results
finally:
policy.train()
def visualize(
self,
policy: Union[nn.Module, Iterable],
*,
update_stats: bool = False,
num_episodes: Optional[int] = 1,
decrease_rewards_by: Optional[float] = None,
) > dict:
"""
Evaluate the policy and render its actions in the environment.
Args:
policy: The policy to be evaluated. This can be a torch module
or a sequence of real numbers representing the parameters
of a policy network.
update_stats: Whether or not to update the observation
normalization data while running the policy. If observation
normalization is not enabled, then this argument will be
ignored.
num_episodes: Over how many episodes will the policy be evaluated.
Expected as None (which is the default), or as an integer.
If given as None, then the `num_episodes` value that was given
while initializing this GymNE will be used.
decrease_rewards_by: How much each reward value should be
decreased. If left as None, the `decrease_rewards_by` value
value that was given while initializing this GymNE will be
used.
Returns:
A dictionary containing the score and the timestep count.
"""
return self.run(
policy=policy,
update_stats=update_stats,
visualize=True,
num_episodes=num_episodes,
decrease_rewards_by=decrease_rewards_by,
)
def _ensure_obsnorm(self):
if not self.observation_normalization:
raise ValueError("This feature can only be used when observation_normalization=True.")
def get_observation_stats(self) > RunningStat:
"""Get the observation stats"""
self._ensure_obsnorm()
return self._obs_stats
def _make_sync_data_for_actors(self) > Any:
if self.observation_normalization:
return dict(obs_stats=self.get_observation_stats())
else:
return None
def set_observation_stats(self, rs: RunningStat):
"""Set the observation stats"""
self._ensure_obsnorm()
self._obs_stats.reset()
self._obs_stats.update(rs)
def _use_sync_data_from_main(self, received: dict):
for k, v in received.items():
if k == "obs_stats":
self.set_observation_stats(v)
def pop_observation_stats(self) > RunningStat:
"""Get and clear the collected observation stats"""
self._ensure_obsnorm()
result = self._collected_stats
self._collected_stats = RunningStat()
return result
def _make_sync_data_for_main(self) > Any:
result = dict(episode_count=self.episode_count, interaction_count=self.interaction_count)
if self.observation_normalization:
result["obs_stats_delta"] = self.pop_observation_stats()
return result
def update_observation_stats(self, rs: RunningStat):
"""Update the observation stats via another RunningStat instance"""
self._ensure_obsnorm()
self._obs_stats.update(rs)
def _use_sync_data_from_actors(self, received: list):
total_episode_count = 0
total_interaction_count = 0
for data in received:
data: dict
total_episode_count += data["episode_count"]
total_interaction_count += data["interaction_count"]
if self.observation_normalization:
self.update_observation_stats(data["obs_stats_delta"])
self.set_episode_count(total_episode_count)
self.set_interaction_count(total_interaction_count)
def _make_pickle_data_for_main(self) > dict:
# For when the main Problem object (the nonremote one) gets pickled,
# this function returns the counters of this remote Problem instance,
# to be sent to the main one.
return dict(interaction_count=self.interaction_count, episode_count=self.episode_count)
def _use_pickle_data_from_main(self, state: dict):
# For when a newly unpickled Problem object gets (re)parallelized,
# this function restores the inner states specific to this remote
# worker. In the case of GymNE, those inner states are episode
# and interaction counters.
for k, v in state.items():
if k == "episode_count":
self.set_episode_count(v)
elif k == "interaction_count":
self.set_interaction_count(v)
else:
raise ValueError(f"When restoring the inner state of a remote worker, unrecognized state key: {k}")
def _extra_status(self, batch: SolutionBatch):
return dict(total_interaction_count=self.interaction_count, total_episode_count=self.episode_count)
@property
def observation_normalization(self) > bool:
"""
Get whether or not observation normalization is enabled.
"""
return self._observation_normalization
def set_episode_count(self, n: int):
"""
Set the episode count manually.
"""
self._episode_count = int(n)
def set_interaction_count(self, n: int):
"""
Set the interaction count manually.
"""
self._interaction_count = int(n)
@property
def interaction_count(self) > int:
"""
Get the total number of simulator interactions made.
"""
return self._interaction_count
@property
def episode_count(self) > int:
"""
Get the total number of episodes completed.
"""
return self._episode_count
def _get_local_episode_count(self) > int:
return self.episode_count
def _get_local_interaction_count(self) > int:
return self.interaction_count
def _evaluate_network(self, policy: nn.Module) > Union[float, torch.Tensor]:
result = self.run(
policy,
update_stats=True,
visualize=False,
num_episodes=self._num_episodes,
decrease_rewards_by=self._decrease_rewards_by,
)
return result["cumulative_reward"]
def to_policy(self, x: Iterable, *, clip_actions: bool = True) > nn.Module:
"""
Convert the given parameter vector to a policy as a PyTorch module.
If the problem is configured to have observation normalization,
the PyTorch module also contains an additional normalization layer.
Args:
x: An sequence of real numbers, containing the parameters
of a policy. Can be a PyTorch tensor, a numpy array,
or a Solution.
clip_actions: Whether or not to add an action clipping layer so
that the generated actions will always be within an
acceptable range for the environment.
Returns:
The policy expressed by the parameters.
"""
policy = self.make_net(x)
if self.observation_normalization and (self._obs_stats.count > 0):
policy = ObsNormWrapperModule(policy, self._obs_stats)
if clip_actions and isinstance(self._get_env().action_space, gym.spaces.Box):
policy = ActClipWrapperModule(policy, self._get_env().action_space)
return policy
def save_solution(self, solution: Iterable, fname: Union[str, Path]):
"""
Save the solution into a pickle file.
Among the saved data within the pickle file are the solution
(as a PyTorch tensor), the policy (as a `torch.nn.Module` instance),
and observation stats (if any).
Args:
solution: The solution to be saved. This can be a PyTorch tensor,
a `Solution` instance, or any `Iterable`.
fname: The file name of the pickle file to be created.
"""
# Convert the solution to a PyTorch tensor on the cpu.
if isinstance(solution, torch.Tensor):
solution = solution.to("cpu")
elif isinstance(solution, Solution):
solution = solution.values.clone().to("cpu")
else:
solution = torch.as_tensor(solution, dtype=torch.float32, device="cpu")
if isinstance(solution, ReadOnlyTensor):
solution = solution.as_subclass(torch.Tensor)
policy = self.to_policy(solution).to("cpu")
# Store the solution and the policy.
result = {
"solution": solution,
"policy": policy,
}
# If available, store the observation stats.
if self.observation_normalization and (self._obs_stats is not None):
result["obs_mean"] = torch.as_tensor(self._obs_stats.mean)
result["obs_stdev"] = torch.as_tensor(self._obs_stats.stdev)
result["obs_sum"] = torch.as_tensor(self._obs_stats.sum)
result["obs_sum_of_squares"] = torch.as_tensor(self._obs_stats.sum_of_squares)
# Some additional data.
result["interaction_count"] = self.interaction_count
result["episode_count"] = self.episode_count
result["time"] = datetime.now()
# If the environment is specified via a string ID, then store that ID.
if isinstance(self._env_maker, str):
result["env"] = self._env_maker
# Save the dictionary which stores the data.
with open(fname, "wb") as f:
pickle.dump(result, f)
def get_env(self) > gym.Env:
"""
Get the gym environment stored by this GymNE instance
"""
return self._get_env()
episode_count: int
property
readonly
¶
Get the total number of episodes completed.
interaction_count: int
property
readonly
¶
Get the total number of simulator interactions made.
network_device: Union[str, torch.device]
property
readonly
¶
The device on which the problem should place data e.g. the network In the case of GymNE, supported Gym environments return numpy arrays on CPU which are converted to Tensors Therefore, it is almost always optimal to place the network on CPU
observation_normalization: bool
property
readonly
¶
Get whether or not observation normalization is enabled.
__init__(self, env=None, network=None, *, env_name=None, network_args=None, env_config=None, observation_normalization=False, num_episodes=1, episode_length=None, decrease_rewards_by=None, alive_bonus_schedule=None, action_noise_stdev=None, num_actors=None, actor_config=None, num_subbatches=None, subbatch_size=None, initial_bounds=(1e05, 1e05))
special
¶
__init__(...)
: Initialize the GymNE.
Parameters:
Name  Type  Description  Default 

env 
Union[str, Callable] 
The gym environment to solve. Expected as a Callable (maybe a function returning a gym.Env, or maybe a gym.Env subclass), or as a string referring to a gym environment ID (e.g. "Antv4", "Humanoidv4", etc.). 
None 
network 
Union[str, torch.nn.modules.module.Module, Callable[[], torch.nn.modules.module.Module]] 
A network structure string, or a Callable (which can be
a class inheriting from 
None 
env_name 
Union[str, Callable] 
Deprecated alias for the keyword argument 
None 
network_args 
Optional[dict] 
Optionally a dictlike object, storing keyword arguments to be passed to the network while instantiating it. 
None 
env_config 
Optional[collections.abc.Mapping] 
Keyword arguments to pass to 
None 
observation_normalization 
bool 
Whether or not to do online observation normalization. 
False 
num_episodes 
int 
Number of episodes over which a single solution will be evaluated. 
1 
episode_length 
Optional[int] 
Maximum amount of simulator interactions allowed
in a single episode. If left as None, whether or not an episode
is terminated is determined only by the 
None 
decrease_rewards_by 
Optional[float] 
Some gym env.s are defined in such a way that
the agent gets a constant reward for each timestep
it survives. This constant reward can also be called
"survival bonus". Such a rewarding scheme can lead the
evolution to local optima where the agent does nothing
but does not die either, just to collect the survival
bonuses. To prevent this, it can be desired to
remove the survival bonuses from each reward obtained.
If this is the case with the problem at hand,
the user can set the argument 
None 
alive_bonus_schedule 
Optional[tuple] 
Use this to add a customized amount of
alive bonus.
If left as None (which is the default), additional alive
bonus will not be added.
If given as a tuple 
None 
action_noise_stdev 
Optional[float] 
If given as a real number 
None 
num_actors 
Union[int, str] 
Number of actors to create for parallelized
evaluation of the solutions.
One can also set this as "max", which means that
an actor will be created on each available CPU.
When the parallelization is enabled each actor will have its
own instance of the 
None 
actor_config 
Optional[dict] 
A dictionary, representing the keyword arguments
to be passed to the options(...) used when creating the
ray actor objects. To be used for explicitly allocating
resources per each actor.
For example, for declaring that each actor is to use a GPU,
one can pass 
None 
num_subbatches 
Optional[int] 
If 
None 
subbatch_size 
Optional[int] 
If 
None 
initial_bounds 
Union[Iterable[Union[float, Iterable[float], torch.Tensor]], evotorch.core.BoundsPair] 
Specifies an interval from which the values of the initial policy parameters will be drawn. 
(1e05, 1e05) 
Source code in evotorch/neuroevolution/gymne.py
def __init__(
self,
env: Optional[Union[str, Callable]] = None,
network: Optional[Union[str, nn.Module, Callable[[], nn.Module]]] = None,
*,
env_name: Optional[Union[str, Callable]] = None,
network_args: Optional[dict] = None,
env_config: Optional[Mapping] = None,
observation_normalization: bool = False,
num_episodes: int = 1,
episode_length: Optional[int] = None,
decrease_rewards_by: Optional[float] = None,
alive_bonus_schedule: Optional[tuple] = None,
action_noise_stdev: Optional[float] = None,
num_actors: Optional[Union[int, str]] = None,
actor_config: Optional[dict] = None,
num_subbatches: Optional[int] = None,
subbatch_size: Optional[int] = None,
initial_bounds: Optional[BoundsPairLike] = (0.00001, 0.00001),
):
"""
`__init__(...)`: Initialize the GymNE.
Args:
env: The gym environment to solve. Expected as a Callable
(maybe a function returning a gym.Env, or maybe a gym.Env
subclass), or as a string referring to a gym environment
ID (e.g. "Antv4", "Humanoidv4", etc.).
network: A network structure string, or a Callable (which can be
a class inheriting from `torch.nn.Module`, or a function
which returns a `torch.nn.Module` instance), or an instance
of `torch.nn.Module`.
The object provided here determines the structure of the
neural network policy whose parameters will be evolved.
A network structure string is a string which can be processed
by `evotorch.neuroevolution.net.str_to_net(...)`.
Please see the documentation of the function
`evotorch.neuroevolution.net.str_to_net(...)` to see how such
a neural network structure string looks like.
Note that this network can be a recurrent network.
When the network's `forward(...)` method can optionally accept
an additional positional argument for the hidden state of the
network and returns an additional value for its next state,
then the policy is treated as a recurrent one.
When the network is given as a callable object (e.g.
a subclass of `nn.Module` or a function) and this callable
object is decorated via `evotorch.decorators.pass_info`,
the following keyword arguments will be passed:
(i) `obs_length` (the length of the observation vector),
(ii) `act_length` (the length of the action vector),
(iii) `obs_shape` (the shape tuple of the observation space),
(iv) `act_shape` (the shape tuple of the action space),
(v) `obs_space` (the Box object specifying the observation
space, and
(vi) `act_space` (the Box object specifying the action
space). Note that `act_space` will always be given as a
`gym.spaces.Box` instance, even when the actual gym
environment has a discrete action space. This because `GymNE`
always expects the neural network to return a tensor of
floatingpoint numbers.
env_name: Deprecated alias for the keyword argument `env`.
It is recommended to use the argument `env` instead.
network_args: Optionally a dictlike object, storing keyword
arguments to be passed to the network while instantiating it.
env_config: Keyword arguments to pass to `gym.make(...)` while
creating the `gym` environment.
observation_normalization: Whether or not to do online observation
normalization.
num_episodes: Number of episodes over which a single solution will
be evaluated.
episode_length: Maximum amount of simulator interactions allowed
in a single episode. If left as None, whether or not an episode
is terminated is determined only by the `gym` environment
itself.
decrease_rewards_by: Some gym env.s are defined in such a way that
the agent gets a constant reward for each timestep
it survives. This constant reward can also be called
"survival bonus". Such a rewarding scheme can lead the
evolution to local optima where the agent does nothing
but does not die either, just to collect the survival
bonuses. To prevent this, it can be desired to
remove the survival bonuses from each reward obtained.
If this is the case with the problem at hand,
the user can set the argument `decrease_rewards_by`
to a positive float number, and that number will
be subtracted from each reward.
alive_bonus_schedule: Use this to add a customized amount of
alive bonus.
If left as None (which is the default), additional alive
bonus will not be added.
If given as a tuple `(t, b)`, an alive bonus `b` will be
added onto all the rewards beyond the timestep `t`.
If given as a tuple `(t0, t1, b)`, a partial (linearly
increasing towards `b`) alive bonus will be added onto
all the rewards between the timesteps `t0` and `t1`,
and a full alive bonus (which equals to `b`) will be added
onto all the rewards beyond the timestep `t1`.
action_noise_stdev: If given as a real number `s`, then, for
each generated action, Gaussian noise with standard
deviation `s` will be sampled, and then this sampled noise
will be added onto the action.
If action noise is not desired, then this argument can be
left as None.
num_actors: Number of actors to create for parallelized
evaluation of the solutions.
One can also set this as "max", which means that
an actor will be created on each available CPU.
When the parallelization is enabled each actor will have its
own instance of the `gym` environment.
actor_config: A dictionary, representing the keyword arguments
to be passed to the options(...) used when creating the
ray actor objects. To be used for explicitly allocating
resources per each actor.
For example, for declaring that each actor is to use a GPU,
one can pass `actor_config=dict(num_gpus=1)`.
Can also be given as None (which is the default),
if no such options are to be passed.
num_subbatches: If `num_subbatches` is None (assuming that
`subbatch_size` is also None), then, when evaluating a
population, the population will be split into n pieces, `n`
being the number of actors, and each actor will evaluate
its assigned piece. If `num_subbatches` is an integer `m`,
then the population will be split into `m` pieces,
and actors will continually accept the next unevaluated
piece as they finish their current tasks.
The arguments `num_subbatches` and `subbatch_size` cannot
be given values other than None at the same time.
subbatch_size: If `subbatch_size` is None (assuming that
`num_subbatches` is also None), then, when evaluating a
population, the population will be split into `n` pieces, `n`
being the number of actors, and each actor will evaluate its
assigned piece. If `subbatch_size` is an integer `m`,
then the population will be split into pieces of size `m`,
and actors will continually accept the next unevaluated
piece as they finish their current tasks.
When there can be significant difference across the solutions
in terms of computational requirements, specifying a
`subbatch_size` can be beneficial, because, while one
actor is busy with a subbatch containing computationally
challenging solutions, other actors can accept more
tasks and save time.
The arguments `num_subbatches` and `subbatch_size` cannot
be given values other than None at the same time.
initial_bounds: Specifies an interval from which the values of the
initial policy parameters will be drawn.
"""
# Store various environment information
if (env is not None) and (env_name is None):
self._env_maker = env
elif (env is None) and (env_name is not None):
self._env_maker = env_name
elif (env is not None) and (env_name is not None):
raise ValueError(
f"Received values for both `env` ({repr(env)}) and `env_name` ({repr(env_name)})."
f" Please specify the environment to solve via only one of these arguments, not both."
)
else:
raise ValueError("Environment name is missing. Please specify it via the argument `env`.")
# Make sure that the network argument is not missing.
if network is None:
raise ValueError(
"Received None via the argument `network`."
"Please provide the network as a string, or as a `Callable`, or as a `torch.nn.Module` instance."
)
# Store various environment information
self._env_config = {} if env_config is None else deepcopy(dict(env_config))
self._decrease_rewards_by = 0.0 if decrease_rewards_by is None else float(decrease_rewards_by)
self._alive_bonus_schedule = alive_bonus_schedule
self._action_noise_stdev = None if action_noise_stdev is None else float(action_noise_stdev)
self._observation_normalization = bool(observation_normalization)
self._num_episodes = int(num_episodes)
self._episode_length = None if episode_length is None else int(episode_length)
self._info_keys = dict(cumulative_reward="avg", interaction_count="sum")
self._env: Optional[gym.Env] = None
self._obs_stats: Optional[RunningStat] = None
self._collected_stats: Optional[RunningStat] = None
# Create a temporary environment to read its dimensions
tmp_env = _make_env(self._env_maker, **(self._env_config))
# Store the temporary environment's dimensions
self._obs_length = len(tmp_env.observation_space.low)
if isinstance(tmp_env.action_space, gym.spaces.Discrete):
self._act_length = tmp_env.action_space.n
self._box_act_space = gym.spaces.Box(low=float("inf"), high=float("inf"), shape=(self._act_length,))
else:
self._act_length = len(tmp_env.action_space.low)
self._box_act_space = tmp_env.action_space
self._act_space = tmp_env.action_space
self._obs_space = tmp_env.observation_space
self._obs_shape = tmp_env.observation_space.low.shape
# Validate the space types of the environment
ensure_space_types(tmp_env)
if self._observation_normalization:
self._obs_stats = RunningStat()
self._collected_stats = RunningStat()
else:
self._obs_stats = None
self._collected_stats = None
self._interaction_count: int = 0
self._episode_count: int = 0
super().__init__(
objective_sense="max", # RL is maximization
network=network, # Using the policy as the network
network_args=network_args,
initial_bounds=initial_bounds,
num_actors=num_actors,
actor_config=actor_config,
subbatch_size=subbatch_size,
device="cpu",
)
self.after_eval_hook.append(self._extra_status)
get_env(self)
¶
get_observation_stats(self)
¶
pop_observation_stats(self)
¶
run(self, policy, *, update_stats=False, visualize=False, num_episodes=None, decrease_rewards_by=None)
¶
Evaluate the policy on the gym environment.
Parameters:
Name  Type  Description  Default 

policy 
Union[torch.nn.modules.module.Module, Iterable] 
The policy to be evaluated. This can be a torch module or a sequence of real numbers representing the parameters of a policy network. 
required 
update_stats 
bool 
Whether or not to update the observation normalization data while running the policy. If observation normalization is not enabled, then this argument will be ignored. 
False 
visualize 
bool 
Whether or not to render the environment while running the policy. 
False 
num_episodes 
Optional[int] 
Over how many episodes will the policy be evaluated.
Expected as None (which is the default), or as an integer.
If given as None, then the 
None 
decrease_rewards_by 
Optional[float] 
How much each reward value should be
decreased. If left as None, the 
None 
Returns:
Type  Description 

dict 
A dictionary containing the score and the timestep count. 
Source code in evotorch/neuroevolution/gymne.py
def run(
self,
policy: Union[nn.Module, Iterable],
*,
update_stats: bool = False,
visualize: bool = False,
num_episodes: Optional[int] = None,
decrease_rewards_by: Optional[float] = None,
) > dict:
"""
Evaluate the policy on the gym environment.
Args:
policy: The policy to be evaluated. This can be a torch module
or a sequence of real numbers representing the parameters
of a policy network.
update_stats: Whether or not to update the observation
normalization data while running the policy. If observation
normalization is not enabled, then this argument will be
ignored.
visualize: Whether or not to render the environment while running
the policy.
num_episodes: Over how many episodes will the policy be evaluated.
Expected as None (which is the default), or as an integer.
If given as None, then the `num_episodes` value that was given
while initializing this GymNE will be used.
decrease_rewards_by: How much each reward value should be
decreased. If left as None, the `decrease_rewards_by` value
value that was given while initializing this GymNE will be
used.
Returns:
A dictionary containing the score and the timestep count.
"""
if not isinstance(policy, nn.Module):
policy = self.make_net(policy)
if num_episodes is None:
num_episodes = self._num_episodes
try:
policy.eval()
episode_results = [
self._rollout(
policy=policy,
update_stats=update_stats,
visualize=visualize,
decrease_rewards_by=decrease_rewards_by,
)
for _ in range(num_episodes)
]
results = _accumulate_all_across_dicts(episode_results, self._info_keys)
return results
finally:
policy.train()
save_solution(self, solution, fname)
¶
Save the solution into a pickle file.
Among the saved data within the pickle file are the solution
(as a PyTorch tensor), the policy (as a torch.nn.Module
instance),
and observation stats (if any).
Parameters:
Name  Type  Description  Default 

solution 
Iterable 
The solution to be saved. This can be a PyTorch tensor,
a 
required 
fname 
Union[str, pathlib.Path] 
The file name of the pickle file to be created. 
required 
Source code in evotorch/neuroevolution/gymne.py
def save_solution(self, solution: Iterable, fname: Union[str, Path]):
"""
Save the solution into a pickle file.
Among the saved data within the pickle file are the solution
(as a PyTorch tensor), the policy (as a `torch.nn.Module` instance),
and observation stats (if any).
Args:
solution: The solution to be saved. This can be a PyTorch tensor,
a `Solution` instance, or any `Iterable`.
fname: The file name of the pickle file to be created.
"""
# Convert the solution to a PyTorch tensor on the cpu.
if isinstance(solution, torch.Tensor):
solution = solution.to("cpu")
elif isinstance(solution, Solution):
solution = solution.values.clone().to("cpu")
else:
solution = torch.as_tensor(solution, dtype=torch.float32, device="cpu")
if isinstance(solution, ReadOnlyTensor):
solution = solution.as_subclass(torch.Tensor)
policy = self.to_policy(solution).to("cpu")
# Store the solution and the policy.
result = {
"solution": solution,
"policy": policy,
}
# If available, store the observation stats.
if self.observation_normalization and (self._obs_stats is not None):
result["obs_mean"] = torch.as_tensor(self._obs_stats.mean)
result["obs_stdev"] = torch.as_tensor(self._obs_stats.stdev)
result["obs_sum"] = torch.as_tensor(self._obs_stats.sum)
result["obs_sum_of_squares"] = torch.as_tensor(self._obs_stats.sum_of_squares)
# Some additional data.
result["interaction_count"] = self.interaction_count
result["episode_count"] = self.episode_count
result["time"] = datetime.now()
# If the environment is specified via a string ID, then store that ID.
if isinstance(self._env_maker, str):
result["env"] = self._env_maker
# Save the dictionary which stores the data.
with open(fname, "wb") as f:
pickle.dump(result, f)
set_episode_count(self, n)
¶
set_interaction_count(self, n)
¶
set_observation_stats(self, rs)
¶
to_policy(self, x, *, clip_actions=True)
¶
Convert the given parameter vector to a policy as a PyTorch module.
If the problem is configured to have observation normalization, the PyTorch module also contains an additional normalization layer.
Parameters:
Name  Type  Description  Default 

x 
Iterable 
An sequence of real numbers, containing the parameters of a policy. Can be a PyTorch tensor, a numpy array, or a Solution. 
required 
clip_actions 
bool 
Whether or not to add an action clipping layer so that the generated actions will always be within an acceptable range for the environment. 
True 
Returns:
Type  Description 

Module 
The policy expressed by the parameters. 
Source code in evotorch/neuroevolution/gymne.py
def to_policy(self, x: Iterable, *, clip_actions: bool = True) > nn.Module:
"""
Convert the given parameter vector to a policy as a PyTorch module.
If the problem is configured to have observation normalization,
the PyTorch module also contains an additional normalization layer.
Args:
x: An sequence of real numbers, containing the parameters
of a policy. Can be a PyTorch tensor, a numpy array,
or a Solution.
clip_actions: Whether or not to add an action clipping layer so
that the generated actions will always be within an
acceptable range for the environment.
Returns:
The policy expressed by the parameters.
"""
policy = self.make_net(x)
if self.observation_normalization and (self._obs_stats.count > 0):
policy = ObsNormWrapperModule(policy, self._obs_stats)
if clip_actions and isinstance(self._get_env().action_space, gym.spaces.Box):
policy = ActClipWrapperModule(policy, self._get_env().action_space)
return policy
update_observation_stats(self, rs)
¶
visualize(self, policy, *, update_stats=False, num_episodes=1, decrease_rewards_by=None)
¶
Evaluate the policy and render its actions in the environment.
Parameters:
Name  Type  Description  Default 

policy 
Union[torch.nn.modules.module.Module, Iterable] 
The policy to be evaluated. This can be a torch module or a sequence of real numbers representing the parameters of a policy network. 
required 
update_stats 
bool 
Whether or not to update the observation normalization data while running the policy. If observation normalization is not enabled, then this argument will be ignored. 
False 
num_episodes 
Optional[int] 
Over how many episodes will the policy be evaluated.
Expected as None (which is the default), or as an integer.
If given as None, then the 
1 
decrease_rewards_by 
Optional[float] 
How much each reward value should be
decreased. If left as None, the 
None 
Returns:
Type  Description 

dict 
A dictionary containing the score and the timestep count. 
Source code in evotorch/neuroevolution/gymne.py
def visualize(
self,
policy: Union[nn.Module, Iterable],
*,
update_stats: bool = False,
num_episodes: Optional[int] = 1,
decrease_rewards_by: Optional[float] = None,
) > dict:
"""
Evaluate the policy and render its actions in the environment.
Args:
policy: The policy to be evaluated. This can be a torch module
or a sequence of real numbers representing the parameters
of a policy network.
update_stats: Whether or not to update the observation
normalization data while running the policy. If observation
normalization is not enabled, then this argument will be
ignored.
num_episodes: Over how many episodes will the policy be evaluated.
Expected as None (which is the default), or as an integer.
If given as None, then the `num_episodes` value that was given
while initializing this GymNE will be used.
decrease_rewards_by: How much each reward value should be
decreased. If left as None, the `decrease_rewards_by` value
value that was given while initializing this GymNE will be
used.
Returns:
A dictionary containing the score and the timestep count.
"""
return self.run(
policy=policy,
update_stats=update_stats,
visualize=True,
num_episodes=num_episodes,
decrease_rewards_by=decrease_rewards_by,
)
neproblem
¶
This namespace contains the NeuroevolutionProblem
class.
NEProblem (BaseNEProblem)
¶
Base class for neuroevolution problems where the goal is to optimize the parameters of a neural network represented as a PyTorch module.
Any problem inheriting from this class is expected to override the method
_evaluate_network(self, net: torch.nn.Module) > Union[torch.Tensor, float]
where net
is the neural network to be evaluated, and the return value
is a scalar or a vector (for multiobjective cases) expressing the
fitness value(s).
Alternatively, this class can be directly instantiated in the following way:
def f(module: MyTorchModuleClass) > Union[float, torch.Tensor, tuple]:
# Evaluate the given PyTorch module here
fitness = ...
return fitness
problem = NEProblem("min", MyTorchModuleClass, f, ...)
which specifies that the problem's goal is to minimize the return of the
function f
.
For multiobjective cases, the fitness returned by f
is expected as a
1dimensional tensor. For when the problem has additional evaluation data,
a twoelement tuple can be returned by f
instead, where the first
element is the fitness value(s) and the second element is a 1dimensional
tensor storing the additional data.
Source code in evotorch/neuroevolution/neproblem.py
class NEProblem(BaseNEProblem):
"""
Base class for neuroevolution problems where the goal is to optimize the
parameters of a neural network represented as a PyTorch module.
Any problem inheriting from this class is expected to override the method
`_evaluate_network(self, net: torch.nn.Module) > Union[torch.Tensor, float]`
where `net` is the neural network to be evaluated, and the return value
is a scalar or a vector (for multiobjective cases) expressing the
fitness value(s).
Alternatively, this class can be directly instantiated in the following
way:
```python
def f(module: MyTorchModuleClass) > Union[float, torch.Tensor, tuple]:
# Evaluate the given PyTorch module here
fitness = ...
return fitness
problem = NEProblem("min", MyTorchModuleClass, f, ...)
```
which specifies that the problem's goal is to minimize the return of the
function `f`.
For multiobjective cases, the fitness returned by `f` is expected as a
1dimensional tensor. For when the problem has additional evaluation data,
a twoelement tuple can be returned by `f` instead, where the first
element is the fitness value(s) and the second element is a 1dimensional
tensor storing the additional data.
"""
def __init__(
self,
objective_sense: ObjectiveSense,
network: Union[str, nn.Module, Callable[[], nn.Module]],
network_eval_func: Optional[Callable] = None,
*,
network_args: Optional[dict] = None,
initial_bounds: Optional[BoundsPairLike] = (0.00001, 0.00001),
eval_dtype: Optional[DType] = None,
eval_data_length: int = 0,
seed: Optional[int] = None,
num_actors: Optional[Union[int, str]] = None,
actor_config: Optional[dict] = None,
num_gpus_per_actor: Optional[Union[int, float, str]] = None,
num_subbatches: Optional[int] = None,
subbatch_size: Optional[int] = None,
device: Optional[Device] = None,
):
"""
`__init__(...)`: Initialize the NEProblem.
Args:
objective_sense: The objective sense, expected as "min" or "max"
for singleobjective cases, or as a sequence of strings
(each string being "min" or "max") for multiobjective cases.
network: A network structure string, or a Callable (which can be
a class inheriting from `torch.nn.Module`, or a function
which returns a `torch.nn.Module` instance), or an instance
of `torch.nn.Module`.
The object provided here determines the structure of the
neural network whose parameters will be evolved.
A network structure string is a string which can be processed
by `evotorch.neuroevolution.net.str_to_net(...)`.
Please see the documentation of the function
`evotorch.neuroevolution.net.str_to_net(...)` to see how such
a neural network structure string looks like.
network_eval_func: Optionally a function (or any Callable object)
which receives a PyTorch module as its argument, and returns
either a fitness, or a twoelement tuple containing the fitness
and the additional evaluation data. The fitness can be a scalar
(for singleobjective cases) or a 1dimensional tensor (for
multiobjective cases). The additional evaluation data is
expected as a 1dimensional tensor.
If this argument is left as None, it will be expected that
the method `_evaluate_network(...)` is overriden by the
inheriting class.
network_args: Optionally a dictlike object, storing keyword
arguments to be passed to the network while instantiating it.
initial_bounds: Specifies an interval from which the values of the
initial neural network parameters will be drawn.
eval_dtype: dtype to be used for fitnesses. If not specified, then
`eval_dtype` will be inferred from the dtype of the parameters
of the neural network.
In more details, if the neural network's parameters have a
float dtype, `eval_dtype` will be a compatible float.
Otherwise, it will be "float32".
eval_data_length: Length of the extra evaluation data.
seed: Random number seed. If left as None, this NEProblem instance
will not have its own random generator, and the global random
generator of PyTorch will be used instead.
num_actors: Number of actors to create for parallelized
evaluation of the solutions.
Certain string values are also accepted.
When given as "max" or as "num_cpus", the number of actors
will be equal to the number of all available CPUs in the ray
cluster.
When given as "num_gpus", the number of actors will be
equal to the number of all available GPUs in the ray
cluster, and each actor will be assigned a GPU.
When given as "num_devices", the number of actors will be
equal to the minimum among the number of CPUs and the number
of GPUs available in the cluster (or will be equal to the
number of CPUs if there is no GPU), and each actor will be
assigned a GPU (if available).
If `num_actors` is given as "num_gpus" or "num_devices",
the argument `num_gpus_per_actor` must not be used,
and the `actor_config` dictionary must not contain the
key "num_gpus".
If `num_actors` is given as something other than "num_gpus"
or "num_devices", and if you wish to assign GPUs to each
actor, then please see the argument `num_gpus_per_actor`.
actor_config: A dictionary, representing the keyword arguments
to be passed to the options(...) used when creating the
ray actor objects. To be used for explicitly allocating
resources per each actor.
For example, for declaring that each actor is to use a GPU,
one can pass `actor_config=dict(num_gpus=1)`.
Can also be given as None (which is the default),
if no such options are to be passed.
num_gpus_per_actor: Number of GPUs to be allocated by each
remote actor.
The default behavior is to NOT allocate any GPU at all
(which is the default behavior of the ray library as well).
When given as a number `n`, each actor will be given
`n` GPUs (where `n` can be an integer, or can be a `float`
for fractional allocation).
When given as a string "max", then the available GPUs
across the entire ray cluster (or within the local computer
in the simplest cases) will be equally distributed among
the actors.
When given as a string "all", then each actor will have
access to all the GPUs (this will be achieved by suppressing
the environment variable `CUDA_VISIBLE_DEVICES` for each
actor).
When the problem is not distributed (i.e. when there are
no actors), this argument is expected to be left as None.
num_subbatches: If `num_subbatches` is None (assuming that
`subbatch_size` is also None), then, when evaluating a
population, the population will be split into n pieces, `n`
being the number of actors, and each actor will evaluate
its assigned piece. If `num_subbatches` is an integer `m`,
then the population will be split into `m` pieces,
and actors will continually accept the next unevaluated
piece as they finish their current tasks.
The arguments `num_subbatches` and `subbatch_size` cannot
be given values other than None at the same time.
While using a distributed algorithm, this argument determines
how many subbatches will be generated, and therefore,
how many gradients will be computed by the remote actors.
subbatch_size: If `subbatch_size` is None (assuming that
`num_subbatches` is also None), then, when evaluating a
population, the population will be split into `n` pieces, `n`
being the number of actors, and each actor will evaluate its
assigned piece. If `subbatch_size` is an integer `m`,
then the population will be split into pieces of size `m`,
and actors will continually accept the next unevaluated
piece as they finish their current tasks.
When there can be significant difference across the solutions
in terms of computational requirements, specifying a
`subbatch_size` can be beneficial, because, while one
actor is busy with a subbatch containing computationally
challenging solutions, other actors can accept more
tasks and save time.
The arguments `num_subbatches` and `subbatch_size` cannot
be given values other than None at the same time.
While using a distributed algorithm, this argument determines
the size of a subbatch (or subpopulation) sampled by a
remote actor for computing a gradient.
In distributed mode, it is expected that the population size
is divisible by `subbatch_size`.
device: Default device in which a new population will be generated
and the neural networks will operate.
If not specified, "cpu" will be used.
"""
# Set the main device of the problem
# Although the operation of setting the main device is done by the main Problem class,
# here we need this at an earlier stage.
if device is None:
device = "cpu"
self._device = torch.device(device)
# Set the network
self._original_network = network
self._network_args = {} if network_args is None else deepcopy(network_args)
if isinstance(self._original_network, nn.Module):
self._original_network = self._original_network.cpu()
# Store the function that will evaluate the network, if available
self._network_eval_func: Optional[Callable] = network_eval_func
self.instantiated_network: nn.Module = None
# Create temporary network
temp_network = self._instantiate_net(self._original_network, device="cpu")
super().__init__(
objective_sense=objective_sense,
initial_bounds=initial_bounds,
bounds=None, # Neuroevolution is an unbounded problem
solution_length=count_parameters(temp_network), # The solution length is inherited from the network passed
dtype=next(temp_network.parameters()).dtype, # The datatype is inherited from the network passed
eval_dtype=eval_dtype,
device=device,
eval_data_length=eval_data_length,
seed=seed,
num_actors=num_actors,
num_gpus_per_actor=num_gpus_per_actor,
actor_config=actor_config,
num_subbatches=num_subbatches,
subbatch_size=subbatch_size,
store_solution_stats=None,
)
@property
def network_device(self) > Device:
"""The device on which the problem should place data e.g. the network"""
cpu_device = torch.device("cpu")
if self.is_main:
# This is the case where this is the main process (not a remote actor)
if self.device == cpu_device:
# If the main device of the problem is "cpu", then we assume that the network is going to be on the cpu as well
return cpu_device
else:
# If the main device of the problem is some other device, then it is that device into which the network will be put
return self.device
else:
# If this is a remote actor, then the network will be put into the auxiliary device allocated for that actor
return self.aux_device
@property
def _str_network_constants(self) > dict:
"""
Named constants which will be passed to `str_to_net`.
To be overridden by the user for custom fixed constants for a problem.
"""
return {}
@property
def _network_constants(self) > dict:
"""
Named constants which will be passed to the network instantiation.
To be overridden by the user for custom fixed constants for a problem.
"""
return {}
def network_constants(self) > dict:
"""Named constants which can be passed to the network instantiation"""
constants = {}
constants.update(self._network_constants)
constants.update(self._network_args)
return constants
@property
def _nonserialized_attribs(self) > List[str]:
return ["instantiated_network"]
def _instantiate_net(self, network: Union[str, nn.Module, dict], device: Optional[Device] = None) > nn.Module:
"""Instantiate the network on the target device, to be overridden by the user for custom behaviour
Returns:
instantiated_network (nn.Module): The network instantiated on the target device
"""
# Branching point determines instantiation of network
if isinstance(network, str):
# Passed argument was a string representation of a torch module
net_consts = {}
net_consts.update(self.network_constants())
net_consts.update(self._str_network_constants)
instantiated_network = str_to_net(network, **net_consts)
elif isinstance(network, nn.Module):
# Passed argument was directly a torch module
instantiated_network = network
else:
# Passed argument was callable yielding network
instantiated_network = pass_info_if_needed(network, self._network_constants)(**self._network_args)
# Map to device
device = self.network_device if device is None else device
instantiated_network = instantiated_network.to(device)
return instantiated_network
def _prepare(self) > None:
"""Instantiate the network on the target device, if not already done"""
self.instantiated_network = self._instantiate_net(self._original_network)
# Clear reference to original network
self._original_network = None
def make_net(self, parameters: Iterable) > nn.Module:
"""
Make a new network filled with the provided parameters.
Args:
parameters: Parameters to be used as weights within the network.
Can be a Solution, or any 1dimensional Iterable that can be
converted to a PyTorch tensor.
Returns:
A new network, as a `torch.Module` instance.
"""
if isinstance(parameters, Solution):
parameters = parameters.access_values(keep_evals=True)
else:
parameters = self.as_tensor(parameters)
with torch.no_grad():
net = deepcopy(self.parameterize_net(parameters))
return net
def parameterize_net(self, parameters: torch.Tensor) > nn.Module:
"""Parameterize the network with a given set of parameters.
Args:
parameters (torch.Tensor): The parameters with which to instantiate the network
Returns:
instantiated_network (nn.Module): The network instantiated with the parameters
"""
# Check if network exists
if self.instantiated_network is None:
self.instantiated_network = self._instantiate_net(self._original_network)
network = self.instantiated_network
# Move the parameters if needed
if parameters.device != self.network_device:
parameters = parameters.to(self.network_device)
# Fill the network with the parameters
fill_parameters(network, parameters)
# Return the network
return network
@property
def _grad_device(self) > Device:
"""
Get the device in which new solutions will be made in distributed mode.
In more details, in distributed mode, each actor creates its own
subpopulations, evaluates them, and computes its own gradient
(all such actor gradients eventually being collected by the
distributionbased search algorithm in the main process).
For some problem types, it can make sense for the remote actors to
create their temporary subpopulations on another device
(e.g. on the GPU that is allocated specifically for them).
For such situations, one is encouraged to override this property
and make it return whatever device is to be used.
In the case of NEProblem, this property returns whatever device
is specified by the property `network_device`.
"""
return self.network_device
def _evaluate_network(self, network: nn.Module) > Union[float, torch.Tensor, tuple]:
"""
Evaluate a network and return the evaluation result(s).
In the case where the `__init__` of `NEProblem` was not given
a network evaluator function (via the argument `network_eval_func`),
it will be expected that the inheriting class overrides this
method and defines how a network should be evaluated.
Args:
network (nn.Module): The network to evaluate
Returns:
fitness: The networks' fitness value(s), as a scalar for
singleobjective cases, or as a 1dimensional tensor
for multiobjective cases. The returned value can also
be a twoelement tuple where the first element is the
fitness (as a scalar or as a vector) and the second
element is a 1dimensional vector storing the extra
evaluation data.
"""
raise NotImplementedError
def _evaluate(self, solution: Solution):
"""
Evaluate a single solution.
This is achieved by parameterising the problem's attribute
named `instantiated_network`, and then evaluating the network
with the method `_evaluate_network(...)`.
Args:
solution (Solution): The solution to evaluate.
"""
parameters = solution.values
if self._network_eval_func is None:
evaluator = self._evaluate_network
else:
evaluator = self._network_eval_func
fitnesses = evaluator(self.parameterize_net(parameters))
if isinstance(fitnesses, tuple):
solution.set_evals(*fitnesses)
else:
solution.set_evals(fitnesses)
network_device: Union[str, torch.device]
property
readonly
¶
The device on which the problem should place data e.g. the network
__init__(self, objective_sense, network, network_eval_func=None, *, network_args=None, initial_bounds=(1e05, 1e05), eval_dtype=None, eval_data_length=0, seed=None, num_actors=None, actor_config=None, num_gpus_per_actor=None, num_subbatches=None, subbatch_size=None, device=None)
special
¶
__init__(...)
: Initialize the NEProblem.
Parameters:
Name  Type  Description  Default 

objective_sense 
Union[str, Iterable[str]] 
The objective sense, expected as "min" or "max" for singleobjective cases, or as a sequence of strings (each string being "min" or "max") for multiobjective cases. 
required 
network 
Union[str, torch.nn.modules.module.Module, Callable[[], torch.nn.modules.module.Module]] 
A network structure string, or a Callable (which can be
a class inheriting from 
required 
network_eval_func 
Optional[Callable] 
Optionally a function (or any Callable object)
which receives a PyTorch module as its argument, and returns
either a fitness, or a twoelement tuple containing the fitness
and the additional evaluation data. The fitness can be a scalar
(for singleobjective cases) or a 1dimensional tensor (for
multiobjective cases). The additional evaluation data is
expected as a 1dimensional tensor.
If this argument is left as None, it will be expected that
the method 
None 
network_args 
Optional[dict] 
Optionally a dictlike object, storing keyword arguments to be passed to the network while instantiating it. 
None 
initial_bounds 
Union[Iterable[Union[float, Iterable[float], torch.Tensor]], evotorch.core.BoundsPair] 
Specifies an interval from which the values of the initial neural network parameters will be drawn. 
(1e05, 1e05) 
eval_dtype 
Union[str, torch.dtype, numpy.dtype, Type] 
dtype to be used for fitnesses. If not specified, then

None 
eval_data_length 
int 
Length of the extra evaluation data. 
0 
seed 
Optional[int] 
Random number seed. If left as None, this NEProblem instance will not have its own random generator, and the global random generator of PyTorch will be used instead. 
None 
num_actors 
Union[int, str] 
Number of actors to create for parallelized
evaluation of the solutions.
Certain string values are also accepted.
When given as "max" or as "num_cpus", the number of actors
will be equal to the number of all available CPUs in the ray
cluster.
When given as "num_gpus", the number of actors will be
equal to the number of all available GPUs in the ray
cluster, and each actor will be assigned a GPU.
When given as "num_devices", the number of actors will be
equal to the minimum among the number of CPUs and the number
of GPUs available in the cluster (or will be equal to the
number of CPUs if there is no GPU), and each actor will be
assigned a GPU (if available).
If 
None 
actor_config 
Optional[dict] 
A dictionary, representing the keyword arguments
to be passed to the options(...) used when creating the
ray actor objects. To be used for explicitly allocating
resources per each actor.
For example, for declaring that each actor is to use a GPU,
one can pass 
None 
num_gpus_per_actor 
Union[int, float, str] 
Number of GPUs to be allocated by each
remote actor.
The default behavior is to NOT allocate any GPU at all
(which is the default behavior of the ray library as well).
When given as a number 
None 
num_subbatches 
Optional[int] 
If 
None 
subbatch_size 
Optional[int] 
If 
None 
device 
Union[str, torch.device] 
Default device in which a new population will be generated and the neural networks will operate. If not specified, "cpu" will be used. 
None 
Source code in evotorch/neuroevolution/neproblem.py
def __init__(
self,
objective_sense: ObjectiveSense,
network: Union[str, nn.Module, Callable[[], nn.Module]],
network_eval_func: Optional[Callable] = None,
*,
network_args: Optional[dict] = None,
initial_bounds: Optional[BoundsPairLike] = (0.00001, 0.00001),
eval_dtype: Optional[DType] = None,
eval_data_length: int = 0,
seed: Optional[int] = None,
num_actors: Optional[Union[int, str]] = None,
actor_config: Optional[dict] = None,
num_gpus_per_actor: Optional[Union[int, float, str]] = None,
num_subbatches: Optional[int] = None,
subbatch_size: Optional[int] = None,
device: Optional[Device] = None,
):
"""
`__init__(...)`: Initialize the NEProblem.
Args:
objective_sense: The objective sense, expected as "min" or "max"
for singleobjective cases, or as a sequence of strings
(each string being "min" or "max") for multiobjective cases.
network: A network structure string, or a Callable (which can be
a class inheriting from `torch.nn.Module`, or a function
which returns a `torch.nn.Module` instance), or an instance
of `torch.nn.Module`.
The object provided here determines the structure of the
neural network whose parameters will be evolved.
A network structure string is a string which can be processed
by `evotorch.neuroevolution.net.str_to_net(...)`.
Please see the documentation of the function
`evotorch.neuroevolution.net.str_to_net(...)` to see how such
a neural network structure string looks like.
network_eval_func: Optionally a function (or any Callable object)
which receives a PyTorch module as its argument, and returns
either a fitness, or a twoelement tuple containing the fitness
and the additional evaluation data. The fitness can be a scalar
(for singleobjective cases) or a 1dimensional tensor (for
multiobjective cases). The additional evaluation data is
expected as a 1dimensional tensor.
If this argument is left as None, it will be expected that
the method `_evaluate_network(...)` is overriden by the
inheriting class.
network_args: Optionally a dictlike object, storing keyword
arguments to be passed to the network while instantiating it.
initial_bounds: Specifies an interval from which the values of the
initial neural network parameters will be drawn.
eval_dtype: dtype to be used for fitnesses. If not specified, then
`eval_dtype` will be inferred from the dtype of the parameters
of the neural network.
In more details, if the neural network's parameters have a
float dtype, `eval_dtype` will be a compatible float.
Otherwise, it will be "float32".
eval_data_length: Length of the extra evaluation data.
seed: Random number seed. If left as None, this NEProblem instance
will not have its own random generator, and the global random
generator of PyTorch will be used instead.
num_actors: Number of actors to create for parallelized
evaluation of the solutions.
Certain string values are also accepted.
When given as "max" or as "num_cpus", the number of actors
will be equal to the number of all available CPUs in the ray
cluster.
When given as "num_gpus", the number of actors will be
equal to the number of all available GPUs in the ray
cluster, and each actor will be assigned a GPU.
When given as "num_devices", the number of actors will be
equal to the minimum among the number of CPUs and the number
of GPUs available in the cluster (or will be equal to the
number of CPUs if there is no GPU), and each actor will be
assigned a GPU (if available).
If `num_actors` is given as "num_gpus" or "num_devices",
the argument `num_gpus_per_actor` must not be used,
and the `actor_config` dictionary must not contain the
key "num_gpus".
If `num_actors` is given as something other than "num_gpus"
or "num_devices", and if you wish to assign GPUs to each
actor, then please see the argument `num_gpus_per_actor`.
actor_config: A dictionary, representing the keyword arguments
to be passed to the options(...) used when creating the
ray actor objects. To be used for explicitly allocating
resources per each actor.
For example, for declaring that each actor is to use a GPU,
one can pass `actor_config=dict(num_gpus=1)`.
Can also be given as None (which is the default),
if no such options are to be passed.
num_gpus_per_actor: Number of GPUs to be allocated by each
remote actor.
The default behavior is to NOT allocate any GPU at all
(which is the default behavior of the ray library as well).
When given as a number `n`, each actor will be given
`n` GPUs (where `n` can be an integer, or can be a `float`
for fractional allocation).
When given as a string "max", then the available GPUs
across the entire ray cluster (or within the local computer
in the simplest cases) will be equally distributed among
the actors.
When given as a string "all", then each actor will have
access to all the GPUs (this will be achieved by suppressing
the environment variable `CUDA_VISIBLE_DEVICES` for each
actor).
When the problem is not distributed (i.e. when there are
no actors), this argument is expected to be left as None.
num_subbatches: If `num_subbatches` is None (assuming that
`subbatch_size` is also None), then, when evaluating a
population, the population will be split into n pieces, `n`
being the number of actors, and each actor will evaluate
its assigned piece. If `num_subbatches` is an integer `m`,
then the population will be split into `m` pieces,
and actors will continually accept the next unevaluated
piece as they finish their current tasks.
The arguments `num_subbatches` and `subbatch_size` cannot
be given values other than None at the same time.
While using a distributed algorithm, this argument determines
how many subbatches will be generated, and therefore,
how many gradients will be computed by the remote actors.
subbatch_size: If `subbatch_size` is None (assuming that
`num_subbatches` is also None), then, when evaluating a
population, the population will be split into `n` pieces, `n`
being the number of actors, and each actor will evaluate its
assigned piece. If `subbatch_size` is an integer `m`,
then the population will be split into pieces of size `m`,
and actors will continually accept the next unevaluated
piece as they finish their current tasks.
When there can be significant difference across the solutions
in terms of computational requirements, specifying a
`subbatch_size` can be beneficial, because, while one
actor is busy with a subbatch containing computationally
challenging solutions, other actors can accept more
tasks and save time.
The arguments `num_subbatches` and `subbatch_size` cannot
be given values other than None at the same time.
While using a distributed algorithm, this argument determines
the size of a subbatch (or subpopulation) sampled by a
remote actor for computing a gradient.
In distributed mode, it is expected that the population size
is divisible by `subbatch_size`.
device: Default device in which a new population will be generated
and the neural networks will operate.
If not specified, "cpu" will be used.
"""
# Set the main device of the problem
# Although the operation of setting the main device is done by the main Problem class,
# here we need this at an earlier stage.
if device is None:
device = "cpu"
self._device = torch.device(device)
# Set the network
self._original_network = network
self._network_args = {} if network_args is None else deepcopy(network_args)
if isinstance(self._original_network, nn.Module):
self._original_network = self._original_network.cpu()
# Store the function that will evaluate the network, if available
self._network_eval_func: Optional[Callable] = network_eval_func
self.instantiated_network: nn.Module = None
# Create temporary network
temp_network = self._instantiate_net(self._original_network, device="cpu")
super().__init__(
objective_sense=objective_sense,
initial_bounds=initial_bounds,
bounds=None, # Neuroevolution is an unbounded problem
solution_length=count_parameters(temp_network), # The solution length is inherited from the network passed
dtype=next(temp_network.parameters()).dtype, # The datatype is inherited from the network passed
eval_dtype=eval_dtype,
device=device,
eval_data_length=eval_data_length,
seed=seed,
num_actors=num_actors,
num_gpus_per_actor=num_gpus_per_actor,
actor_config=actor_config,
num_subbatches=num_subbatches,
subbatch_size=subbatch_size,
store_solution_stats=None,
)
make_net(self, parameters)
¶
Make a new network filled with the provided parameters.
Parameters:
Name  Type  Description  Default 

parameters 
Iterable 
Parameters to be used as weights within the network. Can be a Solution, or any 1dimensional Iterable that can be converted to a PyTorch tensor. 
required 
Returns:
Type  Description 

Module 
A new network, as a 
Source code in evotorch/neuroevolution/neproblem.py
def make_net(self, parameters: Iterable) > nn.Module:
"""
Make a new network filled with the provided parameters.
Args:
parameters: Parameters to be used as weights within the network.
Can be a Solution, or any 1dimensional Iterable that can be
converted to a PyTorch tensor.
Returns:
A new network, as a `torch.Module` instance.
"""
if isinstance(parameters, Solution):
parameters = parameters.access_values(keep_evals=True)
else:
parameters = self.as_tensor(parameters)
with torch.no_grad():
net = deepcopy(self.parameterize_net(parameters))
return net
network_constants(self)
¶
Named constants which can be passed to the network instantiation
parameterize_net(self, parameters)
¶
Parameterize the network with a given set of parameters.
Parameters:
Name  Type  Description  Default 

parameters 
torch.Tensor 
The parameters with which to instantiate the network 
required 
Returns:
Type  Description 

instantiated_network (nn.Module) 
The network instantiated with the parameters 
Source code in evotorch/neuroevolution/neproblem.py
def parameterize_net(self, parameters: torch.Tensor) > nn.Module:
"""Parameterize the network with a given set of parameters.
Args:
parameters (torch.Tensor): The parameters with which to instantiate the network
Returns:
instantiated_network (nn.Module): The network instantiated with the parameters
"""
# Check if network exists
if self.instantiated_network is None:
self.instantiated_network = self._instantiate_net(self._original_network)
network = self.instantiated_network
# Move the parameters if needed
if parameters.device != self.network_device:
parameters = parameters.to(self.network_device)
# Fill the network with the parameters
fill_parameters(network, parameters)
# Return the network
return network
net
special
¶
Utility classes and functions for neural networks
functional
¶
ModuleExpectingFlatParameters
¶
A wrapper which brings a functional interface around a torch module.
Similar to functorch.FunctionalModule
, ModuleExpectingFlatParameters
turns a torch.nn.Module
instance to a function which expects a new
leftmost argument representing the parameters of the network.
Unlike functorch.FunctionalModule
, a ModuleExpectingFlatParameters
instance, as its name suggests, expects the network parameters to be
given as a 1dimensional (i.e. flattened) tensor.
Also, unlike functorch.FunctionalModule
, an instance of
ModuleExpectingFlatParameters
is NOT an instance of torch.nn.Module
.
PyTorch modules with buffers can be wrapped by this class, but it is assumed that those buffers are constant. If the wrapped module changes the value(s) of its buffer(s) during its forward passes, most probably things will NOT work right.
As an example, let us consider the following linear layer.
The functional counterpart of net
can be obtained via:
from evotorch.neuroevolution.net import ModuleExpectingFlatParameters
fnet = ModuleExpectingFlatParameters(net)
Now, fnet
is a callable object which expects network parameters
and network inputs. Let us call fnet
with randomly generated network
parameters and with a randomly generated input tensor.
param_length = fnet.parameter_length
random_parameters = torch.randn(param_length)
random_input = torch.randn(3)
result = fnet(random_parameters, random_input)
Source code in evotorch/neuroevolution/net/functional.py
class ModuleExpectingFlatParameters:
"""
A wrapper which brings a functional interface around a torch module.
Similar to `functorch.FunctionalModule`, `ModuleExpectingFlatParameters`
turns a `torch.nn.Module` instance to a function which expects a new
leftmost argument representing the parameters of the network.
Unlike `functorch.FunctionalModule`, a `ModuleExpectingFlatParameters`
instance, as its name suggests, expects the network parameters to be
given as a 1dimensional (i.e. flattened) tensor.
Also, unlike `functorch.FunctionalModule`, an instance of
`ModuleExpectingFlatParameters` is NOT an instance of `torch.nn.Module`.
PyTorch modules with buffers can be wrapped by this class, but it is
assumed that those buffers are constant. If the wrapped module changes
the value(s) of its buffer(s) during its forward passes, most probably
things will NOT work right.
As an example, let us consider the following linear layer.
```python
import torch
from torch import nn
net = nn.Linear(3, 8)
```
The functional counterpart of `net` can be obtained via:
```python
from evotorch.neuroevolution.net import ModuleExpectingFlatParameters
fnet = ModuleExpectingFlatParameters(net)
```
Now, `fnet` is a callable object which expects network parameters
and network inputs. Let us call `fnet` with randomly generated network
parameters and with a randomly generated input tensor.
```python
param_length = fnet.parameter_length
random_parameters = torch.randn(param_length)
random_input = torch.randn(3)
result = fnet(random_parameters, random_input)
```
"""
@torch.no_grad()
def __init__(self, net: nn.Module, *, disable_autograd_tracking: bool = False):
"""
`__init__(...)`: Initialize the `ModuleExpectingFlatParameters` instance.
Args:
net: The module that is to be wrapped by a functional interface.
disable_autograd_tracking: If given as True, all operations
regarding the wrapped module will be performed in the context
`torch.no_grad()`, forcefully disabling the autograd.
If given as False, autograd will not be affected.
The default is False.
"""
# Declare the variables which will store information regarding the parameters of the module.
self.__param_names = []
self.__param_shapes = []
self.__param_length = 0
self.__param_slices = []
self.__num_params = 0
# Iterate over the parameters of the module and fill the related information.
i = 0
j = 0
for pname, p in net.named_parameters():
self.__param_names.append(pname)
shape = p.shape
self.__param_shapes.append(shape)
length = _shape_length(shape)
self.__param_length += length
j = i + length
self.__param_slices.append(slice(i, j))
i = j
self.__num_params += 1
self.__buffer_dict = {bname: b.clone() for bname, b in net.named_buffers()}
self.__net = deepcopy(net)
self.__net.to("meta")
self.__disable_autograd_tracking = bool(disable_autograd_tracking)
def __transfer_buffers(self, x: torch.Tensor):
"""
Transfer the buffer tensors to the device of the given tensor.
Args:
x: The tensor whose device will also store the buffer tensors.
"""
for bname in self.__buffer_dict.keys():
self.__buffer_dict[bname] = torch.as_tensor(self.__buffer_dict[bname], device=x.device)
@property
def buffers(self) > tuple:
"""Get the stored buffers"""
return tuple(self.__buffer_dict)
@property
def parameter_length(self) > int:
return self.__param_length
def __call__(self, parameter_vector: torch.Tensor, x: torch.Tensor, h: Any = None) > Any:
"""
Call the wrapped module's forward pass procedure.
Args:
parameter_vector: A 1dimensional tensor which represents the
parameters of the tensor.
x: The inputs.
h: Hidden state(s), in case this is a recurrent network.
Returns:
The result of the forward pass.
"""
if parameter_vector.ndim != 1:
raise ValueError(
f"Expected the parameters as 1 dimensional,"
f" but the received parameter vector has {parameter_vector.ndim} dimensions"
)
if len(parameter_vector) != self.__param_length:
raise ValueError(
f"Expected a parameter vector of length {self.__param_length},"
f" but the received parameter vector's length is {len(parameter_vector)}."
)
state_args = [] if h is None else [h]
params_and_buffers = {}
for i, pname in enumerate(self.__param_names):
param_slice = self.__param_slices[i]
param_shape = self.__param_shapes[i]
param = parameter_vector[param_slice].reshape(param_shape)
params_and_buffers[pname] = param
# Make sure that the buffer tensors are in the same device with x
self.__transfer_buffers(x)
# Add the buffer tensors to the dictionary `params_and_buffers`
params_and_buffers.update(self.__buffer_dict)
# Prepare the nogradient context if gradient tracking is disabled
context = torch.no_grad() if self.__disable_autograd_tracking else nullcontext()
# Run the module and return the results
with context:
return functional_call(self.__net, params_and_buffers, tuple([x, *state_args]))
buffers: tuple
property
readonly
¶
Get the stored buffers
__call__(self, parameter_vector, x, h=None)
special
¶
Call the wrapped module's forward pass procedure.
Parameters:
Name  Type  Description  Default 

parameter_vector 
Tensor 
A 1dimensional tensor which represents the parameters of the tensor. 
required 
x 
Tensor 
The inputs. 
required 
h 
Any 
Hidden state(s), in case this is a recurrent network. 
None 
Returns:
Type  Description 

Any 
The result of the forward pass. 
Source code in evotorch/neuroevolution/net/functional.py
def __call__(self, parameter_vector: torch.Tensor, x: torch.Tensor, h: Any = None) > Any:
"""
Call the wrapped module's forward pass procedure.
Args:
parameter_vector: A 1dimensional tensor which represents the
parameters of the tensor.
x: The inputs.
h: Hidden state(s), in case this is a recurrent network.
Returns:
The result of the forward pass.
"""
if parameter_vector.ndim != 1:
raise ValueError(
f"Expected the parameters as 1 dimensional,"
f" but the received parameter vector has {parameter_vector.ndim} dimensions"
)
if len(parameter_vector) != self.__param_length:
raise ValueError(
f"Expected a parameter vector of length {self.__param_length},"
f" but the received parameter vector's length is {len(parameter_vector)}."
)
state_args = [] if h is None else [h]
params_and_buffers = {}
for i, pname in enumerate(self.__param_names):
param_slice = self.__param_slices[i]
param_shape = self.__param_shapes[i]
param = parameter_vector[param_slice].reshape(param_shape)
params_and_buffers[pname] = param
# Make sure that the buffer tensors are in the same device with x
self.__transfer_buffers(x)
# Add the buffer tensors to the dictionary `params_and_buffers`
params_and_buffers.update(self.__buffer_dict)
# Prepare the nogradient context if gradient tracking is disabled
context = torch.no_grad() if self.__disable_autograd_tracking else nullcontext()
# Run the module and return the results
with context:
return functional_call(self.__net, params_and_buffers, tuple([x, *state_args]))
__init__(self, net, *, disable_autograd_tracking=False)
special
¶
__init__(...)
: Initialize the ModuleExpectingFlatParameters
instance.
Parameters:
Name  Type  Description  Default 

net 
Module 
The module that is to be wrapped by a functional interface. 
required 
disable_autograd_tracking 
bool 
If given as True, all operations
regarding the wrapped module will be performed in the context

False 
Source code in evotorch/neuroevolution/net/functional.py
@torch.no_grad()
def __init__(self, net: nn.Module, *, disable_autograd_tracking: bool = False):
"""
`__init__(...)`: Initialize the `ModuleExpectingFlatParameters` instance.
Args:
net: The module that is to be wrapped by a functional interface.
disable_autograd_tracking: If given as True, all operations
regarding the wrapped module will be performed in the context
`torch.no_grad()`, forcefully disabling the autograd.
If given as False, autograd will not be affected.
The default is False.
"""
# Declare the variables which will store information regarding the parameters of the module.
self.__param_names = []
self.__param_shapes = []
self.__param_length = 0
self.__param_slices = []
self.__num_params = 0
# Iterate over the parameters of the module and fill the related information.
i = 0
j = 0
for pname, p in net.named_parameters():
self.__param_names.append(pname)
shape = p.shape
self.__param_shapes.append(shape)
length = _shape_length(shape)
self.__param_length += length
j = i + length
self.__param_slices.append(slice(i, j))
i = j
self.__num_params += 1
self.__buffer_dict = {bname: b.clone() for bname, b in net.named_buffers()}
self.__net = deepcopy(net)
self.__net.to("meta")
self.__disable_autograd_tracking = bool(disable_autograd_tracking)
make_functional_module(net, *, disable_autograd_tracking=False)
¶
Wrap a torch module so that it has a functional interface.
Similar to functorch.make_functional(...)
, this function turns a
torch.nn.Module
instance to a function which expects a new leftmost
argument representing the parameters of the network.
Unlike with functorch.make_functional(...)
, the parameters of the
network are expected in a 1dimensional (i.e. flattened) tensor.
PyTorch modules with buffers can be wrapped by this class, but it is assumed that those buffers are constant. If the wrapped module changes the value(s) of its buffer(s) during its forward passes, most probably things will NOT work right.
As an example, let us consider the following linear layer.
The functional counterpart of net
can be obtained via:
Now, fnet
is a callable object which expects network parameters
and network inputs. Let us call fnet
with randomly generated network
parameters and with a randomly generated input tensor.
param_length = fnet.parameter_length
random_parameters = torch.randn(param_length)
random_input = torch.randn(3)
result = fnet(random_parameters, random_input)
Parameters:
Name  Type  Description  Default 

net 
Module 
The 
required 
disable_autograd_tracking 
bool 
If given as True, all operations
regarding the wrapped module will be performed in the context

False 
Returns:
Type  Description 

ModuleExpectingFlatParameters 
The functional wrapper, as an instance of

Source code in evotorch/neuroevolution/net/functional.py
def make_functional_module(net: nn.Module, *, disable_autograd_tracking: bool = False) > ModuleExpectingFlatParameters:
"""
Wrap a torch module so that it has a functional interface.
Similar to `functorch.make_functional(...)`, this function turns a
`torch.nn.Module` instance to a function which expects a new leftmost
argument representing the parameters of the network.
Unlike with `functorch.make_functional(...)`, the parameters of the
network are expected in a 1dimensional (i.e. flattened) tensor.
PyTorch modules with buffers can be wrapped by this class, but it is
assumed that those buffers are constant. If the wrapped module changes
the value(s) of its buffer(s) during its forward passes, most probably
things will NOT work right.
As an example, let us consider the following linear layer.
```python
import torch
from torch import nn
net = nn.Linear(3, 8)
```
The functional counterpart of `net` can be obtained via:
```python
from evotorch.neuroevolution.net import make_functional_module
fnet = make_functional_module(net)
```
Now, `fnet` is a callable object which expects network parameters
and network inputs. Let us call `fnet` with randomly generated network
parameters and with a randomly generated input tensor.
```python
param_length = fnet.parameter_length
random_parameters = torch.randn(param_length)
random_input = torch.randn(3)
result = fnet(random_parameters, random_input)
```
Args:
net: The `torch.nn.Module` instance to be wrapped by a functional
interface.
disable_autograd_tracking: If given as True, all operations
regarding the wrapped module will be performed in the context
`torch.no_grad()`, forcefully disabling the autograd.
If given as False, autograd will not be affected.
The default is False.
Returns:
The functional wrapper, as an instance of
`evotorch.neuroevolution.net.ModuleExpectingFlatParameters`.
"""
return ModuleExpectingFlatParameters(net, disable_autograd_tracking=disable_autograd_tracking)
layers
¶
Various neural network layer types
Apply (Module)
¶
A torch module for applying an arithmetic operator on an input tensor
Source code in evotorch/neuroevolution/net/layers.py
class Apply(nn.Module):
"""A torch module for applying an arithmetic operator on an input tensor"""
def __init__(self, operator: str, argument: float):
"""`__init__(...)`: Initialize the Apply module.
Args:
operator: Must be '+', '', '*', '/', or '**'.
Indicates which operation will be done
on the input tensor.
argument: Expected as a float, represents
the rightargument of the operation
(the leftargument being the input
tensor).
"""
nn.Module.__init__(self)
self._operator = str(operator)
assert self._operator in ("+", "", "*", "/", "**")
self._argument = float(argument)
def forward(self, x):
op = self._operator
arg = self._argument
if op == "+":
return x + arg
elif op == "":
return x  arg
elif op == "*":
return x * arg
elif op == "/":
return x / arg
elif op == "**":
return x**arg
else:
raise ValueError("Unknown operator:" + repr(op))
def extra_repr(self):
return "operator={}, argument={}".format(repr(self._operator), self._argument)
__init__(self, operator, argument)
special
¶
__init__(...)
: Initialize the Apply module.
Parameters:
Name  Type  Description  Default 

operator 
str 
Must be '+', '', '', '/', or '*'. Indicates which operation will be done on the input tensor. 
required 
argument 
float 
Expected as a float, represents the rightargument of the operation (the leftargument being the input tensor). 
required 
Source code in evotorch/neuroevolution/net/layers.py
def __init__(self, operator: str, argument: float):
"""`__init__(...)`: Initialize the Apply module.
Args:
operator: Must be '+', '', '*', '/', or '**'.
Indicates which operation will be done
on the input tensor.
argument: Expected as a float, represents
the rightargument of the operation
(the leftargument being the input
tensor).
"""
nn.Module.__init__(self)
self._operator = str(operator)
assert self._operator in ("+", "", "*", "/", "**")
self._argument = float(argument)
extra_repr(self)
¶
Set the extra representation of the module
To print customized extra information, you should reimplement this method in your own modules. Both singleline and multiline strings are acceptable.
forward(self, x)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
Although the recipe for forward pass needs to be defined within
this function, one should call the :class:Module
instance afterwards
instead of this since the former takes care of running the
registered hooks while the latter silently ignores them.
Source code in evotorch/neuroevolution/net/layers.py
Bin (Module)
¶
A small torch module for binning the values of tensors.
In more details, considering a lower bound value lb, an upper bound value ub, and an input tensor x, each value within x closer to lb will be converted to lb and each value within x closer to ub will be converted to ub.
Source code in evotorch/neuroevolution/net/layers.py
class Bin(nn.Module):
"""A small torch module for binning the values of tensors.
In more details, considering a lower bound value lb,
an upper bound value ub, and an input tensor x,
each value within x closer to lb will be converted to lb
and each value within x closer to ub will be converted to ub.
"""
def __init__(self, lb: float, ub: float):
"""`__init__(...)`: Initialize the Clip operator.
Args:
lb: Lower bound
ub: Upper bound
"""
nn.Module.__init__(self)
self._lb = float(lb)
self._ub = float(ub)
self._interval_size = self._ub  self._lb
self._shrink_amount = self._interval_size / 2.0
self._shift_amount = (self._ub + self._lb) / 2.0
def forward(self, x: torch.Tensor):
x = x  self._shift_amount
x = x / self._shrink_amount
x = torch.sign(x)
x = x * self._shrink_amount
x = x + self._shift_amount
return x
def extra_repr(self):
return "lb={}, ub={}".format(self._lb, self._ub)
__init__(self, lb, ub)
special
¶
__init__(...)
: Initialize the Clip operator.
Parameters:
Name  Type  Description  Default 

lb 
float 
Lower bound 
required 
ub 
float 
Upper bound 
required 
Source code in evotorch/neuroevolution/net/layers.py
def __init__(self, lb: float, ub: float):
"""`__init__(...)`: Initialize the Clip operator.
Args:
lb: Lower bound
ub: Upper bound
"""
nn.Module.__init__(self)
self._lb = float(lb)
self._ub = float(ub)
self._interval_size = self._ub  self._lb
self._shrink_amount = self._interval_size / 2.0
self._shift_amount = (self._ub + self._lb) / 2.0
extra_repr(self)
¶
Set the extra representation of the module
To print customized extra information, you should reimplement this method in your own modules. Both singleline and multiline strings are acceptable.
forward(self, x)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
Although the recipe for forward pass needs to be defined within
this function, one should call the :class:Module
instance afterwards
instead of this since the former takes care of running the
registered hooks while the latter silently ignores them.
Clip (Module)
¶
A small torch module for clipping the values of tensors
Source code in evotorch/neuroevolution/net/layers.py
class Clip(nn.Module):
"""A small torch module for clipping the values of tensors"""
def __init__(self, lb: float, ub: float):
"""`__init__(...)`: Initialize the Clip operator.
Args:
lb: Lower bound. Values less than this will be clipped.
ub: Upper bound. Values greater than this will be clipped.
"""
nn.Module.__init__(self)
self._lb = float(lb)
self._ub = float(ub)
def forward(self, x: torch.Tensor):
return x.clamp(self._lb, self._ub)
def extra_repr(self):
return "lb={}, ub={}".format(self._lb, self._ub)
__init__(self, lb, ub)
special
¶
__init__(...)
: Initialize the Clip operator.
Parameters:
Name  Type  Description  Default 

lb 
float 
Lower bound. Values less than this will be clipped. 
required 
ub 
float 
Upper bound. Values greater than this will be clipped. 
required 
Source code in evotorch/neuroevolution/net/layers.py
extra_repr(self)
¶
Set the extra representation of the module
To print customized extra information, you should reimplement this method in your own modules. Both singleline and multiline strings are acceptable.
forward(self, x)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
Although the recipe for forward pass needs to be defined within
this function, one should call the :class:Module
instance afterwards
instead of this since the former takes care of running the
registered hooks while the latter silently ignores them.
FeedForwardNet (Module)
¶
Representation of a feed forward neural network as a torch Module.
An example initialization of a FeedForwardNet is as follows:
net = drt.FeedForwardNet(4, [(8, 'tanh'), (6, 'tanh')])
which means that we would like to have a network which expects an input vector of length 4 and passes its input through 2 tanhactivated hidden layers (with neurons count 8 and 6, respectively). The output of the last hidden layer (of length 6) is the final output vector.
The string representation of the module obtained via the example above is:
FeedForwardNet(
(layer_0): Linear(in_features=4, out_features=8, bias=True)
(actfunc_0): Tanh()
(layer_1): Linear(in_features=8, out_features=6, bias=True)
(actfunc_1): Tanh()
)
Source code in evotorch/neuroevolution/net/layers.py
class FeedForwardNet(nn.Module):
"""
Representation of a feed forward neural network as a torch Module.
An example initialization of a FeedForwardNet is as follows:
net = drt.FeedForwardNet(4, [(8, 'tanh'), (6, 'tanh')])
which means that we would like to have a network which expects an input
vector of length 4 and passes its input through 2 tanhactivated hidden
layers (with neurons count 8 and 6, respectively).
The output of the last hidden layer (of length 6) is the final
output vector.
The string representation of the module obtained via the example above
is:
FeedForwardNet(
(layer_0): Linear(in_features=4, out_features=8, bias=True)
(actfunc_0): Tanh()
(layer_1): Linear(in_features=8, out_features=6, bias=True)
(actfunc_1): Tanh()
)
"""
LengthActTuple = Tuple[int, Union[str, Callable]]
LengthActBiasTuple = Tuple[int, Union[str, Callable], Union[bool]]
def __init__(self, input_size: int, layers: List[Union[LengthActTuple, LengthActBiasTuple]]):
"""`__init__(...)`: Initialize the FeedForward network.
Args:
input_size: Input size of the network, expected as an int.
layers: Expected as a list of tuples,
where each tuple is either of the form
`(layer_size, activation_function)`
or of the form
`(layer_size, activation_function, bias)`
in which
(i) `layer_size` is an int, specifying the number of neurons;
(ii) `activation_function` is None, or a callable object,
or a string containing the name of the activation function
('relu', 'selu', 'elu', 'tanh', 'hardtanh', or 'sigmoid');
(iii) `bias` is a boolean, specifying whether the layer
is to have a bias or not.
When omitted, bias is set to True.
"""
nn.Module.__init__(self)
for i, layer in enumerate(layers):
if len(layer) == 2:
size, actfunc = layer
bias = True
elif len(layer) == 3:
size, actfunc, bias = layer
else:
assert False, "A layer tuple of invalid size is encountered"
setattr(self, "layer_" + str(i), nn.Linear(input_size, size, bias=bias))
if isinstance(actfunc, str):
if actfunc == "relu":
actfunc = nn.ReLU()
elif actfunc == "selu":
actfunc = nn.SELU()
elif actfunc == "elu":
actfunc = nn.ELU()
elif actfunc == "tanh":
actfunc = nn.Tanh()
elif actfunc == "hardtanh":
actfunc = nn.Hardtanh()
elif actfunc == "sigmoid":
actfunc = nn.Sigmoid()
elif actfunc == "round":
actfunc = Round()
else:
raise ValueError("Unknown activation function: " + repr(actfunc))
setattr(self, "actfunc_" + str(i), actfunc)
input_size = size
def forward(self, x):
i = 0
while hasattr(self, "layer_" + str(i)):
x = getattr(self, "layer_" + str(i))(x)
f = getattr(self, "actfunc_" + str(i))
if f is not None:
x = f(x)
i += 1
return x
__init__(self, input_size, layers)
special
¶
__init__(...)
: Initialize the FeedForward network.
Parameters:
Name  Type  Description  Default 

input_size 
int 
Input size of the network, expected as an int. 
required 
layers 
List[Union[Tuple[int, Union[str, Callable]], Tuple[int, Union[str, Callable], bool]]] 
Expected as a list of tuples,
where each tuple is either of the form

required 
Source code in evotorch/neuroevolution/net/layers.py
def __init__(self, input_size: int, layers: List[Union[LengthActTuple, LengthActBiasTuple]]):
"""`__init__(...)`: Initialize the FeedForward network.
Args:
input_size: Input size of the network, expected as an int.
layers: Expected as a list of tuples,
where each tuple is either of the form
`(layer_size, activation_function)`
or of the form
`(layer_size, activation_function, bias)`
in which
(i) `layer_size` is an int, specifying the number of neurons;
(ii) `activation_function` is None, or a callable object,
or a string containing the name of the activation function
('relu', 'selu', 'elu', 'tanh', 'hardtanh', or 'sigmoid');
(iii) `bias` is a boolean, specifying whether the layer
is to have a bias or not.
When omitted, bias is set to True.
"""
nn.Module.__init__(self)
for i, layer in enumerate(layers):
if len(layer) == 2:
size, actfunc = layer
bias = True
elif len(layer) == 3:
size, actfunc, bias = layer
else:
assert False, "A layer tuple of invalid size is encountered"
setattr(self, "layer_" + str(i), nn.Linear(input_size, size, bias=bias))
if isinstance(actfunc, str):
if actfunc == "relu":
actfunc = nn.ReLU()
elif actfunc == "selu":
actfunc = nn.SELU()
elif actfunc == "elu":
actfunc = nn.ELU()
elif actfunc == "tanh":
actfunc = nn.Tanh()
elif actfunc == "hardtanh":
actfunc = nn.Hardtanh()
elif actfunc == "sigmoid":
actfunc = nn.Sigmoid()
elif actfunc == "round":
actfunc = Round()
else:
raise ValueError("Unknown activation function: " + repr(actfunc))
setattr(self, "actfunc_" + str(i), actfunc)
input_size = size
forward(self, x)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
Although the recipe for forward pass needs to be defined within
this function, one should call the :class:Module
instance afterwards
instead of this since the former takes care of running the
registered hooks while the latter silently ignores them.
LSTM (Module)
¶
Source code in evotorch/neuroevolution/net/layers.py
class LSTM(nn.Module):
def __init__(
self,
input_size: int,
hidden_size: int,
*,
dtype: torch.dtype = torch.float32,
device: Union[str, torch.device] = "cpu",
):
super().__init__()
input_size = int(input_size)
hidden_size = int(hidden_size)
self.input_size = input_size
self.hidden_size = hidden_size
def input_weight():
return nn.Parameter(torch.randn(self.hidden_size, self.input_size, dtype=dtype, device=device))
def weight():
return nn.Parameter(torch.randn(self.hidden_size, self.hidden_size, dtype=dtype, device=device))
def bias():
return nn.Parameter(torch.zeros(self.hidden_size, dtype=dtype, device=device))
self.W_ii = input_weight()
self.W_if = input_weight()
self.W_ig = input_weight()
self.W_io = input_weight()
self.W_hi = weight()
self.W_hf = weight()
self.W_hg = weight()
self.W_ho = weight()
self.b_ii = bias()
self.b_if = bias()
self.b_ig = bias()
self.b_io = bias()
self.b_hi = bias()
self.b_hf = bias()
self.b_hg = bias()
self.b_ho = bias()
def forward(self, x: torch.Tensor, hidden=None) > tuple:
sigm = torch.sigmoid
tanh = torch.tanh
if hidden is None:
h_prev = torch.zeros(self.hidden_size, dtype=x.dtype, device=x.device)
c_prev = torch.zeros(self.hidden_size, dtype=x.dtype, device=x.device)
else:
h_prev, c_prev = hidden
i_t = sigm(self.W_ii @ x + self.b_ii + self.W_hi @ h_prev + self.b_hi)
f_t = sigm(self.W_if @ x + self.b_if + self.W_hf @ h_prev + self.b_hf)
g_t = tanh(self.W_ig @ x + self.b_ig + self.W_hg @ h_prev + self.b_hg)
o_t = sigm(self.W_io @ x + self.b_io + self.W_ho @ h_prev + self.b_ho)
c_t = f_t * c_prev + i_t * g_t
h_t = o_t * tanh(c_t)
return h_t, (h_t, c_t)
def __repr__(self) > str:
clsname = type(self).__name__
return f"{clsname}(input_size={self.input_size}, hidden_size={self.hidden_size})"
forward(self, x, hidden=None)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
Although the recipe for forward pass needs to be defined within
this function, one should call the :class:Module
instance afterwards
instead of this since the former takes care of running the
registered hooks while the latter silently ignores them.
Source code in evotorch/neuroevolution/net/layers.py
def forward(self, x: torch.Tensor, hidden=None) > tuple:
sigm = torch.sigmoid
tanh = torch.tanh
if hidden is None:
h_prev = torch.zeros(self.hidden_size, dtype=x.dtype, device=x.device)
c_prev = torch.zeros(self.hidden_size, dtype=x.dtype, device=x.device)
else:
h_prev, c_prev = hidden
i_t = sigm(self.W_ii @ x + self.b_ii + self.W_hi @ h_prev + self.b_hi)
f_t = sigm(self.W_if @ x + self.b_if + self.W_hf @ h_prev + self.b_hf)
g_t = tanh(self.W_ig @ x + self.b_ig + self.W_hg @ h_prev + self.b_hg)
o_t = sigm(self.W_io @ x + self.b_io + self.W_ho @ h_prev + self.b_ho)
c_t = f_t * c_prev + i_t * g_t
h_t = o_t * tanh(c_t)
return h_t, (h_t, c_t)
LSTMNet (Module)
¶
Source code in evotorch/neuroevolution/net/layers.py
class LSTM(nn.Module):
def __init__(
self,
input_size: int,
hidden_size: int,
*,
dtype: torch.dtype = torch.float32,
device: Union[str, torch.device] = "cpu",
):
super().__init__()
input_size = int(input_size)
hidden_size = int(hidden_size)
self.input_size = input_size
self.hidden_size = hidden_size
def input_weight():
return nn.Parameter(torch.randn(self.hidden_size, self.input_size, dtype=dtype, device=device))
def weight():
return nn.Parameter(torch.randn(self.hidden_size, self.hidden_size, dtype=dtype, device=device))
def bias():
return nn.Parameter(torch.zeros(self.hidden_size, dtype=dtype, device=device))
self.W_ii = input_weight()
self.W_if = input_weight()
self.W_ig = input_weight()
self.W_io = input_weight()
self.W_hi = weight()
self.W_hf = weight()
self.W_hg = weight()
self.W_ho = weight()
self.b_ii = bias()
self.b_if = bias()
self.b_ig = bias()
self.b_io = bias()
self.b_hi = bias()
self.b_hf = bias()
self.b_hg = bias()
self.b_ho = bias()
def forward(self, x: torch.Tensor, hidden=None) > tuple:
sigm = torch.sigmoid
tanh = torch.tanh
if hidden is None:
h_prev = torch.zeros(self.hidden_size, dtype=x.dtype, device=x.device)
c_prev = torch.zeros(self.hidden_size, dtype=x.dtype, device=x.device)
else:
h_prev, c_prev = hidden
i_t = sigm(self.W_ii @ x + self.b_ii + self.W_hi @ h_prev + self.b_hi)
f_t = sigm(self.W_if @ x + self.b_if + self.W_hf @ h_prev + self.b_hf)
g_t = tanh(self.W_ig @ x + self.b_ig + self.W_hg @ h_prev + self.b_hg)
o_t = sigm(self.W_io @ x + self.b_io + self.W_ho @ h_prev + self.b_ho)
c_t = f_t * c_prev + i_t * g_t
h_t = o_t * tanh(c_t)
return h_t, (h_t, c_t)
def __repr__(self) > str:
clsname = type(self).__name__
return f"{clsname}(input_size={self.input_size}, hidden_size={self.hidden_size})"
forward(self, x, hidden=None)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
Although the recipe for forward pass needs to be defined within
this function, one should call the :class:Module
instance afterwards
instead of this since the former takes care of running the
registered hooks while the latter silently ignores them.
Source code in evotorch/neuroevolution/net/layers.py
def forward(self, x: torch.Tensor, hidden=None) > tuple:
sigm = torch.sigmoid
tanh = torch.tanh
if hidden is None:
h_prev = torch.zeros(self.hidden_size, dtype=x.dtype, device=x.device)
c_prev = torch.zeros(self.hidden_size, dtype=x.dtype, device=x.device)
else:
h_prev, c_prev = hidden
i_t = sigm(self.W_ii @ x + self.b_ii + self.W_hi @ h_prev + self.b_hi)
f_t = sigm(self.W_if @ x + self.b_if + self.W_hf @ h_prev + self.b_hf)
g_t = tanh(self.W_ig @ x + self.b_ig + self.W_hg @ h_prev + self.b_hg)
o_t = sigm(self.W_io @ x + self.b_io + self.W_ho @ h_prev + self.b_ho)
c_t = f_t * c_prev + i_t * g_t
h_t = o_t * tanh(c_t)
return h_t, (h_t, c_t)
LocomotorNet (Module)
¶
This is a control network which consists of two components: one linear, and one nonlinear. The nonlinear component is an inputindependent set of sinusoidals waves whose amplitudes, frequencies and phases are trainable. Upon execution of a forward pass, the output of the nonlinear component is the sum of all these sinusoidal waves. The linear component is a linear layer (optionally with bias) whose weights (and biases) are trainable. The final output of the LocomotorNet at the end of a forward pass is the sum of the linear and the nonlinear components.
Note that this is a stateful network, where the only state
is the timestep t, which starts from 0 and gets incremented by 1
at the end of each forward pass. The reset()
method resets
t back to 0.
Reference
Mario Srouji, Jian Zhang, Ruslan Salakhutdinov (2018). Structured Control Nets for Deep Reinforcement Learning.
Source code in evotorch/neuroevolution/net/layers.py
class LocomotorNet(nn.Module):
"""LocomotorNet: A locomotionspecific structured control net.
This is a control network which consists of two components:
one linear, and one nonlinear. The nonlinear component
is an inputindependent set of sinusoidals waves whose
amplitudes, frequencies and phases are trainable.
Upon execution of a forward pass, the output of the nonlinear
component is the sum of all these sinusoidal waves.
The linear component is a linear layer (optionally with bias)
whose weights (and biases) are trainable.
The final output of the LocomotorNet at the end of a forward pass
is the sum of the linear and the nonlinear components.
Note that this is a stateful network, where the only state
is the timestep t, which starts from 0 and gets incremented by 1
at the end of each forward pass. The `reset()` method resets
t back to 0.
Reference:
Mario Srouji, Jian Zhang, Ruslan Salakhutdinov (2018).
Structured Control Nets for Deep Reinforcement Learning.
"""
def __init__(self, *, in_features: int, out_features: int, bias: bool = True, num_sinusoids=16):
"""`__init__(...)`: Initialize the LocomotorNet.
Args:
in_features: Length of the input vector
out_features: Length of the output vector
bias: Whether or not the linear component is to have a bias
num_sinusoids: Number of sinusoidal waves
"""
nn.Module.__init__(self)
self._in_features = in_features
self._out_features = out_features
self._bias = bias
self._num_sinusoids = num_sinusoids
self._linear_component = nn.Linear(
in_features=self._in_features, out_features=self._out_features, bias=self._bias
)
self._amplitudes = nn.ParameterList()
self._frequencies = nn.ParameterList()
self._phases = nn.ParameterList()
for _ in range(self._num_sinusoids):
for paramlist in (self._amplitudes, self._frequencies, self._phases):
paramlist.append(nn.Parameter(torch.randn(self._out_features, dtype=torch.float32)))
self.reset()
def reset(self):
"""Set the timestep t to 0"""
self._t = 0
@property
def t(self) > int:
"""The current timestep t"""
return self._t
@property
def in_features(self) > int:
"""Get the length of the input vector"""
return self._in_features
@property
def out_features(self) > int:
"""Get the length of the output vector"""
return self._out_features
@property
def num_sinusoids(self) > int:
"""Get the number of sinusoidal waves of the nonlinear component"""
return self._num_sinusoids
@property
def bias(self) > bool:
"""Get whether or not the linear component has bias"""
return self._bias
def forward(self, x: torch.Tensor) > torch.Tensor:
"""Execute a forward pass"""
u_linear = self._linear_component(x)
t = self._t
u_nonlinear = torch.zeros(self._out_features)
for i in range(self._num_sinusoids):
A = self._amplitudes[i]
w = self._frequencies[i]
phi = self._phases[i]
u_nonlinear = u_nonlinear + (A * torch.sin(w * t + phi))
self._t += 1
return u_linear + u_nonlinear
bias: bool
property
readonly
¶
Get whether or not the linear component has bias
in_features: int
property
readonly
¶
Get the length of the input vector
num_sinusoids: int
property
readonly
¶
Get the number of sinusoidal waves of the nonlinear component
out_features: int
property
readonly
¶
Get the length of the output vector
t: int
property
readonly
¶
The current timestep t
__init__(self, *, in_features, out_features, bias=True, num_sinusoids=16)
special
¶
__init__(...)
: Initialize the LocomotorNet.
Parameters:
Name  Type  Description  Default 

in_features 
int 
Length of the input vector 
required 
out_features 
int 
Length of the output vector 
required 
bias 
bool 
Whether or not the linear component is to have a bias 
True 
num_sinusoids 
Number of sinusoidal waves 
16 
Source code in evotorch/neuroevolution/net/layers.py
def __init__(self, *, in_features: int, out_features: int, bias: bool = True, num_sinusoids=16):
"""`__init__(...)`: Initialize the LocomotorNet.
Args:
in_features: Length of the input vector
out_features: Length of the output vector
bias: Whether or not the linear component is to have a bias
num_sinusoids: Number of sinusoidal waves
"""
nn.Module.__init__(self)
self._in_features = in_features
self._out_features = out_features
self._bias = bias
self._num_sinusoids = num_sinusoids
self._linear_component = nn.Linear(
in_features=self._in_features, out_features=self._out_features, bias=self._bias
)
self._amplitudes = nn.ParameterList()
self._frequencies = nn.ParameterList()
self._phases = nn.ParameterList()
for _ in range(self._num_sinusoids):
for paramlist in (self._amplitudes, self._frequencies, self._phases):
paramlist.append(nn.Parameter(torch.randn(self._out_features, dtype=torch.float32)))
self.reset()
forward(self, x)
¶
Execute a forward pass
Source code in evotorch/neuroevolution/net/layers.py
def forward(self, x: torch.Tensor) > torch.Tensor:
"""Execute a forward pass"""
u_linear = self._linear_component(x)
t = self._t
u_nonlinear = torch.zeros(self._out_features)
for i in range(self._num_sinusoids):
A = self._amplitudes[i]
w = self._frequencies[i]
phi = self._phases[i]
u_nonlinear = u_nonlinear + (A * torch.sin(w * t + phi))
self._t += 1
return u_linear + u_nonlinear
reset(self)
¶
RNN (Module)
¶
Source code in evotorch/neuroevolution/net/layers.py
class RNN(nn.Module):
def __init__(
self,
input_size: int,
hidden_size: int,
nonlinearity: str = "tanh",
*,
dtype: torch.dtype = torch.float32,
device: Union[str, torch.device] = "cpu",
):
super().__init__()
input_size = int(input_size)
hidden_size = int(hidden_size)
nonlinearity = str(nonlinearity)
self.W1 = nn.Parameter(torch.randn(hidden_size, input_size, dtype=dtype, device=device))
self.W2 = nn.Parameter(torch.randn(hidden_size, hidden_size, dtype=dtype, device=device))
self.b1 = nn.Parameter(torch.zeros(hidden_size, dtype=dtype, device=device))
self.b2 = nn.Parameter(torch.zeros(hidden_size, dtype=dtype, device=device))
if nonlinearity == "tanh":
self.actfunc = torch.tanh
else:
self.actfunc = getattr(nnf, nonlinearity)
self.nonlinearity = nonlinearity
self.input_size = input_size
self.hidden_size = hidden_size
def forward(self, x: torch.Tensor, h: Optional[torch.Tensor] = None) > tuple:
if h is None:
h = torch.zeros(self.hidden_size, dtype=x.dtype, device=x.device)
act = self.actfunc
W1 = self.W1
W2 = self.W2
b1 = self.b1.unsqueeze(1)
b2 = self.b2.unsqueeze(1)
x = x.unsqueeze(1)
h = h.unsqueeze(1)
y = act(((W1 @ x) + b1) + ((W2 @ h) + b2))
y = y.squeeze(1)
return y, y
def __repr__(self) > str:
clsname = type(self).__name__
return f"{clsname}(input_size={self.input_size}, hidden_size={self.hidden_size}, nonlinearity={repr(self.nonlinearity)})"
forward(self, x, h=None)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
Although the recipe for forward pass needs to be defined within
this function, one should call the :class:Module
instance afterwards
instead of this since the former takes care of running the
registered hooks while the latter silently ignores them.
Source code in evotorch/neuroevolution/net/layers.py
def forward(self, x: torch.Tensor, h: Optional[torch.Tensor] = None) > tuple:
if h is None:
h = torch.zeros(self.hidden_size, dtype=x.dtype, device=x.device)
act = self.actfunc
W1 = self.W1
W2 = self.W2
b1 = self.b1.unsqueeze(1)
b2 = self.b2.unsqueeze(1)
x = x.unsqueeze(1)
h = h.unsqueeze(1)
y = act(((W1 @ x) + b1) + ((W2 @ h) + b2))
y = y.squeeze(1)
return y, y
RecurrentNet (Module)
¶
Source code in evotorch/neuroevolution/net/layers.py
class RNN(nn.Module):
def __init__(
self,
input_size: int,
hidden_size: int,
nonlinearity: str = "tanh",
*,
dtype: torch.dtype = torch.float32,
device: Union[str, torch.device] = "cpu",
):
super().__init__()
input_size = int(input_size)
hidden_size = int(hidden_size)
nonlinearity = str(nonlinearity)
self.W1 = nn.Parameter(torch.randn(hidden_size, input_size, dtype=dtype, device=device))
self.W2 = nn.Parameter(torch.randn(hidden_size, hidden_size, dtype=dtype, device=device))
self.b1 = nn.Parameter(torch.zeros(hidden_size, dtype=dtype, device=device))
self.b2 = nn.Parameter(torch.zeros(hidden_size, dtype=dtype, device=device))
if nonlinearity == "tanh":
self.actfunc = torch.tanh
else:
self.actfunc = getattr(nnf, nonlinearity)
self.nonlinearity = nonlinearity
self.input_size = input_size
self.hidden_size = hidden_size
def forward(self, x: torch.Tensor, h: Optional[torch.Tensor] = None) > tuple:
if h is None:
h = torch.zeros(self.hidden_size, dtype=x.dtype, device=x.device)
act = self.actfunc
W1 = self.W1
W2 = self.W2
b1 = self.b1.unsqueeze(1)
b2 = self.b2.unsqueeze(1)
x = x.unsqueeze(1)
h = h.unsqueeze(1)
y = act(((W1 @ x) + b1) + ((W2 @ h) + b2))
y = y.squeeze(1)
return y, y
def __repr__(self) > str:
clsname = type(self).__name__
return f"{clsname}(input_size={self.input_size}, hidden_size={self.hidden_size}, nonlinearity={repr(self.nonlinearity)})"
forward(self, x, h=None)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
Although the recipe for forward pass needs to be defined within
this function, one should call the :class:Module
instance afterwards
instead of this since the former takes care of running the
registered hooks while the latter silently ignores them.
Source code in evotorch/neuroevolution/net/layers.py
def forward(self, x: torch.Tensor, h: Optional[torch.Tensor] = None) > tuple:
if h is None:
h = torch.zeros(self.hidden_size, dtype=x.dtype, device=x.device)
act = self.actfunc
W1 = self.W1
W2 = self.W2
b1 = self.b1.unsqueeze(1)
b2 = self.b2.unsqueeze(1)
x = x.unsqueeze(1)
h = h.unsqueeze(1)
y = act(((W1 @ x) + b1) + ((W2 @ h) + b2))
y = y.squeeze(1)
return y, y
Round (Module)
¶
A small torch module for rounding the values of an input tensor
Source code in evotorch/neuroevolution/net/layers.py
class Round(nn.Module):
"""A small torch module for rounding the values of an input tensor"""
def __init__(self, ndigits: int = 0):
nn.Module.__init__(self)
self._ndigits = int(ndigits)
self._q = 10.0**self._ndigits
def forward(self, x):
x = x * self._q
x = torch.round(x)
x = x / self._q
return x
def extra_repr(self):
return "ndigits=" + str(self._ndigits)
extra_repr(self)
¶
Set the extra representation of the module
To print customized extra information, you should reimplement this method in your own modules. Both singleline and multiline strings are acceptable.
forward(self, x)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
Although the recipe for forward pass needs to be defined within
this function, one should call the :class:Module
instance afterwards
instead of this since the former takes care of running the
registered hooks while the latter silently ignores them.
Slice (Module)
¶
A small torch module for getting the slice of an input tensor
Source code in evotorch/neuroevolution/net/layers.py
class Slice(nn.Module):
"""A small torch module for getting the slice of an input tensor"""
def __init__(self, from_index: int, to_index: int):
"""`__init__(...)`: Initialize the Slice operator.
Args:
from_index: The index from which the slice begins.
to_index: The exclusive index at which the slice ends.
"""
nn.Module.__init__(self)
self._from_index = from_index
self._to_index = to_index
def forward(self, x):
return x[self._from_index : self._to_index]
def extra_repr(self):
return "from_index={}, to_index={}".format(self._from_index, self._to_index)
__init__(self, from_index, to_index)
special
¶
__init__(...)
: Initialize the Slice operator.
Parameters:
Name  Type  Description  Default 

from_index 
int 
The index from which the slice begins. 
required 
to_index 
int 
The exclusive index at which the slice ends. 
required 
Source code in evotorch/neuroevolution/net/layers.py
extra_repr(self)
¶
Set the extra representation of the module
To print customized extra information, you should reimplement this method in your own modules. Both singleline and multiline strings are acceptable.
forward(self, x)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
Although the recipe for forward pass needs to be defined within
this function, one should call the :class:Module
instance afterwards
instead of this since the former takes care of running the
registered hooks while the latter silently ignores them.
StructuredControlNet (Module)
¶
Structured Control Net.
This is a control network consisting of two components: (i) a nonlinear component, which is a feedforward network; and (ii) a linear component, which is a linear layer. Both components take the input vector provided to the structured control network. The final output is the sum of the outputs of both components.
Reference
Mario Srouji, Jian Zhang, Ruslan Salakhutdinov (2018). Structured Control Nets for Deep Reinforcement Learning.
Source code in evotorch/neuroevolution/net/layers.py
class StructuredControlNet(nn.Module):
"""Structured Control Net.
This is a control network consisting of two components:
(i) a nonlinear component, which is a feedforward network; and
(ii) a linear component, which is a linear layer.
Both components take the input vector provided to the
structured control network.
The final output is the sum of the outputs of both components.
Reference:
Mario Srouji, Jian Zhang, Ruslan Salakhutdinov (2018).
Structured Control Nets for Deep Reinforcement Learning.
"""
def __init__(
self,
*,
in_features: int,
out_features: int,
num_layers: int,
hidden_size: int,
bias: bool = True,
nonlinearity: Union[str, Callable] = "tanh",
):
"""`__init__(...)`: Initialize the structured control net.
Args:
in_features: Length of the input vector
out_features: Length of the output vector
num_layers: Number of hidden layers for the nonlinear component
hidden_size: Number of neurons in a hidden layer of the
nonlinear component
bias: Whether or not the linear component is to have bias
nonlinearity: Activation function
"""
nn.Module.__init__(self)
self._in_features = in_features
self._out_features = out_features
self._num_layers = num_layers
self._hidden_size = hidden_size
self._bias = bias
self._nonlinearity = nonlinearity
self._linear_component = nn.Linear(
in_features=self._in_features, out_features=self._out_features, bias=self._bias
)
self._nonlinear_component = FeedForwardNet(
input_size=self._in_features,
layers=(
list((self._hidden_size, self._nonlinearity) for _ in range(self._num_layers))
+ [(self._out_features, self._nonlinearity)]
),
)
def forward(self, x: torch.Tensor) > torch.Tensor:
"""TODO: documentation"""
return self._linear_component(x) + self._nonlinear_component(x)
@property
def in_features(self):
"""TODO: documentation"""
return self._in_features
@property
def out_features(self):
"""TODO: documentation"""
return self._out_features
@property
def num_layers(self):
"""TODO: documentation"""
return self._num_layers
@property
def hidden_size(self):
"""TODO: documentation"""
return self._hidden_size
@property
def bias(self):
"""TODO: documentation"""
return self._bias
@property
def nonlinearity(self):
"""TODO: documentation"""
return self._nonlinearity
bias
property
readonly
¶
hidden_size
property
readonly
¶
in_features
property
readonly
¶
nonlinearity
property
readonly
¶
num_layers
property
readonly
¶
out_features
property
readonly
¶
__init__(self, *, in_features, out_features, num_layers, hidden_size, bias=True, nonlinearity='tanh')
special
¶
__init__(...)
: Initialize the structured control net.
Parameters:
Name  Type  Description  Default 

in_features 
int 
Length of the input vector 
required 
out_features 
int 
Length of the output vector 
required 
num_layers 
int 
Number of hidden layers for the nonlinear component 
required 
hidden_size 
int 
Number of neurons in a hidden layer of the nonlinear component 
required 
bias 
bool 
Whether or not the linear component is to have bias 
True 
nonlinearity 
Union[str, Callable] 
Activation function 
'tanh' 
Source code in evotorch/neuroevolution/net/layers.py
def __init__(
self,
*,
in_features: int,
out_features: int,
num_layers: int,
hidden_size: int,
bias: bool = True,
nonlinearity: Union[str, Callable] = "tanh",
):
"""`__init__(...)`: Initialize the structured control net.
Args:
in_features: Length of the input vector
out_features: Length of the output vector
num_layers: Number of hidden layers for the nonlinear component
hidden_size: Number of neurons in a hidden layer of the
nonlinear component
bias: Whether or not the linear component is to have bias
nonlinearity: Activation function
"""
nn.Module.__init__(self)
self._in_features = in_features
self._out_features = out_features
self._num_layers = num_layers
self._hidden_size = hidden_size
self._bias = bias
self._nonlinearity = nonlinearity
self._linear_component = nn.Linear(
in_features=self._in_features, out_features=self._out_features, bias=self._bias
)
self._nonlinear_component = FeedForwardNet(
input_size=self._in_features,
layers=(
list((self._hidden_size, self._nonlinearity) for _ in range(self._num_layers))
+ [(self._out_features, self._nonlinearity)]
),
)
forward(self, x)
¶
misc
¶
Utilities for reading and for writing neural network parameters
count_parameters(net)
¶
Get the number of parameters the network.
Parameters:
Name  Type  Description  Default 

net 
Module 
The torch module whose parameters will be counted. 
required 
Returns:
Type  Description 

int 
The number of parameters, as an integer. 
Source code in evotorch/neuroevolution/net/misc.py
device_of_module(m, default=None)
¶
Get the device in which the module exists.
This function looks at the first parameter of the module, and returns its device. This function is not meant to be used on modules whose parameters exist on different devices.
Parameters:
Name  Type  Description  Default 

m 
Module 
The module whose device is being queried. 
required 
default 
Union[str, torch.device] 
The fallback device to return if the module has no parameters. If this is left as None, the fallback device is assumed to be "cpu". 
None 
Returns:
Type  Description 

device 
The device of the module, determined from its first parameter. 
Source code in evotorch/neuroevolution/net/misc.py
def device_of_module(m: nn.Module, default: Optional[Union[str, torch.device]] = None) > torch.device:
"""
Get the device in which the module exists.
This function looks at the first parameter of the module, and returns
its device. This function is not meant to be used on modules whose
parameters exist on different devices.
Args:
m: The module whose device is being queried.
default: The fallback device to return if the module has no
parameters. If this is left as None, the fallback device
is assumed to be "cpu".
Returns:
The device of the module, determined from its first parameter.
"""
if default is None:
default = torch.device("cpu")
device = default
for p in m.parameters():
device = p.device
break
return device
fill_parameters(net, vector)
¶
Fill the parameters of a torch module (net) from a vector.
No gradient information is kept.
The vector's length must be exactly the same with the number of parameters of the PyTorch module.
Parameters:
Name  Type  Description  Default 

net 
Module 
The torch module whose parameter values will be filled. 
required 
vector 
Tensor 
A 1D torch tensor which stores the parameter values. 
required 
Source code in evotorch/neuroevolution/net/misc.py
@torch.no_grad()
def fill_parameters(net: nn.Module, vector: torch.Tensor):
"""Fill the parameters of a torch module (net) from a vector.
No gradient information is kept.
The vector's length must be exactly the same with the number
of parameters of the PyTorch module.
Args:
net: The torch module whose parameter values will be filled.
vector: A 1D torch tensor which stores the parameter values.
"""
address = 0
for p in net.parameters():
d = p.data.view(1)
n = len(d)
d[:] = torch.as_tensor(vector[address : address + n], device=d.device)
address += n
if address != len(vector):
raise IndexError("The parameter vector is larger than expected")
parameter_vector(net, *, device=None)
¶
Get all the parameters of a torch module (net) into a vector
No gradient information is kept.
Parameters:
Name  Type  Description  Default 

net 
Module 
The torch module whose parameters will be extracted. 
required 
device 
Union[str, torch.device] 
The device in which the parameter vector will be constructed. If the network has parameter across multiple devices, you can specify this argument so that concatenation of all the parameters will be successful. 
None 
Returns:
Type  Description 

Tensor 
The parameters of the module in a 1D tensor. 
Source code in evotorch/neuroevolution/net/misc.py
@torch.no_grad()
def parameter_vector(net: nn.Module, *, device: Optional[Device] = None) > torch.Tensor:
"""Get all the parameters of a torch module (net) into a vector
No gradient information is kept.
Args:
net: The torch module whose parameters will be extracted.
device: The device in which the parameter vector will be constructed.
If the network has parameter across multiple devices,
you can specify this argument so that concatenation of all the
parameters will be successful.
Returns:
The parameters of the module in a 1D tensor.
"""
dev_kwarg = {} if device is None else {"device": device}
all_vectors = []
for p in net.parameters():
all_vectors.append(torch.as_tensor(p.data.view(1), **dev_kwarg))
return torch.cat(all_vectors)
multilayered
¶
MultiLayered (Module)
¶
Source code in evotorch/neuroevolution/net/multilayered.py
class MultiLayered(nn.Module):
def __init__(self, *layers: nn.Module):
super().__init__()
self._submodules = nn.ModuleList(layers)
def forward(self, x: torch.Tensor, h: Optional[dict] = None):
if h is None:
h = {}
new_h = {}
for i, layer in enumerate(self._submodules):
layer_h = h.get(i, None)
if layer_h is None:
layer_result = layer(x)
else:
layer_result = layer(x, h[i])
if isinstance(layer_result, tuple):
if len(layer_result) == 2:
x, layer_new_h = layer_result
else:
raise ValueError(
f"The layer number {i} returned a tuple of length {len(layer_result)}."
f" A tensor or a tuple of two elements was expected."
)
elif isinstance(layer_result, torch.Tensor):
x = layer_result
layer_new_h = None
else:
raise TypeError(
f"The layer number {i} returned an object of type {type(layer_result)}."
f" A tensor or a tuple of two elements was expected."
)
if layer_new_h is not None:
new_h[i] = layer_new_h
if len(new_h) == 0:
return x
else:
return x, new_h
def __iter__(self):
return self._submodules.__iter__()
def __getitem__(self, i):
return self._submodules[i]
def __len__(self):
return len(self._submodules)
def append(self, module: nn.Module):
self._submodules.append(module)
forward(self, x, h=None)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
Although the recipe for forward pass needs to be defined within
this function, one should call the :class:Module
instance afterwards
instead of this since the former takes care of running the
registered hooks while the latter silently ignores them.
Source code in evotorch/neuroevolution/net/multilayered.py
def forward(self, x: torch.Tensor, h: Optional[dict] = None):
if h is None:
h = {}
new_h = {}
for i, layer in enumerate(self._submodules):
layer_h = h.get(i, None)
if layer_h is None:
layer_result = layer(x)
else:
layer_result = layer(x, h[i])
if isinstance(layer_result, tuple):
if len(layer_result) == 2:
x, layer_new_h = layer_result
else:
raise ValueError(
f"The layer number {i} returned a tuple of length {len(layer_result)}."
f" A tensor or a tuple of two elements was expected."
)
elif isinstance(layer_result, torch.Tensor):
x = layer_result
layer_new_h = None
else:
raise TypeError(
f"The layer number {i} returned an object of type {type(layer_result)}."
f" A tensor or a tuple of two elements was expected."
)
if layer_new_h is not None:
new_h[i] = layer_new_h
if len(new_h) == 0:
return x
else:
return x, new_h
parser
¶
Utilities for parsing string representations of neural net policies
NetParsingError (Exception)
¶
Representation of a parsing error
Source code in evotorch/neuroevolution/net/parser.py
class NetParsingError(Exception):
"""
Representation of a parsing error
"""
def __init__(
self,
message: str,
lineno: Optional[int] = None,
col_offset: Optional[int] = None,
original_error: Optional[Exception] = None,
):
"""
`__init__(...)`: Initialize the NetParsingError.
Args:
message: Error message, as string.
lineno: Erroneous line number in the string representation of the
neural network structure.
col_offset: Erroneous column number in the string representation
of the neural network structure.
original_error: If another error caused this parsing error,
that original error can be attached to this `NetParsingError`
instance via this argument.
"""
super().__init__()
self.message = message
self.lineno = lineno
self.col_offset = col_offset
self.original_error = original_error
def _to_string(self) > str:
parts = []
parts.append(type(self).__name__)
if self.lineno is not None:
parts.append(" at line(")
parts.append(str(self.lineno  1))
parts.append(")")
if self.col_offset is not None:
parts.append(" at column(")
parts.append(str(self.col_offset + 1))
parts.append(")")
parts.append(": ")
parts.append(self.message)
return "".join(parts)
def __str__(self) > str:
return self._to_string()
def __repr__(self) > str:
return self._to_string()
__init__(self, message, lineno=None, col_offset=None, original_error=None)
special
¶
__init__(...)
: Initialize the NetParsingError.
Parameters:
Name  Type  Description  Default 

message 
str 
Error message, as string. 
required 
lineno 
Optional[int] 
Erroneous line number in the string representation of the neural network structure. 
None 
col_offset 
Optional[int] 
Erroneous column number in the string representation of the neural network structure. 
None 
original_error 
Optional[Exception] 
If another error caused this parsing error,
that original error can be attached to this 
None 
Source code in evotorch/neuroevolution/net/parser.py
def __init__(
self,
message: str,
lineno: Optional[int] = None,
col_offset: Optional[int] = None,
original_error: Optional[Exception] = None,
):
"""
`__init__(...)`: Initialize the NetParsingError.
Args:
message: Error message, as string.
lineno: Erroneous line number in the string representation of the
neural network structure.
col_offset: Erroneous column number in the string representation
of the neural network structure.
original_error: If another error caused this parsing error,
that original error can be attached to this `NetParsingError`
instance via this argument.
"""
super().__init__()
self.message = message
self.lineno = lineno
self.col_offset = col_offset
self.original_error = original_error
str_to_net(s, **constants)
¶
Read a string representation of a neural net structure,
and return a torch.nn.Module
instance out of it.
Let us imagine that one wants to describe the following neural network structure:
from torch import nn
from evotorch.neuroevolution.net import MultiLayered
net = MultiLayered(nn.Linear(8, 16), nn.Tanh(), nn.Linear(16, 4, bias=False), nn.ReLU())
By using str_to_net(...)
one can construct an equivalent
module via:
from evotorch.neuroevolution.net import str_to_net
net = str_to_net("Linear(8, 16) >> Tanh() >> Linear(16, 4, bias=False) >> ReLU()")
The string can also be multiline:
One can also define constants for using them in strings:
net = str_to_net(
'''
Linear(input_size, hidden_size)
>> Tanh()
>> Linear(hidden_size, output_size, bias=False)
>> ReLU()
''',
input_size=8,
hidden_size=16,
output_size=4,
)
In the neural net structure string, when one refers to a module type,
say, Linear
, first the name Linear
is searched for in the namespace
evotorch.neuroevolution.net.layers
, and then in the namespace torch.nn
.
In the case of Linear
, the searched name exists in torch.nn
,
and therefore, the layer type to be instantiated is accepted as
torch.nn.Linear
.
Instead of Linear
, if one had used the name, say,
StructuredControlNet
, then, the layer type to be instantiated
would be evotorch.neuroevolution.net.layers.StructuredControlNet
.
The namespace evotorch.neuroevolution.net.layers
contains its own
implementations for RNN and LSTM. These recurrent layer implementations
work similarly to their counterparts torch.nn.RNN
and torch.nn.LSTM
,
except that EvoTorch's implementations do not expect the data with extra
leftmost dimensions for batching and for timesteps. Instead, they expect
to receive a single input and a single current hidden state, and produce
a single output and a single new hidden state. These recurrent layer
implementations of EvoTorch can be used within a neural net structure
string. Therefore, the following examples are valid:
rnn1 = str_to_net("RNN(4, 8) >> Linear(8, 2)")
rnn2 = str_to_net(
'''
Linear(4, 10)
>> Tanh()
>> RNN(input_size=10, hidden_size=24, nonlinearity='tanh'
>> Linear(24, 2)
'''
)
lstm1 = str_to_net("LSTM(4, 32) >> Linear(32, 2)")
lstm2 = str_to_net("LSTM(input_size=4, hidden_size=32) >> Linear(32, 2)")
Notes regarding usage with evotorch.neuroevolution.GymNE
or with evotorch.neuroevolution.VecGymNE
:
While instantiating a GymNE
or a VecGymNE
, one can specify a neural
net structure string as the policy. Therefore, while filling the policy
string for a GymNE
, all these rules mentioned above apply. Additionally,
while using str_to_net(...)
internally, GymNE
and VecGymNE
define
these extra constants:
obs_length
(length of the observation vector),
act_length
(length of the action vector for continuousaction
environments, or number of actions for discreteaction
environments), and
obs_shape
(shape of the observation as a tuple, assuming that the
observation space is of type gym.spaces.Box
, usable within the string
like obs_shape[0]
, obs_shape[1]
, etc., or simply obs_shape
to refer
to the entire tuple).
Therefore, while instantiating a GymNE
or a VecGymNE
, one can define a
singlehiddenlayered policy via this string:
In the policy string above, one might choose to omit the last Tanh()
, as
GymNE
and VecGymNE
will clip the final output of the policy to conform
to the action boundaries defined by the target reinforcement learning
environment, and such a clipping operation might be seen as using an
activation function similar to hardtanh anyway.
Parameters:
Name  Type  Description  Default 

s 
str 
The string which expresses the neural net structure. 
required 
Returns:
Type  Description 

Module 
The PyTorch module of the specified structure. 
Source code in evotorch/neuroevolution/net/parser.py
def str_to_net(s: str, **constants) > nn.Module:
"""
Read a string representation of a neural net structure,
and return a `torch.nn.Module` instance out of it.
Let us imagine that one wants to describe the following
neural network structure:
```python
from torch import nn
from evotorch.neuroevolution.net import MultiLayered
net = MultiLayered(nn.Linear(8, 16), nn.Tanh(), nn.Linear(16, 4, bias=False), nn.ReLU())
```
By using `str_to_net(...)` one can construct an equivalent
module via:
```python
from evotorch.neuroevolution.net import str_to_net
net = str_to_net("Linear(8, 16) >> Tanh() >> Linear(16, 4, bias=False) >> ReLU()")
```
The string can also be multiline:
```python
net = str_to_net(
'''
Linear(8, 16)
>> Tanh()
>> Linear(16, 4, bias=False)
>> ReLU()
'''
)
```
One can also define constants for using them in strings:
```python
net = str_to_net(
'''
Linear(input_size, hidden_size)
>> Tanh()
>> Linear(hidden_size, output_size, bias=False)
>> ReLU()
''',
input_size=8,
hidden_size=16,
output_size=4,
)
```
In the neural net structure string, when one refers to a module type,
say, `Linear`, first the name `Linear` is searched for in the namespace
`evotorch.neuroevolution.net.layers`, and then in the namespace `torch.nn`.
In the case of `Linear`, the searched name exists in `torch.nn`,
and therefore, the layer type to be instantiated is accepted as
`torch.nn.Linear`.
Instead of `Linear`, if one had used the name, say,
`StructuredControlNet`, then, the layer type to be instantiated
would be `evotorch.neuroevolution.net.layers.StructuredControlNet`.
The namespace `evotorch.neuroevolution.net.layers` contains its own
implementations for RNN and LSTM. These recurrent layer implementations
work similarly to their counterparts `torch.nn.RNN` and `torch.nn.LSTM`,
except that EvoTorch's implementations do not expect the data with extra
leftmost dimensions for batching and for timesteps. Instead, they expect
to receive a single input and a single current hidden state, and produce
a single output and a single new hidden state. These recurrent layer
implementations of EvoTorch can be used within a neural net structure
string. Therefore, the following examples are valid:
```python
rnn1 = str_to_net("RNN(4, 8) >> Linear(8, 2)")
rnn2 = str_to_net(
'''
Linear(4, 10)
>> Tanh()
>> RNN(input_size=10, hidden_size=24, nonlinearity='tanh'
>> Linear(24, 2)
'''
)
lstm1 = str_to_net("LSTM(4, 32) >> Linear(32, 2)")
lstm2 = str_to_net("LSTM(input_size=4, hidden_size=32) >> Linear(32, 2)")
```
**Notes regarding usage with `evotorch.neuroevolution.GymNE`
or with `evotorch.neuroevolution.VecGymNE`:**
While instantiating a `GymNE` or a `VecGymNE`, one can specify a neural
net structure string as the policy. Therefore, while filling the policy
string for a `GymNE`, all these rules mentioned above apply. Additionally,
while using `str_to_net(...)` internally, `GymNE` and `VecGymNE` define
these extra constants:
`obs_length` (length of the observation vector),
`act_length` (length of the action vector for continuousaction
environments, or number of actions for discreteaction
environments), and
`obs_shape` (shape of the observation as a tuple, assuming that the
observation space is of type `gym.spaces.Box`, usable within the string
like `obs_shape[0]`, `obs_shape[1]`, etc., or simply `obs_shape` to refer
to the entire tuple).
Therefore, while instantiating a `GymNE` or a `VecGymNE`, one can define a
singlehiddenlayered policy via this string:
```
"Linear(obs_length, 16) >> Tanh() >> Linear(16, act_length) >> Tanh()"
```
In the policy string above, one might choose to omit the last `Tanh()`, as
`GymNE` and `VecGymNE` will clip the final output of the policy to conform
to the action boundaries defined by the target reinforcement learning
environment, and such a clipping operation might be seen as using an
activation function similar to hardtanh anyway.
Args:
s: The string which expresses the neural net structure.
Returns:
The PyTorch module of the specified structure.
"""
s = f"(\n{s}\n)"
return _process_expr(ast.parse(s, mode="eval").body, constants=constants)
rl
¶
This namespace provides various reinforcement learning utilities.
ActClipWrapperModule (Module)
¶
Source code in evotorch/neuroevolution/net/rl.py
class ActClipWrapperModule(nn.Module):
def __init__(self, wrapped_module: nn.Module, obs_space: Box):
super().__init__()
device = device_of_module(wrapped_module)
if not isinstance(obs_space, Box):
raise TypeError(f"Unrecognized observation space: {obs_space}")
self.wrapped_module = wrapped_module
self.register_buffer("_low", torch.from_numpy(obs_space.low).to(device))
self.register_buffer("_high", torch.from_numpy(obs_space.high).to(device))
def forward(self, x: torch.Tensor, h: Any = None) > Union[torch.Tensor, tuple]:
if h is None:
result = self.wrapped_module(x)
else:
result = self.wrapped_module(x, h)
if isinstance(result, tuple):
x, h = result
got_h = True
else:
x = result
h = None
got_h = False
x = torch.max(x, self._low)
x = torch.min(x, self._high)
if got_h:
return x, h
else:
return x
forward(self, x, h=None)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
Although the recipe for forward pass needs to be defined within
this function, one should call the :class:Module
instance afterwards
instead of this since the former takes care of running the
registered hooks while the latter silently ignores them.
Source code in evotorch/neuroevolution/net/rl.py
def forward(self, x: torch.Tensor, h: Any = None) > Union[torch.Tensor, tuple]:
if h is None:
result = self.wrapped_module(x)
else:
result = self.wrapped_module(x, h)
if isinstance(result, tuple):
x, h = result
got_h = True
else:
x = result
h = None
got_h = False
x = torch.max(x, self._low)
x = torch.min(x, self._high)
if got_h:
return x, h
else:
return x
AliveBonusScheduleWrapper (Wrapper)
¶
A Wrapper which awards the agent for being alive in a scheduled manner This wrapper is meant to be used for nonvectorized environments.
Source code in evotorch/neuroevolution/net/rl.py
class AliveBonusScheduleWrapper(gym.Wrapper):
"""
A Wrapper which awards the agent for being alive in a scheduled manner
This wrapper is meant to be used for nonvectorized environments.
"""
def __init__(self, env: gym.Env, alive_bonus_schedule: tuple, **kwargs):
"""
`__init__(...)`: Initialize the AliveBonusScheduleWrapper.
Args:
env: Environment to wrap.
alive_bonus_schedule: If given as a tuple `(t, b)`, an alive
bonus `b` will be added onto all the rewards beyond the
timestep `t`.
If given as a tuple `(t0, t1, b)`, a partial (linearly
increasing towards `b`) alive bonus will be added onto
all the rewards between the timesteps `t0` and `t1`,
and a full alive bonus (which equals to `b`) will be added
onto all the rewards beyond the timestep `t1`.
kwargs: Expected in the form of additional keyword arguments,
these will be passed to the initialization method of the
superclass.
"""
super().__init__(env, **kwargs)
self.__t: Optional[int] = None
if len(alive_bonus_schedule) == 3:
self.__t0, self.__t1, self.__bonus = (
int(alive_bonus_schedule[0]),
int(alive_bonus_schedule[1]),
float(alive_bonus_schedule[2]),
)
elif len(alive_bonus_schedule) == 2:
self.__t0, self.__t1, self.__bonus = (
int(alive_bonus_schedule[0]),
int(alive_bonus_schedule[0]),
float(alive_bonus_schedule[1]),
)
else:
raise ValueError(
f"The argument `alive_bonus_schedule` was expected to have 2 or 3 elements."
f" However, its value is {repr(alive_bonus_schedule)} (having {len(alive_bonus_schedule)} elements)."
)
if self.__t1 > self.__t0:
self.__gap = self.__t1  self.__t0
else:
self.__gap = None
def reset(self, *args, **kwargs):
self.__t = 0
return self.env.reset(*args, **kwargs)
def step(self, action) > tuple:
step_result = self.env.step(action)
self.__t += 1
observation = step_result[0]
reward = step_result[1]
rest = step_result[2:]
if self.__t >= self.__t1:
reward = reward + self.__bonus
elif (self.__gap is not None) and (self.__t >= self.__t0):
reward = reward + ((self.__t  self.__t0) / self.__gap) * self.__bonus
return (observation, reward) + rest
__init__(self, env, alive_bonus_schedule, **kwargs)
special
¶
__init__(...)
: Initialize the AliveBonusScheduleWrapper.
Parameters:
Name  Type  Description  Default 

env 
Env 
Environment to wrap. 
required 
alive_bonus_schedule 
tuple 
If given as a tuple 
required 
kwargs 
Expected in the form of additional keyword arguments, these will be passed to the initialization method of the superclass. 
{} 
Source code in evotorch/neuroevolution/net/rl.py
def __init__(self, env: gym.Env, alive_bonus_schedule: tuple, **kwargs):
"""
`__init__(...)`: Initialize the AliveBonusScheduleWrapper.
Args:
env: Environment to wrap.
alive_bonus_schedule: If given as a tuple `(t, b)`, an alive
bonus `b` will be added onto all the rewards beyond the
timestep `t`.
If given as a tuple `(t0, t1, b)`, a partial (linearly
increasing towards `b`) alive bonus will be added onto
all the rewards between the timesteps `t0` and `t1`,
and a full alive bonus (which equals to `b`) will be added
onto all the rewards beyond the timestep `t1`.
kwargs: Expected in the form of additional keyword arguments,
these will be passed to the initialization method of the
superclass.
"""
super().__init__(env, **kwargs)
self.__t: Optional[int] = None
if len(alive_bonus_schedule) == 3:
self.__t0, self.__t1, self.__bonus = (
int(alive_bonus_schedule[0]),
int(alive_bonus_schedule[1]),
float(alive_bonus_schedule[2]),
)
elif len(alive_bonus_schedule) == 2:
self.__t0, self.__t1, self.__bonus = (
int(alive_bonus_schedule[0]),
int(alive_bonus_schedule[0]),
float(alive_bonus_schedule[1]),
)
else:
raise ValueError(
f"The argument `alive_bonus_schedule` was expected to have 2 or 3 elements."
f" However, its value is {repr(alive_bonus_schedule)} (having {len(alive_bonus_schedule)} elements)."
)
if self.__t1 > self.__t0:
self.__gap = self.__t1  self.__t0
else:
self.__gap = None
reset(self, *args, **kwargs)
¶
step(self, action)
¶
Uses the :meth:step
of the :attr:env
that can be overwritten to change the returned data.
Source code in evotorch/neuroevolution/net/rl.py
def step(self, action) > tuple:
step_result = self.env.step(action)
self.__t += 1
observation = step_result[0]
reward = step_result[1]
rest = step_result[2:]
if self.__t >= self.__t1:
reward = reward + self.__bonus
elif (self.__gap is not None) and (self.__t >= self.__t0):
reward = reward + ((self.__t  self.__t0) / self.__gap) * self.__bonus
return (observation, reward) + rest
ObsNormWrapperModule (Module)
¶
Source code in evotorch/neuroevolution/net/rl.py
class ObsNormWrapperModule(nn.Module):
def __init__(self, wrapped_module: nn.Module, rn: Union[RunningStat, RunningNorm]):
super().__init__()
device = device_of_module(wrapped_module)
self.wrapped_module = wrapped_module
with torch.no_grad():
normalizer = deepcopy(rn.to_layer()).to(device)
self.normalizer = normalizer
def forward(self, x: torch.Tensor, h: Any = None) > Union[torch.Tensor, tuple]:
x = self.normalizer(x)
if h is None:
result = self.wrapped_module(x)
else:
result = self.wrapped_module(x, h)
if isinstance(result, tuple):
x, h = result
got_h = True
else:
x = result
h = None
got_h = False
if got_h:
return x, h
else:
return x
forward(self, x, h=None)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
Although the recipe for forward pass needs to be defined within
this function, one should call the :class:Module
instance afterwards
instead of this since the former takes care of running the
registered hooks while the latter silently ignores them.
Source code in evotorch/neuroevolution/net/rl.py
def forward(self, x: torch.Tensor, h: Any = None) > Union[torch.Tensor, tuple]:
x = self.normalizer(x)
if h is None:
result = self.wrapped_module(x)
else:
result = self.wrapped_module(x, h)
if isinstance(result, tuple):
x, h = result
got_h = True
else:
x = result
h = None
got_h = False
if got_h:
return x, h
else:
return x
reset_env(env)
¶
Reset a gymnasium environment.
Even though the gymnasium
library switched to a new API where the
reset()
method returns a tuple (observation, info)
, this function
follows the conventions of the classical gym
library and returns
only the observation of the newly reset environment.
Parameters:
Name  Type  Description  Default 

env 
Env 
The gymnasium environment which will be reset. 
required 
Returns:
Type  Description 

Iterable 
The initial observation 
Source code in evotorch/neuroevolution/net/rl.py
def reset_env(env: gym.Env) > Iterable:
"""
Reset a gymnasium environment.
Even though the `gymnasium` library switched to a new API where the
`reset()` method returns a tuple `(observation, info)`, this function
follows the conventions of the classical `gym` library and returns
only the observation of the newly reset environment.
Args:
env: The gymnasium environment which will be reset.
Returns:
The initial observation
"""
result = env.reset()
if isinstance(result, tuple) and (len(result) == 2):
result = result[0]
return result
take_step_in_env(env, action)
¶
Take a step in the gymnasium environment. Taking a step means performing the action provided via the arguments.
Even though the gymnasium
library switched to a new API where the
step()
method returns a 5element tuple of the form
(observation, reward, terminated, truncated, info)
, this function
follows the conventions of the classical gym
library and returns
a 4element tuple (observation, reward, done, info)
.
Parameters:
Name  Type  Description  Default 

env 
Env 
The gymnasium environment in which the action will be performed. 
required 
action 
Iterable 
The action to be performed. 
required 
Returns:
Type  Description 

tuple 
A tuple in the form 
Source code in evotorch/neuroevolution/net/rl.py
def take_step_in_env(env: gym.Env, action: Iterable) > tuple:
"""
Take a step in the gymnasium environment.
Taking a step means performing the action provided via the arguments.
Even though the `gymnasium` library switched to a new API where the
`step()` method returns a 5element tuple of the form
`(observation, reward, terminated, truncated, info)`, this function
follows the conventions of the classical `gym` library and returns
a 4element tuple `(observation, reward, done, info)`.
Args:
env: The gymnasium environment in which the action will be performed.
action: The action to be performed.
Returns:
A tuple in the form `(observation, reward, done, info)` where
`observation` is the observation received after performing the action,
`reward` is the amount of reward gained,
`done` is a boolean value indicating whether or not the episode has
ended, and
`info` is additional information (usually as a dictionary).
"""
result = env.step(action)
if isinstance(result, tuple):
n = len(result)
if n == 4:
observation, reward, done, info = result
elif n == 5:
observation, reward, terminated, truncated, info = result
done = terminated or truncated
else:
raise ValueError(
f"The result of the `step(...)` method of the gym environment"
f" was expected as a tuple of length 4 or 5."
f" However, the received result is {repr(result)}, which is"
f" of length {len(result)}."
)
else:
raise TypeError(
f"The result of the `step(...)` method of the gym environment"
f" was expected as a tuple of length 4 or 5."
f" However, the received result is {repr(result)}, which is"
f" of type {type(result)}."
)
return observation, reward, done, info