evotorch
Top-level package for evotorch.
algorithms
This namespace contains the implementations of various evolutionary algorithms.
cmaes
This namespace contains the CMAES class, which is a wrapper for the CMA-ES implementation of the `cma` package.
CMAES (SearchAlgorithm, SinglePopulationAlgorithmMixin)
This class is an interface between EvoTorch and the CMA-ES implementation within the `cma` package, developed within the GitHub repository CMA-ES/pycma.
References:
Nikolaus Hansen, Youhei Akimoto, and Petr Baudis.
CMA-ES/pycma on Github. Zenodo, DOI:10.5281/zenodo.2559634,
February 2019.
<https://github.com/CMA-ES/pycma>
Nikolaus Hansen, Andreas Ostermeier (2001).
Completely Derandomized Self-Adaptation in Evolution Strategies.
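A minimal usage sketch (not part of the reference itself): the `sphere` fitness function and its settings below are hypothetical, and the example assumes that EvoTorch is installed together with the optional `cma` package.

```python
import torch

from evotorch import Problem
from evotorch.algorithms import CMAES

def sphere(x: torch.Tensor) -> torch.Tensor:
    # Classic sphere benchmark: the sum of squared coordinates.
    return torch.sum(x**2.0)

# A hypothetical 10-dimensional minimization problem.
problem = Problem("min", sphere, solution_length=10, initial_bounds=(-1.0, 1.0))

searcher = CMAES(problem, stdev_init=0.5)
searcher.run(100)  # evolve for 100 generations

# "center" is reported via the status dictionary (see `_get_center` below).
print(searcher.status["center"])
```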
Source code in evotorch/algorithms/cmaes.py
class CMAES(SearchAlgorithm, SinglePopulationAlgorithmMixin):
"""
CMAES: Covariance Matrix Adaptation Evolution Strategy.
This is an interface between EvoTorch and the CMA-ES implementation
within the `cma` package, developed within the GitHub repository
CMA-ES/pycma.
References:
Nikolaus Hansen, Youhei Akimoto, and Petr Baudis.
CMA-ES/pycma on Github. Zenodo, DOI:10.5281/zenodo.2559634,
February 2019.
<https://github.com/CMA-ES/pycma>
Nikolaus Hansen, Andreas Ostermeier (2001).
Completely Derandomized Self-Adaptation in Evolution Strategies.
"""
def __init__(
self,
problem: Problem,
*,
stdev_init: RealOrVector, # sigma0
popsize: Optional[int] = None, # popsize
center_init: Optional[Vector] = None, # x0
center_learning_rate: Optional[float] = None, # CMA_cmean
cov_learning_rate: Optional[float] = None, # CMA_on
rankmu_learning_rate: Optional[float] = None, # CMA_rankmu
rankone_learning_rate: Optional[float] = None, # CMA_rankone
stdev_min: Optional[Union[float, np.ndarray]] = None, # minstd
stdev_max: Optional[Union[float, np.ndarray]] = None, # maxstd
separable: bool = False, # CMA_diagonal
obj_index: Optional[int] = None,
cma_options: dict = {},
):
"""
`__init__(...)`: Initialize the CMAES solver.
Args:
problem: The problem object which is being worked on.
stdev_init: Initial standard deviation as a scalar or
as a 1-dimensional array.
popsize: Population size. Can be specified as an int,
or can be left as None to let the CMAES solver
decide the population size according to the length
of a solution.
center_init: Initial center point of the search distribution.
Can be given as a SolutionVector or as a 1-D array.
If left as None, an initial center point is generated
with the help of the problem object's `generate_values(...)`
method.
center_learning_rate: Learning rate for updating the mean
of the search distribution. Leaving this as None
means that the CMAES solver is to use its own default,
which is documented as 1.0.
cov_learning_rate: Learning rate for updating the covariance
matrix of the search distribution. This hyperparameter
acts as a common multiplier for rank_one update and rank_mu
update of the covariance matrix. Leaving this as None
means that the CMAES solver is to use its own default,
which is documented as 1.0.
rankmu_learning_rate: Learning rate for the rank_mu update
of the covariance matrix of the search distribution.
Leaving this as None means that the CMAES solver is to use
its own default, which is documented as 1.0.
rankone_learning_rate: Learning rate for the rank_one update
of the covariance matrix of the search distribution.
Leaving this as None means that the CMAES solver is to use
its own default, which is documented as 1.0.
stdev_min: Minimum allowed standard deviation of the search
distribution. Leaving this as None means that no such
boundary is to be used.
Can be given as None, as a scalar, or as a 1-dimensional
array.
stdev_max: Maximum allowed standard deviation of the search
distribution. Leaving this as None means that no such
boundary is to be used.
Can be given as None, as a scalar, or as a 1-dimensional
array.
separable: Provide this as True if you would like the problem
to be treated as a separable one. Treating a problem
as separable means to adapt only the diagonal parts
of the covariance matrix and to keep the non-diagonal
parts 0. High-dimensional problems result in large
covariance matrices that are computationally expensive
to operate on. Therefore, for such high-dimensional problems,
setting `separable` as True might be useful.
If, instead, you would like to configure on which
iterations the diagonal parts of the covariance matrix
are to be adapted, then it is recommended to leave
`separable` as False and set a new value for the key
"CMA_diagonal" via `cma_options` (see the official
documentation of pycma for details regarding the
"CMA_diagonal" setting).
obj_index: Objective index according to which evaluation
of the solution will be done.
cma_options: Any other configuration for the CMAES solver
can be passed via the cma_options dictionary.
"""
# Make sure that the cma module is installed
if cma is None:
raise ImportError(f"The class {type(self).__name__} is only available if the package `cma` is installed.")
# Initialize the base class
SearchAlgorithm.__init__(self, problem, center=self._get_center)
# Initialize the population.
self._population: SolutionBatch = self._problem.generate_batch(popsize, empty=True)
# Ensure that the problem is numeric
problem.ensure_numeric()
# Store the objective index
self._obj_index = problem.normalize_obj_index(obj_index)
# If `center_init` is not given, generate an initial solution
# with the help of the problem object.
# Otherwise, use the given initial solution as the starting
# point in the search space.
if center_init is None:
x0 = self._problem.generate_values(1).to("cpu").view(-1).numpy().astype(dtype=float)
else:
x0 = numpy_copy(center_init, dtype=float)
# Store the initial standard deviations
sigma0 = numpy_copy(stdev_init, dtype=float)
# Generate an options dictionary to pass to the cma solver.
inopts = {}
for k, v in cma_options.items():
if isinstance(v, torch.Tensor):
v = numpy_copy(v, dtype=float)
inopts[k] = v
# Lift the maximum-number-of-iterations boundary, unless one was explicitly configured
if "maxiter" not in inopts:
inopts["maxiter"] = np.inf
# Below is a temporary helper function for safely storing the configuration items.
# This inner function updates the `inopts` variable.
def store_opt(key: str, long_name: str, value: Any, converter: Callable):
# Here, `key` represents the configuration key used by pycma
# `long_name` represents the configuration's long name used by this class
# `value` is the configuration value associated with `key`.
# Declare that this inner function accesses the `inopts` variable.
nonlocal inopts
if value is None:
# If the provided `value` is None, then there is no configuration to store.
# So, we just leave this inner function.
return
if key in inopts:
# If the given `key` already exists within `inopts`, this means that the configuration was specified
# twice: via its dedicated initialization argument AND via the `cma_options` dictionary.
# We raise an error and inform the user about this redundancy.
raise ValueError(
f"The configuration {repr(key)} was redundantly provided"
f" both via the initialization argument {long_name}"
f" and via the cma_options dictionary."
f" {long_name}={repr(value)};"
f" cma_options[{repr(key)}]={repr(inopts[key])}."
)
inopts[key] = converter(value)
# Temporary helper function which makes sure that `x` is a numpy array or a float.
def array_or_float(x):
if is_sequence(x):
return numpy_copy(x)
else:
return float(x)
# Store the cma configuration received through the initialization arguments (and raise error if there is
# redundancy with the cma_options dictionary).
store_opt("popsize", "popsize", popsize, int)
store_opt("CMA_cmean", "center_learning_rate", center_learning_rate, float)
store_opt("CMA_on", "cov_learning_rate", cov_learning_rate, float)
store_opt("CMA_rankmu", "rankmu_learning_rate", rankmu_learning_rate, float)
store_opt("CMA_rankone", "rankone_learning_rate", rankone_learning_rate, float)
store_opt("minstd", "stdev_min", stdev_min, array_or_float)
store_opt("maxstd", "stdev_max", stdev_max, array_or_float)
if separable:
store_opt("CMA_diagonal", "separable", separable, bool)
# If the problem defines lower and upper bounds, pass these into the options dict.
def process_bounds(bounds: RealOrVector) -> np.ndarray:
if bounds is None:
return None
else:
if is_sequence(bounds):
bounds = numpy_copy(bounds)
else:
bounds = np.array(float(bounds)).repeat(self._problem.solution_length)
return bounds
lb = process_bounds(self._problem.lower_bounds)
ub = process_bounds(self._problem.upper_bounds)
register_bounds = False
if lb is not None and ub is None:
ub = np.array(np.inf).repeat(self._problem.solution_length)
register_bounds = True
elif lb is None and ub is not None:
lb = np.array(-(np.inf)).repeat(self._problem.solution_length)
register_bounds = True
elif lb is not None and ub is not None:
register_bounds = True
if register_bounds:
inopts["bounds"] = [lb, ub]
# Generate a random seed using the problem object for the sake of reproducibility.
if "seed" not in inopts:
inopts["seed"] = int(self._problem.make_randint(tuple(), n=(2**32) - 100) + 100)
# Instantiate the CMAEvolutionStrategy with the prepared configuration items.
self._es = cma.CMAEvolutionStrategy(x0, sigma0, inopts)
# Use the SinglePopulationAlgorithmMixin to enable additional status reports regarding the population.
SinglePopulationAlgorithmMixin.__init__(self)
@property
def population(self) -> SolutionBatch:
"""Population generated by the CMA-ES algorithm"""
return self._population
def _step(self):
"""Perform a step of the CMA-ES solver"""
asked = self._es.ask()
self._population.access_values()[:] = torch.as_tensor(
np.asarray(asked), dtype=self._problem.dtype, device=self._population.device
)
self._problem.evaluate(self._population)
scores = numpy_copy(self._population.utility(self._obj_index), dtype=float)
self._es.tell(asked, -1.0 * scores)
def _get_center(self) -> torch.Tensor:
return torch.as_tensor(self._es.result[5], dtype=self._population.dtype, device=self._population.device)
@property
def obj_index(self) -> int:
"""Index of the objective being focused on"""
return self._obj_index
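The `_step` method above follows pycma's ask/tell protocol, negating the utilities because pycma minimizes its objective values while EvoTorch utilities are to be maximized. Below is a minimal sketch of the same loop using the `cma` package directly (the quadratic fitness is a hypothetical stand-in):

```python
import cma
import numpy as np

# Mirrors what CMAES._step does each generation, without EvoTorch.
es = cma.CMAEvolutionStrategy(np.zeros(10), 0.5, {"maxiter": np.inf, "popsize": 20})
for _ in range(100):
    asked = es.ask()  # sample candidate solutions from the current distribution
    scores = [float(np.sum(np.asarray(x) ** 2)) for x in asked]
    es.tell(asked, scores)  # pycma minimizes the reported scores
print(es.result[5])  # index 5 of the result tuple is the distribution mean
```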
obj_index: int (property, readonly)
Index of the objective being focused on.
population: SolutionBatch (property, readonly)
Population generated by the CMA-ES algorithm.
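For instance, continuing the usage sketch at the top of this entry (`searcher` is assumed from there), these read-only properties can be inspected at any point during or after `run(...)`:

```python
# Read-only properties of a constructed CMAES searcher (hedged;
# `searcher` is the CMAES instance from the earlier sketch).
current_population = searcher.population  # SolutionBatch of the latest generation
objective_index = searcher.obj_index      # index of the objective being optimized
```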
__init__(self, problem, *, stdev_init, popsize=None, center_init=None, center_learning_rate=None, cov_learning_rate=None, rankmu_learning_rate=None, rankone_learning_rate=None, stdev_min=None, stdev_max=None, separable=False, obj_index=None, cma_options={})
`__init__(...)`: Initialize the CMAES solver.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `problem` | `Problem` | The problem object which is being worked on. | *required* |
| `stdev_init` | `Union[float, Iterable[float], torch.Tensor]` | Initial standard deviation as a scalar or as a 1-dimensional array. | *required* |
| `popsize` | `Optional[int]` | Population size. Can be specified as an int, or can be left as None to let the CMAES solver decide the population size according to the length of a solution. | `None` |
| `center_init` | `Union[Iterable[float], torch.Tensor]` | Initial center point of the search distribution. Can be given as a SolutionVector or as a 1-D array. If left as None, an initial center point is generated with the help of the problem object's `generate_values(...)` method. | `None` |
| `center_learning_rate` | `Optional[float]` | Learning rate for updating the mean of the search distribution. Leaving this as None means that the CMAES solver is to use its own default, which is documented as 1.0. | `None` |
| `cov_learning_rate` | `Optional[float]` | Learning rate for updating the covariance matrix of the search distribution. This hyperparameter acts as a common multiplier for the rank-one and rank-mu updates of the covariance matrix. Leaving this as None means that the CMAES solver is to use its own default, which is documented as 1.0. | `None` |
| `rankmu_learning_rate` | `Optional[float]` | Learning rate for the rank-mu update of the covariance matrix of the search distribution. Leaving this as None means that the CMAES solver is to use its own default, which is documented as 1.0. | `None` |
| `rankone_learning_rate` | `Optional[float]` | Learning rate for the rank-one update of the covariance matrix of the search distribution. Leaving this as None means that the CMAES solver is to use its own default, which is documented as 1.0. | `None` |
| `stdev_min` | `Union[float, numpy.ndarray]` | Minimum allowed standard deviation of the search distribution. Can be given as None (meaning that no such boundary is used), as a scalar, or as a 1-dimensional array. | `None` |
| `stdev_max` | `Union[float, numpy.ndarray]` | Maximum allowed standard deviation of the search distribution. Can be given as None (meaning that no such boundary is used), as a scalar, or as a 1-dimensional array. | `None` |
| `separable` | `bool` | Provide this as True if you would like the problem to be treated as a separable one. Treating a problem as separable means adapting only the diagonal of the covariance matrix and keeping the off-diagonal entries at 0. High-dimensional problems result in large covariance matrices that are computationally expensive to operate on, so setting `separable` to True might be useful for such problems. If, instead, you would like to configure on which iterations the diagonal parts of the covariance matrix are to be adapted, then it is recommended to leave `separable` as False and set a new value for the key "CMA_diagonal" via `cma_options` (see the official documentation of pycma for details regarding the "CMA_diagonal" setting). | `False` |
| `obj_index` | `Optional[int]` | Objective index according to which evaluation of the solution will be done. | `None` |
| `cma_options` | `dict` | Any other configuration for the CMAES solver can be passed via the cma_options dictionary. | `{}` |
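As a hedged illustration of the redundancy check performed by the inner `store_opt` helper shown above (the `problem` object is assumed from the earlier sketch):

```python
# Fine: the hyperparameter is given only once, via pycma's own key.
searcher = CMAES(problem, stdev_init=0.5, cma_options={"CMA_rankmu": 0.8})

# ValueError: "CMA_rankmu" is the pycma key behind `rankmu_learning_rate`,
# so the same configuration would be given twice.
# searcher = CMAES(
#     problem,
#     stdev_init=0.5,
#     rankmu_learning_rate=0.8,
#     cma_options={"CMA_rankmu": 0.8},
# )
```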
distributed
gaussian
CEM (GaussianSearchAlgorithm)
The cross-entropy method (CEM) (Rubinstein, 1999).
This CEM implementation is focused on continuous optimization, and follows the variant explained in Duan et al. (2016).
The adaptive population size mechanism explained in Toklu et al. (2020)
(and previously used in the accompanying source code of the study
Salimans et al. (2017)) is supported, where the population size in an
iteration keeps increasing until a certain number of interactions with
the simulator of the reinforcement learning environment is made.
See the initialization arguments `num_interactions` and `popsize_max`.
References:
Rubinstein, R. (1999). The cross-entropy method for combinatorial
and continuous optimization.
Methodology and computing in applied probability, 1(2), 127-190.
Duan, Y., Chen, X., Houthooft, R., Schulman, J., Abbeel, P. (2016).
Benchmarking deep reinforcement learning for continuous control.
International conference on machine learning. PMLR, 2016.
Salimans, T., Ho, J., Chen, X., Sidor, S. and Sutskever, I. (2017).
Evolution Strategies as a Scalable Alternative to
Reinforcement Learning.
Toklu, N.E., Liskowski, P., Srivastava, R.K. (2020).
ClipUp: A Simple and Powerful Optimizer
for Distribution-based Policy Evolution.
Parallel Problem Solving from Nature (PPSN 2020).
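A minimal usage sketch (the sphere problem below is hypothetical; with `parenthood_ratio=0.1`, the top 10% of each generation becomes the parents):

```python
import torch

from evotorch import Problem
from evotorch.algorithms import CEM

def sphere(x: torch.Tensor) -> torch.Tensor:
    return torch.sum(x**2.0)

# CEM requires an unbounded problem; `initial_bounds` only shapes the
# initial sampling and does not impose strict bounds.
problem = Problem("min", sphere, solution_length=10, initial_bounds=(-1.0, 1.0))

searcher = CEM(problem, popsize=100, parenthood_ratio=0.1, stdev_init=0.5)
searcher.run(50)
print(searcher.status["mean_eval"])
```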
Source code in evotorch/algorithms/distributed/gaussian.py
class CEM(GaussianSearchAlgorithm):
"""
The cross-entropy method (CEM) (Rubinstein, 1999).
This CEM implementation is focused on continuous optimization,
and follows the variant explained in Duan et al. (2016).
The adaptive population size mechanism explained in Toklu et al. (2020)
(and previously used in the accompanying source code of the study
Salimans et al. (2017)) is supported, where the population size in an
iteration keeps increasing until a certain number of interactions with
the simulator of the reinforcement learning environment is made.
See the initialization arguments `num_interactions`, `popsize_max`.
References:
Rubinstein, R. (1999). The cross-entropy method for combinatorial
and continuous optimization.
Methodology and computing in applied probability, 1(2), 127-190.
Duan, Y., Chen, X., Houthooft, R., Schulman, J., Abbeel, P. (2016).
Benchmarking deep reinforcement learning for continuous control.
International conference on machine learning. PMLR, 2016.
Salimans, T., Ho, J., Chen, X., Sidor, S. and Sutskever, I. (2017).
Evolution Strategies as a Scalable Alternative to
Reinforcement Learning.
Toklu, N.E., Liskowski, P., Srivastava, R.K. (2020).
ClipUp: A Simple and Powerful Optimizer
for Distribution-based Policy Evolution.
Parallel Problem Solving from Nature (PPSN 2020).
"""
DISTRIBUTION_TYPE = SeparableGaussian
DISTRIBUTION_PARAMS = NotImplemented # To be filled by the CEM instance
def __init__(
self,
problem: Problem,
*,
popsize: int,
parenthood_ratio: float,
stdev_init: Optional[RealOrVector] = None,
radius_init: Optional[RealOrVector] = None,
num_interactions: Optional[int] = None,
popsize_max: Optional[int] = None,
center_init: Optional[RealOrVector] = None,
stdev_min: Optional[RealOrVector] = None,
stdev_max: Optional[RealOrVector] = None,
stdev_max_change: Optional[Union[float, RealOrVector]] = None,
obj_index: Optional[int] = None,
distributed: bool = False,
popsize_weighted_grad_avg: Optional[bool] = None,
):
"""
`__init__(...)`: Initialize the search algorithm.
Args:
problem: The problem object to work on.
popsize: The population size.
parenthood_ratio: Expected as a float larger than 0 and smaller
than 1. For example, setting this value to 0.1 means that
the top 10% of the population will be declared as the parents,
and those parents will be used for updating the population.
The amount of parents is always computed according to the
specified `popsize`, not according to the adapted population
size, and not according to `popsize_max`.
stdev_init: The initial standard deviation of the search
distribution, expressed as a scalar or as an array.
Determines the initial coverage area of the search
distribution.
If one wishes to configure the coverage area via the
argument `radius_init` instead, then `stdev_init` is expected
as None.
radius_init: The initial radius of the search distribution,
expressed as a scalar.
Determines the initial coverage area of the search
distribution.
Here, "radius" is defined as the norm of the search
distribution.
If one wishes to configure the coverage area via the
argument `stdev_init` instead, then `radius_init` is expected
as None.
num_interactions: When given as an integer n,
it is ensured that a population has interacted with
the GymProblem's environment n times. If this target
has not been reached yet, then the population is declared
too small, and gets extended with more samples,
until n interactions are reached.
When given as None, popsize is the only configuration
affecting the size of a population.
popsize_max: Having `num_interactions` set as an integer
might cause the effective population size to jump to
unnecessarily large numbers. To prevent this,
one can set `popsize_max` to specify an upper
bound for the effective population size.
center_init: The initial center solution.
Can be left as None.
stdev_min: The minimum value for the standard deviation
values of the Gaussian search distribution.
Can be left as None (which is the default),
or can be given as a scalar or as a 1-dimensional array.
stdev_max: The maximum value for the standard deviation
values of the Gaussian search distribution.
Can be left as None (which is the default),
or can be given as a scalar or as a 1-dimensional array.
stdev_max_change: The maximum update ratio allowed on the
standard deviation. Expected as None if no such limiter
is needed, or as a real number within 0.0 and 1.0 otherwise.
In the PGPE implementation of Ha (2017, 2018), a value of
0.2 (20%) was used.
For this CEM implementation, the default is None.
obj_index: Index of the objective according to which the
gradient estimations will be done.
For single-objective problems, this can be left as None.
distributed: Whether or not the gradient computation will
be distributed. If `distributed` is given as False and
the problem is not parallelized, then everything will
be centralized (i.e. the entire computation will happen
in the main process).
If `distributed` is given as False, and the problem
is parallelized, then the population will be created
in the main process and then sent to remote workers
for parallelized evaluation, and then the remote fitnesses
will be collected by the main process again for computing
the search gradients.
If `distributed` is given as True, and the problem
is parallelized, then the search algorithm itself will
be distributed, in the sense that each remote actor will
generate its own population (such that the total population
size across all these actors becomes equal to `popsize`)
and will compute its own gradient, and then the main process
will collect these gradients, compute the averaged gradients
and update the main search distribution.
Non-distributed mode has the advantage of keeping the
population in the main process, which is good when one wishes
to do detailed monitoring during the evolutionary process,
but has the disadvantage of having to pass the solutions to
the remote actors and having to collect fitnesses, which
might result in increased interprocess communication traffic.
On the other hand, while it is not possible to monitor the
population in distributed mode, the distributed mode has the
advantage of significantly reducing the interprocess
communication traffic, since the only things communicated
with the remote actors are the search distributions (not the
solutions) and the gradients.
popsize_weighted_grad_avg: Only to be used in distributed mode.
(where being in distributed mode means `distributed` is given
as True). In distributed mode, each actor remotely samples
its own solution batches and computes its own gradients.
These gradients are then collected, and a final average
gradient is computed.
If `popsize_weighted_grad_avg` is True, then, while averaging
over the gradients, each gradient will have its own weight
that is computed according to how many solutions were sampled
by the actor that produced the gradient.
If `popsize_weighted_grad_avg` is False, then, there will not
be weighted averaging (or, each gradient will have equal
weight).
If `popsize_weighted_grad_avg` is None, then, the gradient
weights will be equal if a value for `num_interactions` is given
(because `num_interactions` affects the number of solutions
according to the episode lengths, and popsize-weighting the
gradients could be misleading); and the gradient weights will
be weighted according to the sub-population (i.e. sub-batch)
sizes if `num_interactions` is left as None.
The default value for `popsize_weighted_grad_avg` is None.
When the distributed mode is disabled (i.e. when `distributed`
is False), then the argument `popsize_weighted_grad_avg` is
expected as None.
"""
self.DISTRIBUTION_PARAMS = {"parenthood_ratio": float(parenthood_ratio)}
super().__init__(
problem,
popsize=popsize,
center_learning_rate=1.0,
stdev_learning_rate=1.0,
stdev_init=stdev_init,
radius_init=radius_init,
popsize_max=popsize_max,
num_interactions=num_interactions,
optimizer=None,
optimizer_config=None,
ranking_method=None,
center_init=center_init,
stdev_min=stdev_min,
stdev_max=stdev_max,
stdev_max_change=stdev_max_change,
obj_index=obj_index,
distributed=distributed,
popsize_weighted_grad_avg=popsize_weighted_grad_avg,
)
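Note how the `super().__init__(...)` call above fixes both `center_learning_rate` and `stdev_learning_rate` at 1.0 and passes no optimizer. Combined with the parenthood-ratio gradients computed by `SeparableGaussian` below (which are simply `mean(elites) - mu` and `std(elites) - sigma`), following the gradient with a unit learning rate reduces to the classic CEM update:

$$
\mu \leftarrow \operatorname{mean}(\text{elites}), \qquad
\sigma \leftarrow \operatorname{std}(\text{elites}),
$$

where the elites are the top `parenthood_ratio` fraction of the sampled population.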
DISTRIBUTION_TYPE (Distribution)
Separable Multivariate Gaussian, as used by PGPE.
Source code in evotorch/algorithms/distributed/gaussian.py
class SeparableGaussian(Distribution):
"""Separable Multivariate Gaussian, as used by PGPE"""
MANDATORY_PARAMETERS = {"mu", "sigma"}
OPTIONAL_PARAMETERS = {"divide_mu_grad_by", "divide_sigma_grad_by", "parenthood_ratio"}
def __init__(
self,
parameters: dict,
*,
solution_length: Optional[int] = None,
device: Optional[Device] = None,
dtype: Optional[DType] = None,
):
[mu_length] = parameters["mu"].shape
[sigma_length] = parameters["sigma"].shape
if solution_length is None:
solution_length = mu_length
else:
if solution_length != mu_length:
raise ValueError(
f"The argument `solution_length` does not match the length of `mu` provided in `parameters`."
f" solution_length={solution_length},"
f' parameters["mu"]={mu_length}.'
)
if mu_length != sigma_length:
raise ValueError(
f"The tensors `mu` and `sigma` provided within `parameters` have mismatching lengths."
f' parameters["mu"]={mu_length},'
f' parameters["sigma"]={sigma_length}.'
)
super().__init__(
solution_length=solution_length,
parameters=parameters,
device=device,
dtype=dtype,
)
@property
def mu(self) -> torch.Tensor:
return self.parameters["mu"]
@mu.setter
def mu(self, new_mu: Iterable):
self.parameters["mu"] = torch.as_tensor(new_mu, dtype=self.dtype, device=self.device)
@property
def sigma(self) -> torch.Tensor:
return self.parameters["sigma"]
@sigma.setter
def sigma(self, new_sigma: Iterable):
self.parameters["sigma"] = torch.as_tensor(new_sigma, dtype=self.dtype, device=self.device)
def _fill(self, out: torch.Tensor, *, generator: Optional[torch.Generator] = None):
self.make_gaussian(out=out, center=self.mu, stdev=self.sigma, generator=generator)
def _divide_grad(self, param_name: str, grad: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
option = f"divide_{param_name}_grad_by"
if option in self.parameters:
div_by_what = self.parameters[option]
if div_by_what == "num_solutions":
[num_solutions] = weights.shape
grad = grad / num_solutions
elif div_by_what == "num_directions":
[num_solutions] = weights.shape
num_directions = num_solutions // 2
grad = grad / num_directions
elif div_by_what == "total_weight":
total_weight = torch.sum(torch.abs(weights))
grad = grad / total_weight
elif div_by_what == "weight_stdev":
weight_stdev = torch.std(weights)
grad = grad / weight_stdev
else:
raise ValueError(f"The parameter {option} has an unrecognized value: {div_by_what}")
return grad
def _compute_gradients_via_parenthood_ratio(self, samples: torch.Tensor, weights: torch.Tensor) -> dict:
[num_samples, _] = samples.shape
num_elites = math.floor(num_samples * self.parameters["parenthood_ratio"])
elite_indices = weights.argsort(descending=True)[:num_elites]
elites = samples[elite_indices, :]
return {
"mu": torch.mean(elites, dim=0) - self.parameters["mu"],
"sigma": torch.std(elites, dim=0) - self.parameters["sigma"],
}
def _compute_gradients(self, samples: torch.Tensor, weights: torch.Tensor, ranking_used: Optional[str]) -> dict:
if "parenthood_ratio" in self.parameters:
return self._compute_gradients_via_parenthood_ratio(samples, weights)
else:
mu = self.mu
sigma = self.sigma
# Compute the scaled noises, that is, the noise vectors which
# were used for generating the solutions
# (solution = scaled_noise + center)
scaled_noises = samples - mu
# Make sure that the weights (utilities) are 0-centered
# (Otherwise the formulations would have to consider a bias term)
if ranking_used not in ("centered", "normalized"):
weights = weights - torch.mean(weights)
mu_grad = self._divide_grad(
"mu",
total(dot(weights, scaled_noises)),
weights,
)
sigma_grad = self._divide_grad(
"sigma",
total(dot(weights, ((scaled_noises**2) - (sigma**2)) / sigma)),
weights,
)
return {
"mu": mu_grad,
"sigma": sigma_grad,
}
def update_parameters(
self,
gradients: dict,
*,
learning_rates: Optional[dict] = None,
optimizers: Optional[dict] = None,
) -> "SeparableGaussian":
mu_grad = gradients["mu"]
sigma_grad = gradients["sigma"]
new_mu = self.mu + self._follow_gradient("mu", mu_grad, learning_rates=learning_rates, optimizers=optimizers)
new_sigma = self.sigma + self._follow_gradient(
"sigma", sigma_grad, learning_rates=learning_rates, optimizers=optimizers
)
return self.modified_copy(mu=new_mu, sigma=new_sigma)
def relative_entropy(dist_0: "SeparableGaussian", dist_1: "SeparableGaussian") -> float:
mu_0 = dist_0.parameters["mu"]
mu_1 = dist_1.parameters["mu"]
sigma_0 = dist_0.parameters["sigma"]
sigma_1 = dist_1.parameters["sigma"]
cov_0 = sigma_0.pow(2.0)
cov_1 = sigma_1.pow(2.0)
mu_delta = mu_1 - mu_0
trace_cov = torch.sum(cov_0 / cov_1)
k = dist_0.solution_length
scaled_mu = torch.sum(mu_delta.pow(2.0) / cov_1)
log_det = torch.sum(torch.log(cov_1)) - torch.sum(torch.log(cov_0))
return 0.5 * (trace_cov - k + scaled_mu + log_det)
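The `relative_entropy` function above is a direct transcription of the closed-form KL divergence between two diagonal (separable) Gaussians with means $\mu_0, \mu_1$ and standard deviations $\sigma_0, \sigma_1$:

$$
D_{\mathrm{KL}}\!\left(\mathcal{N}(\mu_0, \sigma_0^2) \,\|\, \mathcal{N}(\mu_1, \sigma_1^2)\right)
= \frac{1}{2} \left( \sum_{i=1}^{k} \frac{\sigma_{0,i}^2}{\sigma_{1,i}^2} \;-\; k \;+\; \sum_{i=1}^{k} \frac{(\mu_{1,i} - \mu_{0,i})^2}{\sigma_{1,i}^2} \;+\; \sum_{i=1}^{k} \log \frac{\sigma_{1,i}^2}{\sigma_{0,i}^2} \right)
$$

where $k$ is the solution length; the four terms correspond to `trace_cov`, `k`, `scaled_mu`, and `log_det` in the code.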
update_parameters(self, gradients, *, learning_rates=None, optimizers=None)
Do an update on the distribution by following the given gradients.
It is expected that the inheriting class has its own implementation for this method.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `gradients` | `dict` | Gradients, as a dictionary, which will be used for computing the necessary updates. | *required* |
| `learning_rates` | `Optional[dict]` | A dictionary which contains learning rates for parameters that will be updated using a learning rate coefficient. | `None` |
| `optimizers` | `Optional[dict]` | A dictionary which contains optimizer objects for parameters that will be updated using an adaptive optimizer. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `SeparableGaussian` | The updated copy of the distribution. |
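A hedged sketch of calling this method directly. The import path `evotorch.distributions` is an assumption and may differ across versions; the gradient values are hypothetical, whereas in practice they come from `_compute_gradients`:

```python
import torch

from evotorch.distributions import SeparableGaussian  # assumed import path

dist = SeparableGaussian({"mu": torch.zeros(3), "sigma": torch.ones(3)})

# Apply a hand-made gradient with explicit unit learning rates.
new_dist = dist.update_parameters(
    {"mu": torch.tensor([0.1, 0.0, -0.1]), "sigma": torch.zeros(3)},
    learning_rates={"mu": 1.0, "sigma": 1.0},
)
print(new_dist.mu, new_dist.sigma)  # `dist` itself is left unmodified
```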
__init__(self, problem, *, popsize, parenthood_ratio, stdev_init=None, radius_init=None, num_interactions=None, popsize_max=None, center_init=None, stdev_min=None, stdev_max=None, stdev_max_change=None, obj_index=None, distributed=False, popsize_weighted_grad_avg=None)
`__init__(...)`: Initialize the search algorithm.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `problem` | `Problem` | The problem object to work on. | *required* |
| `popsize` | `int` | The population size. | *required* |
| `parenthood_ratio` | `float` | Expected as a float larger than 0 and smaller than 1. For example, setting this value to 0.1 means that the top 10% of the population will be declared as the parents, and those parents will be used for updating the population. The number of parents is always computed according to the specified `popsize`, not according to the adapted population size, and not according to `popsize_max`. | *required* |
| `stdev_init` | `Union[float, Iterable[float], torch.Tensor]` | The initial standard deviation of the search distribution, expressed as a scalar or as an array. Determines the initial coverage area of the search distribution. If one wishes to configure the coverage area via the argument `radius_init` instead, then `stdev_init` is expected as None. | `None` |
| `radius_init` | `Union[float, Iterable[float], torch.Tensor]` | The initial radius of the search distribution, expressed as a scalar. Determines the initial coverage area of the search distribution. Here, "radius" is defined as the norm of the search distribution. If one wishes to configure the coverage area via the argument `stdev_init` instead, then `radius_init` is expected as None. | `None` |
| `num_interactions` | `Optional[int]` | When given as an integer n, it is ensured that a population has interacted with the GymProblem's environment n times. If this target has not been reached yet, then the population is declared too small and gets extended with more samples until n interactions are reached. When given as None, popsize is the only configuration affecting the size of a population. | `None` |
| `popsize_max` | `Optional[int]` | Having `num_interactions` set as an integer might cause the effective population size to jump to unnecessarily large numbers. To prevent this, one can set `popsize_max` to specify an upper bound for the effective population size. | `None` |
| `center_init` | `Union[float, Iterable[float], torch.Tensor]` | The initial center solution. Can be left as None. | `None` |
| `stdev_min` | `Union[float, Iterable[float], torch.Tensor]` | The minimum value for the standard deviation values of the Gaussian search distribution. Can be left as None (which is the default), or can be given as a scalar or as a 1-dimensional array. | `None` |
| `stdev_max` | `Union[float, Iterable[float], torch.Tensor]` | The maximum value for the standard deviation values of the Gaussian search distribution. Can be left as None (which is the default), or can be given as a scalar or as a 1-dimensional array. | `None` |
| `stdev_max_change` | `Union[float, Iterable[float], torch.Tensor]` | The maximum update ratio allowed on the standard deviation. Expected as None if no such limiter is needed, or as a real number within 0.0 and 1.0 otherwise. In the PGPE implementation of Ha (2017, 2018), a value of 0.2 (20%) was used. For this CEM implementation, the default is None. | `None` |
| `obj_index` | `Optional[int]` | Index of the objective according to which the gradient estimations will be done. For single-objective problems, this can be left as None. | `None` |
| `distributed` | `bool` | Whether or not the gradient computation will be distributed. If `distributed` is given as False and the problem is not parallelized, then everything will be centralized (i.e. the entire computation will happen in the main process). If `distributed` is given as False and the problem is parallelized, then the population will be created in the main process, sent to remote workers for parallelized evaluation, and the remote fitnesses will be collected by the main process again for computing the search gradients. If `distributed` is given as True and the problem is parallelized, then the search algorithm itself will be distributed: each remote actor will generate its own population (such that the total population size across all actors equals `popsize`) and compute its own gradient, and then the main process will collect these gradients, compute the averaged gradients, and update the main search distribution. Non-distributed mode has the advantage of keeping the population in the main process, which is good for detailed monitoring during the evolutionary process, but has the disadvantage of having to pass solutions to the remote actors and collect fitnesses, which might increase interprocess communication traffic. While it is not possible to monitor the population in distributed mode, distributed mode has the advantage of significantly reducing interprocess communication traffic, since only the search distributions (not the solutions) and the gradients are communicated with the remote actors. | `False` |
| `popsize_weighted_grad_avg` | `Optional[bool]` | Only to be used in distributed mode (i.e. when `distributed` is given as True). In distributed mode, each actor remotely samples its own solution batches and computes its own gradients. These gradients are then collected, and a final average gradient is computed. If `popsize_weighted_grad_avg` is True, then, while averaging over the gradients, each gradient will have its own weight computed according to how many solutions were sampled by the actor that produced the gradient. If False, there will be no weighted averaging (each gradient will have equal weight). If None, the gradient weights will be equal when a value for `num_interactions` is given (because `num_interactions` affects the number of solutions according to the episode lengths, and popsize-weighting the gradients could be misleading), and weighted according to the sub-population (i.e. sub-batch) sizes when `num_interactions` is left as None. When distributed mode is disabled (i.e. when `distributed` is False), this argument is expected as None. | `None` |
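A hedged sketch of the distributed mode described above (assumes Ray is available for remote actors; `num_actors=4` and the sphere fitness are hypothetical):

```python
import torch

from evotorch import Problem
from evotorch.algorithms import CEM

def sphere(x: torch.Tensor) -> torch.Tensor:
    return torch.sum(x**2.0)

# With `num_actors` > 0 and `distributed=True`, each remote actor samples
# its own sub-population and reports only its gradients back.
problem = Problem(
    "min", sphere, solution_length=10, initial_bounds=(-1.0, 1.0), num_actors=4
)
searcher = CEM(
    problem, popsize=100, parenthood_ratio=0.1, stdev_init=0.5, distributed=True
)
searcher.run(50)
```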
GaussianSearchAlgorithm (SearchAlgorithm, SinglePopulationAlgorithmMixin)
Base class for search algorithms based on a Gaussian search distribution.
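A hedged sketch of how a subclass wires itself up, modeled on the CEM class above (`MyGaussianES` and its particular settings are hypothetical; the `evotorch.distributions` import path is an assumption):

```python
from evotorch import Problem
from evotorch.algorithms.distributed.gaussian import GaussianSearchAlgorithm
from evotorch.distributions import SeparableGaussian  # assumed import path

class MyGaussianES(GaussianSearchAlgorithm):
    DISTRIBUTION_TYPE = SeparableGaussian
    DISTRIBUTION_PARAMS = None  # no extra distribution parameters

    def __init__(self, problem: Problem, *, popsize: int, stdev_init: float):
        # The base class handles sampling, evaluation, and the
        # distributed/non-distributed stepping logic.
        super().__init__(
            problem,
            popsize=popsize,
            center_learning_rate=1.0,
            stdev_learning_rate=1.0,
            stdev_init=stdev_init,
            ranking_method="centered",
        )
```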
Source code in evotorch/algorithms/distributed/gaussian.py
class GaussianSearchAlgorithm(SearchAlgorithm, SinglePopulationAlgorithmMixin):
"""
Base class for search algorithms based on a Gaussian search distribution.
"""
DISTRIBUTION_TYPE = NotImplemented
DISTRIBUTION_PARAMS = NotImplemented
def __init__(
self,
problem: Problem,
*,
popsize: int,
center_learning_rate: float,
stdev_learning_rate: float,
stdev_init: Optional[RealOrVector] = None,
radius_init: Optional[RealOrVector] = None,
num_interactions: Optional[int] = None,
popsize_max: Optional[int] = None,
optimizer=None,
optimizer_config: Optional[dict] = None,
ranking_method: Optional[str] = None,
center_init: Optional[RealOrVector] = None,
stdev_min: Optional[RealOrVector] = None,
stdev_max: Optional[RealOrVector] = None,
stdev_max_change: Optional[RealOrVector] = None,
obj_index: Optional[int] = None,
distributed: bool = False,
popsize_weighted_grad_avg: Optional[bool] = None,
ensure_even_popsize: bool = False,
):
# Ensure that the problem is numeric
problem.ensure_numeric()
# The distribution-based algorithms we consider here cannot handle strict lower and upper bound constraints.
# Therefore, we ensure that the given problem is unbounded.
problem.ensure_unbounded()
# Initialize the SearchAlgorithm, which is the parent class
SearchAlgorithm.__init__(
self,
problem,
center=self._get_mu,
stdev=self._get_sigma,
mean_eval=self._get_mean_eval,
)
self._ensure_even_popsize = bool(ensure_even_popsize)
if not distributed:
# self.add_status_getters({"median_eval": self._get_median_eval})
if num_interactions is not None:
self.add_status_getters({"popsize": self._get_popsize})
if self._ensure_even_popsize:
if (popsize % 2) != 0:
raise ValueError(
f"`popsize` was expected as an even number. However, the received `popsize` is {popsize}."
)
if center_init is None:
# If a starting point for the search distribution is not given,
# then we use the problem object to generate us one.
mu = problem.generate_values(1).reshape(-1)
else:
# If a starting point for the search distribution is given,
# then we make sure that its length, dtype, and device
# are correct.
mu = problem.ensure_tensor_length_and_dtype(center_init, allow_scalar=False, about="center_init")
# Get the standard deviation or the radius configuration from the arguments
stdev_init = to_stdev_init(
solution_length=problem.solution_length, stdev_init=stdev_init, radius_init=radius_init
)
# Make sure that the provided initial standard deviation is
# of correct length, dtype, and device.
sigma = problem.ensure_tensor_length_and_dtype(stdev_init, about="stdev_init", allow_scalar=False)
# Create the distribution
dist_cls = self.DISTRIBUTION_TYPE
dist_params = deepcopy(self.DISTRIBUTION_PARAMS) if self.DISTRIBUTION_PARAMS is not None else {}
dist_params.update({"mu": mu, "sigma": sigma})
self._distribution: Distribution = dist_cls(dist_params, dtype=problem.dtype, device=problem.device)
# Store the following keyword arguments to use later
self._popsize = int(popsize)
self._popsize_max = None if popsize_max is None else int(popsize_max)
self._num_interactions = None if num_interactions is None else int(num_interactions)
self._center_learning_rate = float(center_learning_rate)
self._stdev_learning_rate = float(stdev_learning_rate)
self._optimizer = self._initialize_optimizer(self._center_learning_rate, optimizer, optimizer_config)
self._ranking_method = None if ranking_method is None else str(ranking_method)
self._stdev_min = (
None
if stdev_min is None
else problem.ensure_tensor_length_and_dtype(stdev_min, about="stdev_min", allow_scalar=True)
)
self._stdev_max = (
None
if stdev_max is None
else problem.ensure_tensor_length_and_dtype(stdev_max, about="stdev_max", allow_scalar=True)
)
self._stdev_max_change = (
None
if stdev_max_change is None
else problem.ensure_tensor_length_and_dtype(stdev_max_change, about="stdev_max_change", allow_scalar=True)
)
self._obj_index = problem.normalize_obj_index(obj_index)
if distributed and (problem.num_actors > 0):
# If the algorithm is initialized in distributed mode, and also if the problem is configured
# for parallelization, then the _step method becomes an alias for _step_distributed
self._step = self._step_distributed
else:
# Otherwise, the _step method becomes an alias for _step_non_distributed
self._step = self._step_non_distributed
if popsize_weighted_grad_avg is None:
self._popsize_weighted_grad_avg = num_interactions is None
else:
if not distributed:
raise ValueError(
"The argument `popsize_weighted_grad_avg` can only be used in distributed mode."
" (i.e. when the argument `distributed` is given as True)."
" When `distributed` is False, please leave `popsize_weighted_grad_avg` as None."
)
self._popsize_weighted_grad_avg = bool(popsize_weighted_grad_avg)
self._mean_eval: Optional[float] = None
self._population: Optional[SolutionBatch] = None
self._first_iter: bool = True
# We would like to add the reporting capabilities of the mixin class `SinglePopulationAlgorithmMixin`.
# However, we exclude "mean_eval" from the reporting services requested from `SinglePopulationAlgorithmMixin`
# because this class has its own reporting mechanism for `mean_eval`.
# Additionally, we enable the reporting services of `SinglePopulationAlgorithmMixin` only when we are
# in the non-distributed mode. This is because we do not have a centrally stored population at all in the
# distributed mode.
SinglePopulationAlgorithmMixin.__init__(self, exclude="mean_eval", enable=(not distributed))
def _initialize_optimizer(
self, learning_rate: float, optimizer=None, optimizer_config: Optional[dict] = None
) -> object:
if optimizer is None:
return None
elif isinstance(optimizer, str):
center_optim_cls = get_optimizer_class(optimizer, optimizer_config)
return center_optim_cls(
stepsize=float(learning_rate),
dtype=self._distribution.dtype,
solution_length=self._distribution.solution_length,
device=self._distribution.device,
)
else:
return optimizer
def _step(self):
raise NotImplementedError
def _step_distributed(self):
# Use the problem object's `sample_and_compute_gradients` method
# to do parallelized and distributed gradient computation
fetched = self.problem.sample_and_compute_gradients(
self._distribution,
self._popsize,
popsize_max=self._popsize_max,
obj_index=self._obj_index,
num_interactions=self._num_interactions,
ranking_method=self._ranking_method,
ensure_even_popsize=self._ensure_even_popsize,
)
# The method `sample_and_compute_gradients(...)` returns a list of dictionaries, each dictionary being
# the result of a different remote computation.
# For each remote computation, the list will contain a dictionary that looks like this:
# {"gradients": <gradients dictionary here>, "num_solutions": ..., "mean_eval": ...}
# We will now accumulate all the gradients, num_solutions, and mean_evals in their own lists.
# So, in the end, we will have a list of gradients, a list of num_solutions, and a list of
# mean_eval.
# These lists will be stored by the following temporary class:
class list_of:
gradients = []
num_solutions = []
mean_eval = []
# We are now filling the lists declared above
n = len(fetched)
for i in range(n):
list_of.gradients.append(fetched[i]["gradients"])
list_of.num_solutions.append(fetched[i]["num_solutions"])
list_of.mean_eval.append(fetched[i]["mean_eval"])
# Here, we get the keys of our gradient dictionaries.
# For most simple Gaussian distributions, grad_keys should be {"mu", "sigma"}.
grad_keys = set(list_of.gradients[0].keys())
# We now find the total number of solutions and the overall average mean_eval.
# The overall average mean will be reported to the user.
total_num_solutions = 0
total_weighted_eval = 0
for i in range(n):
total_num_solutions += list_of.num_solutions[i]
total_weighted_eval += float(list_of.num_solutions[i] * list_of.mean_eval[i])
avg_mean_eval = total_weighted_eval / total_num_solutions
# For each gradient (in most cases among 'mu' and 'sigma'), we allocate a new 0-filled tensor.
avg_gradients = {}
for key in grad_keys:
avg_gradients[key] = self._distribution.make_zeros(num_solutions=1).reshape(-1)
# Below, we iterate over all collected results and add their gradients, in a weighted manner, onto the
# `avg_gradients` we allocated above.
# At the end, `avg_gradients` will store the weighted-averaged gradients to be followed by the algorithm.
for i in range(n):
# For each collected result, we compute a weight for the gradient, which is the number of solutions
# sampled divided by the total number of sampled solutions.
num_solutions = list_of.num_solutions[i]
if self._popsize_weighted_grad_avg:
# If we are to weigh each gradient by its popsize (i.e. by its sample size)
# then its weight is computed as its number of solutions divided by the
# total number of solutions
weight = num_solutions / total_num_solutions
else:
# If we are NOT to weigh each gradient by its popsize (i.e. by its sample size)
# then the weight of this gradient simply becomes 1 divided by the number of gradients.
weight = 1 / n
for key in grad_keys:
grad = list_of.gradients[i][key]
avg_gradients[key] += weight * grad
self._update_distribution(avg_gradients)
self._mean_eval = avg_mean_eval
def _step_non_distributed(self):
# First, we define an inner function which fills the current population by sampling from the distribution.
def fill_and_eval_pop():
# This inner function is responsible for filling the main population with samples
# and evaluating them.
if self._num_interactions is None:
# If num_interactions is configured as None, this means that we are not going to adapt
# the population size according to the number of simulation interactions reported
# by the problem object.
# We first make sure that the population (which is to be of constant size, since we are
# not in the adaptive population size mode) is allocated.
if self._population is None:
self._population = SolutionBatch(
self.problem, popsize=self._popsize, device=self._distribution.device, empty=True
)
# Now, we do in-place sampling on the population.
self._distribution.sample(out=self._population.access_values(), generator=self.problem)
# Finally, here, the solutions are evaluated.
self.problem.evaluate(self._population)
else:
# If num_interactions is not None, then this means that we have a threshold for the number
# of simulator interactions to reach before declaring the phase of sampling complete.
# In other words, we have to adapt our population size according to the number of simulator
# interactions reported by the problem object.
# The 'total_interaction_count' status reported by the problem object shows the global interaction count.
# Therefore, to properly count the simulator interactions we made during this generation, we need
# to get the interaction count before starting our sampling and evaluation operations.
first_num_interactions = self.problem.status.get("total_interaction_count", 0)
# We will keep allocating and evaluating new populations until the interaction count threshold is reached.
# These newly allocated populations will eventually be concatenated into one.
# The not-yet-concatenated populations and the total allocated population size will be stored below:
populations = []
total_popsize = 0
# Below, we repeatedly allocate, sample, and evaluate, until our thresholds are reached.
while True:
# Allocate a new population
newpop = SolutionBatch(
self.problem,
popsize=self._popsize,
like=self._population,
empty=True,
)
# Update the total population size
total_popsize += len(newpop)
# Sample new solutions within the newly allocated population
self._distribution.sample(out=newpop.access_values(), generator=self.problem)
# Evaluate the new population
self.problem.evaluate(newpop)
# Add the newly allocated and evaluated population into the populations list
populations.append(newpop)
# In addition to the num_interactions threshold, we might also have a popsize_max threshold.
# We now check this threshold.
if (self._popsize_max is not None) and (total_popsize >= self._popsize_max):
# If the popsize_max threshold is reached, we leave the loop.
break
# We now compute the number of interactions we have made during this while loop.
interactions_made = self.problem.status["total_interaction_count"] - first_num_interactions
if interactions_made > self._num_interactions:
# If the number of interactions exceeds our threshold, we leave the loop.
break
# Finally, we concatenate all our populations into one.
self._population = SolutionBatch.cat(populations)
if self._first_iter:
# If we are computing the first generation, we just sample from our distribution and evaluate
# the solutions.
fill_and_eval_pop()
self._first_iter = False
else:
# If we are computing next generations, then we need to compute the gradients of the last
# generation, sample a new population, and evaluate the new population's solutions.
samples = self._population.access_values(keep_evals=True)
fitnesses = self._population.access_evals()[:, self._obj_index]
obj_sense = self.problem.senses[self._obj_index]
ranking_method = self._ranking_method
gradients = self._distribution.compute_gradients(
samples, fitnesses, objective_sense=obj_sense, ranking_method=ranking_method
)
self._update_distribution(gradients)
fill_and_eval_pop()
def _update_distribution(self, gradients: dict):
# This is where we follow the gradients with the help of the stored Distribution object.
# First, we check whether or not we will need to do a controlled update on the
# standard deviation (do we have imposed lower and upper bounds for the standard deviation,
# and do we have a maximum change limiter?)
controlled_stdev_update = (
(self._stdev_min is not None) or (self._stdev_max is not None) or (self._stdev_max_change is not None)
)
if controlled_stdev_update:
# If the standard deviation update needs to be controlled, we store the standard deviation just before
# the update. We will use this later.
old_sigma = self._distribution.sigma
# Here, we determine for which distribution parameter we have a learning rate and for which distribution
# parameter we have an optimizer.
learning_rates = {}
optimizers = {}
if self._optimizer is not None:
# If there is an optimizer, then we declare that "mu" has an optimizer
optimizers["mu"] = self._optimizer
else:
# If we do not have an optimizer, then we declare that "mu" has a raw learning rate coefficient
learning_rates["mu"] = self._center_learning_rate
# Here, we declare that "sigma" has a learning rate
learning_rates["sigma"] = self._stdev_learning_rate
# With the help of the Distribution object's `update_parameters(...)` method, we follow the gradients
updated_dist = self._distribution.update_parameters(
gradients, learning_rates=learning_rates, optimizers=optimizers
)
if controlled_stdev_update:
# If our standard deviation update needs to be controlled, then, considering the pre-update
# standard deviation, we ensure that the update constraints (lower and upper bounds and maximum change)
# are not violated.
updated_dist = updated_dist.modified_copy(
sigma=modify_tensor(
old_sigma,
updated_dist.sigma,
lb=self._stdev_min,
ub=self._stdev_max,
max_change=self._stdev_max_change,
)
)
# Now we can declare that our main distribution is the updated one
self._distribution = updated_dist
def _get_mu(self) -> torch.Tensor:
return self._distribution.parameters["mu"]
def _get_sigma(self) -> torch.Tensor:
return self._distribution.parameters["sigma"]
def _get_mean_eval(self) -> Optional[float]:
if self._population is None:
return self._mean_eval
else:
return float(torch.mean(self._population.evals[:, self._obj_index]))
# def _get_median_eval(self) -> Optional[float]:
# if self._population is None:
# return None
# else:
# return float(torch.median(self._population.evals[:, self._obj_index]))
def _get_popsize(self) -> int:
return 0 if self._population is None else len(self._population)
@property
def population(self) -> Optional[SolutionBatch]:
"""
The population, represented by a SolutionBatch.
If the population is not initialized yet, the retrieved value will
be None.
Also note that, if this algorithm is in distributed mode, the
retrieved value will be None, since the distributed mode causes the
population to be generated in the remote actors, and not in the main
process.
"""
return self._population
@property
def obj_index(self) -> int:
"""
Index of the focused objective
"""
return self._obj_index
obj_index: int
property
readonly
¶
Index of the focused objective
population: Optional[evotorch.core.SolutionBatch]
property
readonly
¶
The population, represented by a SolutionBatch.
If the population is not initialized yet, the retrieved value will be None. Also note that, if this algorithm is in distributed mode, the retrieved value will be None, since the distributed mode causes the population to be generated in the remote actors, and not in the main process.
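The following is a small illustrative sketch, not part of the library documentation: the sphere fitness function and all hyperparameter values are assumptions. It shows how `population` and `obj_index` behave in non-distributed mode:
import torch
from evotorch import Problem
from evotorch.algorithms import PGPE

def sphere(x: torch.Tensor) -> torch.Tensor:
    # Fitness of a single solution: sum of squared components.
    return torch.sum(x ** 2)

problem = Problem("min", sphere, solution_length=10, initial_bounds=(-1.0, 1.0))
searcher = PGPE(
    problem,
    popsize=20,
    center_learning_rate=0.5,
    stdev_learning_rate=0.1,
    stdev_init=1.0,
)
assert searcher.population is None  # no population before the first step
searcher.step()
print(len(searcher.population))  # 20, since non-distributed mode keeps the population locally
print(searcher.obj_index)  # 0, the index of the single objective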
PGPE (GaussianSearchAlgorithm)
¶
This implementation is the symmetric-sampling variant proposed by Sehnke et al. (2010).
Inspired by the PGPE implementations used in the studies of Ha (2017, 2019), and by the evolution strategy variant of Salimans et al. (2017), this PGPE implementation uses 0-centered ranking by default. The default optimizer for this PGPE implementation is ClipUp (Toklu et al., 2020).
References:
Frank Sehnke, Christian Osendorfer, Thomas Ruckstiess,
Alex Graves, Jan Peters, Jurgen Schmidhuber (2010).
Parameter-exploring Policy Gradients.
Neural Networks 23(4), 551-559.
David Ha (2017). Evolving Stable Strategies.
<http://blog.otoro.net/2017/11/12/evolving-stable-strategies/>
Salimans, T., Ho, J., Chen, X., Sidor, S. and Sutskever, I. (2017).
Evolution Strategies as a Scalable Alternative to
Reinforcement Learning.
David Ha (2019). Reinforcement Learning for Improving Agent Design.
Artificial life 25 (4), 352-365.
Toklu, N.E., Liskowski, P., Srivastava, R.K. (2020).
ClipUp: A Simple and Powerful Optimizer
for Distribution-based Policy Evolution.
Parallel Problem Solving from Nature (PPSN 2020).
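A minimal, hedged usage sketch (the problem definition and all hyperparameter values below are illustrative assumptions, not recommendations from the references above):
from evotorch import Problem
from evotorch.algorithms import PGPE
from evotorch.logging import StdOutLogger

# `sphere` is the hypothetical fitness function from the earlier sketch.
problem = Problem("min", sphere, solution_length=100, initial_bounds=(-1.0, 1.0))
searcher = PGPE(
    problem,
    popsize=200,  # must be even in non-distributed mode (symmetric sampling)
    center_learning_rate=0.075,
    stdev_learning_rate=0.1,
    stdev_init=0.08,
)
StdOutLogger(searcher)  # print the status (e.g. mean_eval) after each generation
searcher.run(50)  # run for 50 generations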
Source code in evotorch/algorithms/distributed/gaussian.py
class PGPE(GaussianSearchAlgorithm):
"""
PGPE: Policy gradient with parameter-based exploration.
This implementation is the symmetric-sampling variant proposed
by Sehnke et al. (2010).
Inspired by the PGPE implementations used in the studies
of Ha (2017, 2019), and by the evolution strategy variant of
Salimans et al. (2017), this PGPE implementation uses 0-centered
ranking by default.
The default optimizer for this PGPE implementation is ClipUp
(Toklu et al., 2020).
References:
Frank Sehnke, Christian Osendorfer, Thomas Ruckstiess,
Alex Graves, Jan Peters, Jurgen Schmidhuber (2010).
Parameter-exploring Policy Gradients.
Neural Networks 23(4), 551-559.
David Ha (2017). Evolving Stable Strategies.
<http://blog.otoro.net/2017/11/12/evolving-stable-strategies/>
Salimans, T., Ho, J., Chen, X., Sidor, S. and Sutskever, I. (2017).
Evolution Strategies as a Scalable Alternative to
Reinforcement Learning.
David Ha (2019). Reinforcement Learning for Improving Agent Design.
Artificial life 25 (4), 352-365.
Toklu, N.E., Liskowski, P., Srivastava, R.K. (2020).
ClipUp: A Simple and Powerful Optimizer
for Distribution-based Policy Evolution.
Parallel Problem Solving from Nature (PPSN 2020).
"""
DISTRIBUTION_TYPE = NotImplemented # To be filled by the PGPE instance
DISTRIBUTION_PARAMS = NotImplemented # To be filled by the PGPE instance
def __init__(
self,
problem: Problem,
*,
popsize: int,
center_learning_rate: float,
stdev_learning_rate: float,
stdev_init: Optional[RealOrVector] = None,
radius_init: Optional[RealOrVector] = None,
num_interactions: Optional[int] = None,
popsize_max: Optional[int] = None,
optimizer="clipup",
optimizer_config: Optional[dict] = None,
ranking_method: Optional[str] = "centered",
center_init: Optional[RealOrVector] = None,
stdev_min: Optional[RealOrVector] = None,
stdev_max: Optional[RealOrVector] = None,
stdev_max_change: Optional[RealOrVector] = 0.2,
symmetric: bool = True,
obj_index: Optional[int] = None,
distributed: bool = False,
popsize_weighted_grad_avg: Optional[bool] = None,
):
"""
`__init__(...)`: Initialize the PGPE algorithm.
Args:
problem: The problem object which is being worked on.
The problem must have its dtype defined
(which means it works on Solution objects,
not with custom Solution objects).
Also, the problem must be single-objective.
popsize: The population size.
In the case of PGPE, `popsize` is expected as an even number
in non-distributed mode. In distributed mode, PGPE will
ensure that each sub-population size assigned to a remote
actor is an even number.
This behavior is because PGPE does symmetric sampling
(i.e. solutions are sampled in pairs).
center_learning_rate: The learning rate for the center
of the search distribution.
stdev_learning_rate: The learning rate for the standard
deviation values of the search distribution.
stdev_init: The initial standard deviation of the search
distribution, expressed as a scalar or as an array.
Determines the initial coverage area of the search
distribution.
If one wishes to configure the coverage area via the
argument `radius_init` instead, then `stdev_init` is expected
as None.
radius_init: The initial radius of the search distribution,
expressed as a scalar.
Determines the initial coverage area of the search
distribution.
Here, "radius" is defined as the norm of the search
distribution.
If one wishes to configure the coverage area via the
argument `stdev_init` instead, then `radius_init` is expected
as None.
num_interactions: When given as an integer n,
it is ensured that a population has interacted with
the GymProblem's environment n times. If this target
has not been reached yet, then the population is declared
too small, and gets extended with more samples,
until n amount of interactions is reached.
When given as None, popsize is the only configuration
affecting the size of a population.
popsize_max: Having `num_interactions` set as an integer
might cause the effective population size to jump to
unnecessarily large numbers. To prevent this,
one can set `popsize_max` to specify an upper
bound for the effective population size.
optimizer: The optimizer to be used while following the
estimated gradients.
Can be given as None if a momentum-based optimizer
is not required.
Otherwise, can be given as a str containing the name
of the optimizer (e.g. 'adam', 'clipup');
or as an instance of evotorch.optimizers.TorchOptimizer
or evotorch.optimizers.ClipUp.
As in the study of Toklu et al. (2020),
the default is 'clipup'.
Note that, for ClipUp, the default maximum speed is set
as twice the given `center_learning_rate`.
This maximum speed can be configured by passing
`{"max_speed": ...}` to `optimizer_config`.
optimizer_config: Configuration which will be passed
to the optimizer as keyword arguments.
See `evotorch.optimizers` for details about
which optimizer accepts which keyword arguments.
ranking_method: Which ranking method will be used for
fitness shaping. See the documentation of
`evotorch.ranking.rank(...)` for details.
As in the study of Salimans et al. (2017),
the default is 'centered'.
Can be given as None if no such ranking is required.
center_init: The initial center solution.
Can be left as None.
stdev_min: Lower bound for the standard deviation value/array.
Can be given as a real number, or as an array of real numbers.
stdev_max: Upper bound for the standard deviation value/array.
Can be given as a real number, or as an array of real numbers.
stdev_max_change: The maximum update ratio allowed on the
standard deviation. Expected as None if no such limiter
is needed, or as a real number within 0.0 and 1.0 otherwise.
Like in the implementation of Ha (2017, 2019),
the default value for this setting is 0.2, meaning that
the update on the standard deviation values can not be
more than 20% of their original values.
symmetric: Whether or not the solutions will be sampled
in a symmetric/mirrored/antithetic manner.
The default is True.
obj_index: Index of the objective according to which the
gradient estimations will be done.
For single-objective problems, this can be left as None.
distributed: Whether or not the gradient computation will
be distributed. If `distributed` is given as False and
the problem is not parallelized, then everything will
be centralized (i.e. the entire computation will happen
in the main process).
If `distributed` is given as False, and the problem
is parallelized, then the population will be created
in the main process and then sent to remote workers
for parallelized evaluation, and then the remote fitnesses
will be collected by the main process again for computing
the search gradients.
If `distributed` is given as True, and the problem
is parallelized, then the search algorithm itself will
be distributed, in the sense that each remote actor will
generate its own population (such that the total population
size across all these actors becomes equal to `popsize`)
and will compute its own gradient, and then the main process
will collect these gradients, compute the averaged gradients
and update the main search distribution.
Non-distributed mode has the advantage of keeping the
population in the main process, which is good when one wishes
to do detailed monitoring during the evolutionary process,
but has the disadvantage of having to pass the solutions to
the remote actors and having to collect fitnesses, which
might result in increased interprocess communication traffic.
On the other hand, while it is not possible to monitor the
population in distributed mode, the distributed mode has the
advantage of significantly reducing the interprocess
communication traffic, since the only things communicated
with the remote actors are the search distributions (not the
solutions) and the gradients.
popsize_weighted_grad_avg: Only to be used in distributed mode.
(where being in distributed mode means `distributed` is given
as True). In distributed mode, each actor remotely samples
its own solution batches and computes its own gradients.
These gradients are then collected, and a final average
gradient is computed.
If `popsize_weighted_grad_avg` is True, then, while averaging
over the gradients, each gradient will have its own weight
that is computed according to how many solutions were sampled
by the actor that produced the gradient.
If `popsize_weighted_grad_avg` is False, then, there will not
be weighted averaging (or, each gradient will have equal
weight).
If `popsize_weighted_grad_avg` is None, then the gradient
weights will be equal if a value for `num_interactions` is given
(because `num_interactions` affects the number of solutions
according to the episode lengths, and popsize-weighting the
gradients could be misleading); and the gradients will be
weighted according to the sub-population (i.e. sub-batch)
sizes if `num_interactions` is left as None.
The default value for `popsize_weighted_grad_avg` is None.
When the distributed mode is disabled (i.e. when `distributed`
is False), then the argument `popsize_weighted_grad_avg` is
expected as None.
"""
if symmetric:
self.DISTRIBUTION_TYPE = SymmetricSeparableGaussian
divide_by = "num_directions"
else:
self.DISTRIBUTION_TYPE = SeparableGaussian
divide_by = "num_solutions"
self.DISTRIBUTION_PARAMS = {"divide_mu_grad_by": divide_by, "divide_sigma_grad_by": divide_by}
super().__init__(
problem,
popsize=popsize,
center_learning_rate=center_learning_rate,
stdev_learning_rate=stdev_learning_rate,
stdev_init=stdev_init,
radius_init=radius_init,
popsize_max=popsize_max,
num_interactions=num_interactions,
optimizer=optimizer,
optimizer_config=optimizer_config,
ranking_method=ranking_method,
center_init=center_init,
stdev_min=stdev_min,
stdev_max=stdev_max,
stdev_max_change=stdev_max_change,
obj_index=obj_index,
distributed=distributed,
popsize_weighted_grad_avg=popsize_weighted_grad_avg,
ensure_even_popsize=symmetric,
)
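As a brief illustration of the optimizer-related arguments documented above (the values are assumed, and `problem` is the hypothetical problem defined in the earlier sketch): since ClipUp's default maximum speed is twice `center_learning_rate`, it can be overridden through `optimizer_config`:
searcher = PGPE(
    problem,
    popsize=200,
    center_learning_rate=0.075,
    stdev_learning_rate=0.1,
    stdev_init=0.08,
    optimizer="clipup",
    optimizer_config={"max_speed": 0.3},  # override the default of 2 * 0.075 = 0.15
)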
__init__(self, problem, *, popsize, center_learning_rate, stdev_learning_rate, stdev_init=None, radius_init=None, num_interactions=None, popsize_max=None, optimizer='clipup', optimizer_config=None, ranking_method='centered', center_init=None, stdev_min=None, stdev_max=None, stdev_max_change=0.2, symmetric=True, obj_index=None, distributed=False, popsize_weighted_grad_avg=None)
special
¶
__init__(...)
: Initialize the PGPE algorithm.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
problem | Problem | The problem object which is being worked on. The problem must have its dtype defined (which means it works on Solution objects, not with custom Solution objects). Also, the problem must be single-objective. | required |
popsize | int | The population size. Expected as an even number in non-distributed mode, because PGPE does symmetric sampling (i.e. solutions are sampled in pairs). In distributed mode, PGPE ensures that each sub-population size assigned to a remote actor is an even number. | required |
center_learning_rate | float | The learning rate for the center of the search distribution. | required |
stdev_learning_rate | float | The learning rate for the standard deviation values of the search distribution. | required |
stdev_init | Union[float, Iterable[float], torch.Tensor] | The initial standard deviation of the search distribution, expressed as a scalar or as an array. Determines the initial coverage area of the search distribution. Expected as None if the coverage area is configured via `radius_init` instead. | None |
radius_init | Union[float, Iterable[float], torch.Tensor] | The initial radius of the search distribution, expressed as a scalar, where "radius" is the norm of the search distribution. Expected as None if the coverage area is configured via `stdev_init` instead. | None |
num_interactions | Optional[int] | When given as an integer n, it is ensured that a population has interacted with the GymProblem's environment n times; until this target is reached, the population is extended with more samples. When given as None, popsize is the only configuration affecting the size of a population. | None |
popsize_max | Optional[int] | An upper bound for the effective population size, preventing `num_interactions` from causing the population size to jump to unnecessarily large numbers. | None |
optimizer | | The optimizer to be used while following the estimated gradients. Can be None if a momentum-based optimizer is not required; otherwise a str naming the optimizer (e.g. 'adam', 'clipup') or an instance of evotorch.optimizers.TorchOptimizer or evotorch.optimizers.ClipUp. For ClipUp, the default maximum speed is twice the given `center_learning_rate`, configurable by passing `{"max_speed": ...}` to `optimizer_config`. | 'clipup' |
optimizer_config | Optional[dict] | Configuration which will be passed to the optimizer as keyword arguments. See `evotorch.optimizers` for details about which optimizer accepts which keyword arguments. | None |
ranking_method | Optional[str] | Which ranking method will be used for fitness shaping. See the documentation of `evotorch.ranking.rank(...)` for details. Can be given as None if no such ranking is required. | 'centered' |
center_init | Union[float, Iterable[float], torch.Tensor] | The initial center solution. Can be left as None. | None |
stdev_min | Union[float, Iterable[float], torch.Tensor] | Lower bound for the standard deviation value/array. Can be given as a real number, or as an array of real numbers. | None |
stdev_max | Union[float, Iterable[float], torch.Tensor] | Upper bound for the standard deviation value/array. Can be given as a real number, or as an array of real numbers. | None |
stdev_max_change | Union[float, Iterable[float], torch.Tensor] | The maximum update ratio allowed on the standard deviation. Expected as None if no such limiter is needed, or as a real number within 0.0 and 1.0 otherwise. | 0.2 |
symmetric | bool | Whether or not the solutions will be sampled in a symmetric/mirrored/antithetic manner. | True |
obj_index | Optional[int] | Index of the objective according to which the gradient estimations will be done. For single-objective problems, this can be left as None. | None |
distributed | bool | Whether or not the gradient computation will be distributed; see the docstring above for the trade-offs between the non-distributed and distributed modes. | False |
popsize_weighted_grad_avg | Optional[bool] | Only to be used in distributed mode. Determines whether each collected gradient is weighted by the number of solutions sampled by its actor (True), weighted equally (False), or decided automatically depending on `num_interactions` (None). | None |
Source code in evotorch/algorithms/distributed/gaussian.py
def __init__(
self,
problem: Problem,
*,
popsize: int,
center_learning_rate: float,
stdev_learning_rate: float,
stdev_init: Optional[RealOrVector] = None,
radius_init: Optional[RealOrVector] = None,
num_interactions: Optional[int] = None,
popsize_max: Optional[int] = None,
optimizer="clipup",
optimizer_config: Optional[dict] = None,
ranking_method: Optional[str] = "centered",
center_init: Optional[RealOrVector] = None,
stdev_min: Optional[RealOrVector] = None,
stdev_max: Optional[RealOrVector] = None,
stdev_max_change: Optional[RealOrVector] = 0.2,
symmetric: bool = True,
obj_index: Optional[int] = None,
distributed: bool = False,
popsize_weighted_grad_avg: Optional[bool] = None,
):
"""
`__init__(...)`: Initialize the PGPE algorithm.
Args:
problem: The problem object which is being worked on.
The problem must have its dtype defined
(which means it works on Solution objects,
not with custom Solution objects).
Also, the problem must be single-objective.
popsize: The population size.
In the case of PGPE, `popsize` is expected as an even number
in non-distributed mode. In distributed mode, PGPE will
ensure that each sub-population size assigned to a remote
actor is an even number.
This behavior is because PGPE does symmetric sampling
(i.e. solutions are sampled in pairs).
center_learning_rate: The learning rate for the center
of the search distribution.
stdev_learning_rate: The learning rate for the standard
deviation values of the search distribution.
stdev_init: The initial standard deviation of the search
distribution, expressed as a scalar or as an array.
Determines the initial coverage area of the search
distribution.
If one wishes to configure the coverage area via the
argument `radius_init` instead, then `stdev_init` is expected
as None.
radius_init: The initial radius of the search distribution,
expressed as a scalar.
Determines the initial coverage area of the search
distribution.
Here, "radius" is defined as the norm of the search
distribution.
If one wishes to configure the coverage area via the
argument `stdev_init` instead, then `radius_init` is expected
as None.
num_interactions: When given as an integer n,
it is ensured that a population has interacted with
the GymProblem's environment n times. If this target
has not been reached yet, then the population is declared
too small, and gets extended with more samples,
until n amount of interactions is reached.
When given as None, popsize is the only configuration
affecting the size of a population.
popsize_max: Having `num_interactions` set as an integer
might cause the effective population size to jump to
unnecessarily large numbers. To prevent this,
one can set `popsize_max` to specify an upper
bound for the effective population size.
optimizer: The optimizer to be used while following the
estimated gradients.
Can be given as None if a momentum-based optimizer
is not required.
Otherwise, can be given as a str containing the name
of the optimizer (e.g. 'adam', 'clipup');
or as an instance of evotorch.optimizers.TorchOptimizer
or evotorch.optimizers.ClipUp.
As in the study of Toklu et al. (2020),
the default is 'clipup'.
Note that, for ClipUp, the default maximum speed is set
as twice the given `center_learning_rate`.
This maximum speed can be configured by passing
`{"max_speed": ...}` to `optimizer_config`.
optimizer_config: Configuration which will be passed
to the optimizer as keyword arguments.
See `evotorch.optimizers` for details about
which optimizer accepts which keyword arguments.
ranking_method: Which ranking method will be used for
fitness shaping. See the documentation of
`evotorch.ranking.rank(...)` for details.
As in the study of Salimans et al. (2017),
the default is 'centered'.
Can be given as None if no such ranking is required.
center_init: The initial center solution.
Can be left as None.
stdev_min: Lower bound for the standard deviation value/array.
Can be given as a real number, or as an array of real numbers.
stdev_max: Upper bound for the standard deviation value/array.
Can be given as a real number, or as an array of real numbers.
stdev_max_change: The maximum update ratio allowed on the
standard deviation. Expected as None if no such limiter
is needed, or as a real number within 0.0 and 1.0 otherwise.
Like in the implementation of Ha (2017, 2019),
the default value for this setting is 0.2, meaning that
the update on the standard deviation values can not be
more than 20% of their original values.
symmetric: Whether or not the solutions will be sampled
in a symmetric/mirrored/antithetic manner.
The default is True.
obj_index: Index of the objective according to which the
gradient estimations will be done.
For single-objective problems, this can be left as None.
distributed: Whether or not the gradient computation will
be distributed. If `distributed` is given as False and
the problem is not parallelized, then everything will
be centralized (i.e. the entire computation will happen
in the main process).
If `distributed` is given as False, and the problem
is parallelized, then the population will be created
in the main process and then sent to remote workers
for parallelized evaluation, and then the remote fitnesses
will be collected by the main process again for computing
the search gradients.
If `distributed` is given as True, and the problem
is parallelized, then the search algorithm itself will
be distributed, in the sense that each remote actor will
generate its own population (such that the total population
size across all these actors becomes equal to `popsize`)
and will compute its own gradient, and then the main process
will collect these gradients, compute the averaged gradients
and update the main search distribution.
Non-distributed mode has the advantage of keeping the
population in the main process, which is good when one wishes
to do detailed monitoring during the evolutionary process,
but has the disadvantage of having to pass the solutions to
the remote actors and having to collect fitnesses, which
might result in increased interprocess communication traffic.
On the other hand, while it is not possible to monitor the
population in distributed mode, the distributed mode has the
advantage of significantly reducing the interprocess
communication traffic, since the only things communicated
with the remote actors are the search distributions (not the
solutions) and the gradients.
popsize_weighted_grad_avg: Only to be used in distributed mode.
(where being in distributed mode means `distributed` is given
as True). In distributed mode, each actor remotely samples
its own solution batches and computes its own gradients.
These gradients are then collected, and a final average
gradient is computed.
If `popsize_weighted_grad_avg` is True, then, while averaging
over the gradients, each gradient will have its own weight
that is computed according to how many solutions were sampled
by the actor that produced the gradient.
If `popsize_weighted_grad_avg` is False, then, there will not
be weighted averaging (or, each gradient will have equal
weight).
If `popsize_weighted_grad_avg` is None, then the gradient
weights will be equal if a value for `num_interactions` is given
(because `num_interactions` affects the number of solutions
according to the episode lengths, and popsize-weighting the
gradients could be misleading); and the gradients will be
weighted according to the sub-population (i.e. sub-batch)
sizes if `num_interactions` is left as None.
The default value for `popsize_weighted_grad_avg` is None.
When the distributed mode is disabled (i.e. when `distributed`
is False), then the argument `popsize_weighted_grad_avg` is
expected as None.
"""
if symmetric:
self.DISTRIBUTION_TYPE = SymmetricSeparableGaussian
divide_by = "num_directions"
else:
self.DISTRIBUTION_TYPE = SeparableGaussian
divide_by = "num_solutions"
self.DISTRIBUTION_PARAMS = {"divide_mu_grad_by": divide_by, "divide_sigma_grad_by": divide_by}
super().__init__(
problem,
popsize=popsize,
center_learning_rate=center_learning_rate,
stdev_learning_rate=stdev_learning_rate,
stdev_init=stdev_init,
radius_init=radius_init,
popsize_max=popsize_max,
num_interactions=num_interactions,
optimizer=optimizer,
optimizer_config=optimizer_config,
ranking_method=ranking_method,
center_init=center_init,
stdev_min=stdev_min,
stdev_max=stdev_max,
stdev_max_change=stdev_max_change,
obj_index=obj_index,
distributed=distributed,
popsize_weighted_grad_avg=popsize_weighted_grad_avg,
ensure_even_popsize=symmetric,
)
SNES (GaussianSearchAlgorithm)
¶
SNES: Separable Natural Evolution Strategies.
Inspired by the implementation at: http://schaul.site44.com/code/snes.py
Reference:
Schaul, T., Glasmachers, T., Schmidhuber, J. (2011).
High Dimensions and Heavy Tails for Natural Evolution Strategies.
Proceedings of the 13th annual conference on Genetic and evolutionary
computation (GECCO 2011).
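A hedged usage sketch (the problem below is an assumed stand-in; SNES fills in its own default popsize and learning rates when they are left unspecified):
from evotorch import Problem
from evotorch.algorithms import SNES

# `sphere` is the hypothetical fitness function from the earlier sketches.
problem = Problem("min", sphere, solution_length=50, initial_bounds=(-1.0, 1.0))
searcher = SNES(problem, stdev_init=1.0)  # popsize and learning rates use SNES defaults
searcher.run(100)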
Source code in evotorch/algorithms/distributed/gaussian.py
class SNES(GaussianSearchAlgorithm):
"""
SNES: Separable Natural Evolution Strategies
Inspired by the implementation at: http://schaul.site44.com/code/snes.py
Reference:
Schaul, T., Glasmachers, T., Schmidhuber, J. (2011).
High Dimensions and Heavy Tails for Natural Evolution Strategies.
Proceedings of the 13th annual conference on Genetic and evolutionary
computation (GECCO 2011).
"""
DISTRIBUTION_TYPE = ExpSeparableGaussian
DISTRIBUTION_PARAMS = None
def __init__(
self,
problem: Problem,
*,
stdev_init: Optional[RealOrVector] = None,
radius_init: Optional[RealOrVector] = None,
popsize: Optional[int] = None,
center_learning_rate: Optional[float] = None,
stdev_learning_rate: Optional[float] = None,
scale_learning_rate: bool = True,
num_interactions: Optional[int] = None,
popsize_max: Optional[int] = None,
optimizer=None,
optimizer_config: Optional[dict] = None,
ranking_method: Optional[str] = "nes",
center_init: Optional[RealOrVector] = None,
stdev_min: Optional[RealOrVector] = None,
stdev_max: Optional[RealOrVector] = None,
stdev_max_change: Optional[RealOrVector] = None,
obj_index: Optional[int] = None,
distributed: bool = False,
popsize_weighted_grad_avg: Optional[bool] = None,
):
"""
`__init__(...)`: Initialize the SNES algorithm.
Args:
problem: The problem object which is being worked on.
stdev_init: The initial standard deviation of the search
distribution, expressed as a scalar or as an array.
Determines the initial coverage area of the search
distribution.
If one wishes to configure the coverage area via the
argument `radius_init` instead, then `stdev_init` is expected
as None.
radius_init: The initial radius of the search distribution,
expressed as a scalar.
Determines the initial coverage area of the search
distribution.
Here, "radius" is defined as the norm of the search
distribution.
If one wishes to configure the coverage area via the
argument `stdev_init` instead, then `radius_init` is expected
as None.
popsize: Population size. Can be specified as an int,
or can be left as None to let the solver decide.
In the case of SNES, `popsize` can be left as None,
in which case the default `popsize` will be computed
as `4 + floor(3 * log(n))` where `n` is the length
of a solution.
center_learning_rate: Learning rate for updating the mean
of the search distribution. Default value is 1.0
stdev_learning_rate: Learning rate for updating the covariance
matrix of the search distribution.
The default value is `0.2 * (3 + log(n)) / sqrt(n)`
where `n` is the length of a solution.
scale_learning_rate: For SNES, there is a default standard
deviation learning rate value which is computed as
`0.2 * (3 + log(n)) / sqrt(n)` (where `n` is the solution
length).
If scale_learning_rate is True (which is the default),
then the effective learning rate for the standard deviation
becomes the provided `stdev_learning_rate` multiplied by this
default value. If `scale_learning_rate` is False, then the
effective standard deviation learning rate becomes
equal to the provided `stdev_learning_rate` value.
num_interactions: When given as an integer n,
it is ensured that a population has interacted with
the GymProblem's environment n times. If this target
has not been reached yet, then the population is declared
too small, and gets extended with more samples,
until n amount of interactions is reached.
When given as None, popsize is the only configuration
affecting the size of a population.
popsize_max: Having `num_interactions` set as an integer
might cause the effective population size to jump to
unnecessarily large numbers. To prevent this,
one can set `popsize_max` to specify an upper
bound for the effective population size.
optimizer: The optimizer to be used while following the
estimated gradients.
Can be given as None if a momentum-based optimizer
is not required.
Otherwise, can be given as a str containing the name
of the optimizer (e.g. 'adam', 'clipup');
or as an instance of evotorch.optimizers.TorchOptimizer
or evotorch.optimizers.ClipUp.
In the case of SNES, the default is None,
meaning that no optimizer is used by default.
Note that, for ClipUp, the default maximum speed is set
as twice the given `center_learning_rate`.
This maximum speed can be configured by passing
`{"max_speed": ...}` to `optimizer_config`.
optimizer_config: Configuration which will be passed
to the optimizer as keyword arguments.
See `evotorch.optimizers` for details about
which optimizer accepts which keyword arguments.
ranking_method: Which ranking method will be used for
fitness shaping. See the documentation of
`evotorch.ranking.rank(...)` for details.
In the case of SNES, the default is 'nes'.
Can be given as None if no such ranking is required.
center_init: The initial center solution.
Can be left as None.
stdev_min: Minimum values for the standard deviation.
Expected as a 1-dimensional array to serve as a limiter
to the diagonals of the covariance matrix's square root.
stdev_max: Maximum values for the standard deviation.
Expected as a 1-dimensional array to serve as a limiter
to the diagonals of the covariance matrix's square root.
stdev_max_change: Maximum change allowed when updating
the square root of the covariance matrix.
obj_index: Index of the objective according to which the
gradient estimations will be done.
For single-objective problems, this can be left as None.
distributed: Whether or not the gradient computation will
be distributed. If `distributed` is given as False and
the problem is not parallelized, then everything will
be centralized (i.e. the entire computation will happen
in the main process).
If `distributed` is given as False, and the problem
is parallelized, then the population will be created
in the main process and then sent to remote workers
for parallelized evaluation, and then the remote fitnesses
will be collected by the main process again for computing
the search gradients.
If `distributed` is given as True, and the problem
is parallelized, then the search algorithm itself will
be distributed, in the sense that each remote actor will
generate its own population (such that the total population
size across all these actors becomes equal to `popsize`)
and will compute its own gradient, and then the main process
will collect these gradients, compute the averaged gradients
and update the main search distribution.
Non-distributed mode has the advantage of keeping the
population in the main process, which is good when one wishes
to do detailed monitoring during the evolutionary process,
but has the disadvantage of having to pass the solutions to
the remote actors and having to collect fitnesses, which
might result in increased interprocess communication traffic.
On the other hand, while it is not possible to monitor the
population in distributed mode, the distributed mode has the
advantage of significantly reducing the interprocess
communication traffic, since the only things communicated
with the remote actors are the search distributions (not the
solutions) and the gradients.
popsize_weighted_grad_avg: Only to be used in distributed mode.
(where being in distributed mode means `distributed` is given
as True). In distributed mode, each actor remotely samples
its own solution batches and computes its own gradients.
These gradients are then collected, and a final average
gradient is computed.
If `popsize_weighted_grad_avg` is True, then, while averaging
over the gradients, each gradient will have its own weight
that is computed according to how many solutions were sampled
by the actor that produced the gradient.
If `popsize_weighted_grad_avg` is False, then, there will not
be weighted averaging (or, each gradient will have equal
weight).
If `popsize_weighted_grad_avg` is None, then the gradient
weights will be equal if a value for `num_interactions` is given
(because `num_interactions` affects the number of solutions
according to the episode lengths, and popsize-weighting the
gradients could be misleading); and the gradients will be
weighted according to the sub-population (i.e. sub-batch)
sizes if `num_interactions` is left as None.
The default value for `popsize_weighted_grad_avg` is None.
When the distributed mode is disabled (i.e. when `distributed`
is False), then the argument `popsize_weighted_grad_avg` is
expected as None.
"""
if popsize is None:
popsize = int(4 + math.floor(3 * math.log(problem.solution_length)))
if center_learning_rate is None:
center_learning_rate = 1.0
def default_stdev_lr():
n = problem.solution_length
return 0.2 * (3 + math.log(n)) / math.sqrt(n)
if stdev_learning_rate is None:
stdev_learning_rate = default_stdev_lr()
else:
stdev_learning_rate = float(stdev_learning_rate)
if scale_learning_rate:
stdev_learning_rate *= default_stdev_lr()
super().__init__(
problem,
popsize=popsize,
center_learning_rate=center_learning_rate,
stdev_learning_rate=stdev_learning_rate,
stdev_init=stdev_init,
radius_init=radius_init,
popsize_max=popsize_max,
num_interactions=num_interactions,
optimizer=optimizer,
optimizer_config=optimizer_config,
ranking_method=ranking_method,
center_init=center_init,
stdev_min=stdev_min,
stdev_max=stdev_max,
stdev_max_change=stdev_max_change,
obj_index=obj_index,
distributed=distributed,
popsize_weighted_grad_avg=popsize_weighted_grad_avg,
)
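The defaults above depend only on the solution length `n`. A quick sketch reproducing the same expressions for an assumed n = 100:
import math

n = 100  # assumed solution length
default_popsize = int(4 + math.floor(3 * math.log(n)))  # 17
default_stdev_lr = 0.2 * (3 + math.log(n)) / math.sqrt(n)  # approximately 0.152
print(default_popsize, default_stdev_lr)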
DISTRIBUTION_TYPE (SeparableGaussian)
¶
Exponential-separable multivariate Gaussian, as used by SNES
Source code in evotorch/algorithms/distributed/gaussian.py
class ExpSeparableGaussian(SeparableGaussian):
"""exponentialseparable Multivariate Gaussian, as used by SNES"""
MANDATORY_PARAMETERS = {"mu", "sigma"}
OPTIONAL_PARAMETERS = set()
def _compute_gradients(self, samples: torch.Tensor, weights: torch.Tensor, ranking_used: Optional[str]) -> dict:
if ranking_used != "nes":
weights = weights / torch.sum(torch.abs(weights))
scaled_noises = samples - self.mu
raw_noises = scaled_noises / self.sigma
mu_grad = total(dot(weights, scaled_noises))
sigma_grad = total(dot(weights, (raw_noises**2) - 1))
return {"mu": mu_grad, "sigma": sigma_grad}
def update_parameters(
self,
gradients: dict,
*,
learning_rates: Optional[dict] = None,
optimizers: Optional[dict] = None,
) -> "ExpSeparableGaussian":
mu_grad = gradients["mu"]
sigma_grad = gradients["sigma"]
new_mu = self.mu + self._follow_gradient("mu", mu_grad, learning_rates=learning_rates, optimizers=optimizers)
new_sigma = self.sigma * torch.exp(
0.5 * self._follow_gradient("sigma", sigma_grad, learning_rates=learning_rates, optimizers=optimizers)
)
return self.modified_copy(mu=new_mu, sigma=new_sigma)
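Note that the multiplicative update `sigma * exp(0.5 * step)` above keeps the standard deviation strictly positive, which an additive update would not guarantee. A small numeric illustration (values assumed):
import torch

sigma = torch.tensor([0.5, 1.0, 2.0])
step = torch.tensor([-0.2, 0.0, 0.2])  # gradient step with learning rate already applied
new_sigma = sigma * torch.exp(0.5 * step)
print(new_sigma)  # tensor([0.4524, 1.0000, 2.2103]) -- strictly positive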
update_parameters(self, gradients, *, learning_rates=None, optimizers=None)
¶
Do an update on the distribution by following the given gradients.
It is expected that the inheriting class has its own implementation for this method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
gradients | dict | Gradients, as a dictionary, which will be used for computing the necessary updates. | required |
learning_rates | Optional[dict] | A dictionary which contains learning rates for parameters that will be updated using a learning rate coefficient. | None |
optimizers | Optional[dict] | A dictionary which contains optimizer objects for parameters that will be updated using an adaptive optimizer. | None |
Returns:
Type | Description |
---|---|
ExpSeparableGaussian | The updated copy of the distribution. |
Source code in evotorch/algorithms/distributed/gaussian.py
def update_parameters(
self,
gradients: dict,
*,
learning_rates: Optional[dict] = None,
optimizers: Optional[dict] = None,
) -> "ExpSeparableGaussian":
mu_grad = gradients["mu"]
sigma_grad = gradients["sigma"]
new_mu = self.mu + self._follow_gradient("mu", mu_grad, learning_rates=learning_rates, optimizers=optimizers)
new_sigma = self.sigma * torch.exp(
0.5 * self._follow_gradient("sigma", sigma_grad, learning_rates=learning_rates, optimizers=optimizers)
)
return self.modified_copy(mu=new_mu, sigma=new_sigma)
__init__(self, problem, *, stdev_init=None, radius_init=None, popsize=None, center_learning_rate=None, stdev_learning_rate=None, scale_learning_rate=True, num_interactions=None, popsize_max=None, optimizer=None, optimizer_config=None, ranking_method='nes', center_init=None, stdev_min=None, stdev_max=None, stdev_max_change=None, obj_index=None, distributed=False, popsize_weighted_grad_avg=None)
special
¶
__init__(...)
: Initialize the SNES algorithm.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
problem | Problem | The problem object which is being worked on. | required |
stdev_init | Union[float, Iterable[float], torch.Tensor] | The initial standard deviation of the search distribution, expressed as a scalar or as an array. Determines the initial coverage area of the search distribution. Expected as None if the coverage area is configured via `radius_init` instead. | None |
radius_init | Union[float, Iterable[float], torch.Tensor] | The initial radius of the search distribution, expressed as a scalar, where "radius" is the norm of the search distribution. Expected as None if the coverage area is configured via `stdev_init` instead. | None |
popsize | Optional[int] | Population size. Can be left as None, in which case the default popsize is computed as `4 + floor(3 * log(n))` where `n` is the length of a solution. | None |
center_learning_rate | Optional[float] | Learning rate for updating the mean of the search distribution. The default value is 1.0. | None |
stdev_learning_rate | Optional[float] | Learning rate for updating the standard deviation values of the search distribution. The default value is `0.2 * (3 + log(n)) / sqrt(n)` where `n` is the length of a solution. | None |
scale_learning_rate | bool | If True (the default), the effective standard deviation learning rate is the provided `stdev_learning_rate` multiplied by the default value `0.2 * (3 + log(n)) / sqrt(n)`; if False, the provided value is used as-is. | True |
num_interactions | Optional[int] | When given as an integer n, it is ensured that a population has interacted with the GymProblem's environment n times; until this target is reached, the population is extended with more samples. When given as None, popsize is the only configuration affecting the size of a population. | None |
popsize_max | Optional[int] | An upper bound for the effective population size, preventing `num_interactions` from causing the population size to jump to unnecessarily large numbers. | None |
optimizer | | The optimizer to be used while following the estimated gradients. Can be None (the default for SNES) if a momentum-based optimizer is not required; otherwise a str naming the optimizer (e.g. 'adam', 'clipup') or an instance of evotorch.optimizers.TorchOptimizer or evotorch.optimizers.ClipUp. | None |
optimizer_config | Optional[dict] | Configuration which will be passed to the optimizer as keyword arguments. See `evotorch.optimizers` for details about which optimizer accepts which keyword arguments. | None |
ranking_method | Optional[str] | Which ranking method will be used for fitness shaping. See the documentation of `evotorch.ranking.rank(...)` for details. Can be given as None if no such ranking is required. | 'nes' |
center_init | Union[float, Iterable[float], torch.Tensor] | The initial center solution. Can be left as None. | None |
stdev_min | Union[float, Iterable[float], torch.Tensor] | Minimum values for the standard deviation. Expected as a 1-dimensional array to serve as a limiter to the diagonals of the covariance matrix's square root. | None |
stdev_max | Union[float, Iterable[float], torch.Tensor] | Maximum values for the standard deviation. Expected as a 1-dimensional array to serve as a limiter to the diagonals of the covariance matrix's square root. | None |
stdev_max_change | Union[float, Iterable[float], torch.Tensor] | Maximum change allowed when updating the square root of the covariance matrix. | None |
obj_index | Optional[int] | Index of the objective according to which the gradient estimations will be done. For single-objective problems, this can be left as None. | None |
distributed | bool | Whether or not the gradient computation will be distributed; see the docstring above for the trade-offs between the non-distributed and distributed modes. | False |
popsize_weighted_grad_avg | Optional[bool] | Only to be used in distributed mode. Determines whether each collected gradient is weighted by the number of solutions sampled by its actor (True), weighted equally (False), or decided automatically depending on `num_interactions` (None). | None |
Source code in evotorch/algorithms/distributed/gaussian.py
def __init__(
self,
problem: Problem,
*,
stdev_init: Optional[RealOrVector] = None,
radius_init: Optional[RealOrVector] = None,
popsize: Optional[int] = None,
center_learning_rate: Optional[float] = None,
stdev_learning_rate: Optional[float] = None,
scale_learning_rate: bool = True,
num_interactions: Optional[int] = None,
popsize_max: Optional[int] = None,
optimizer=None,
optimizer_config: Optional[dict] = None,
ranking_method: Optional[str] = "nes",
center_init: Optional[RealOrVector] = None,
stdev_min: Optional[RealOrVector] = None,
stdev_max: Optional[RealOrVector] = None,
stdev_max_change: Optional[RealOrVector] = None,
obj_index: Optional[int] = None,
distributed: bool = False,
popsize_weighted_grad_avg: Optional[bool] = None,
):
"""
`__init__(...)`: Initialize the SNES algorithm.
Args:
problem: The problem object which is being worked on.
stdev_init: The initial standard deviation of the search
distribution, expressed as a scalar or as an array.
Determines the initial coverage area of the search
distribution.
If one wishes to configure the coverage area via the
argument `radius_init` instead, then `stdev_init` is expected
as None.
radius_init: The initial radius of the search distribution,
expressed as a scalar.
Determines the initial coverage area of the search
distribution.
Here, "radius" is defined as the norm of the search
distribution.
If one wishes to configure the coverage area via the
argument `stdev_init` instead, then `radius_init` is expected
as None.
popsize: Population size. Can be specified as an int,
or can be left as None to let the solver decide.
In the case of SNES, `popsize` can be left as None,
in which case the default `popsize` will be computed
as `4 + floor(3 * log(n))` where `n` is the length
of a solution.
center_learning_rate: Learning rate for updating the mean
of the search distribution. Default value is 1.0
stdev_learning_rate: Learning rate for updating the covariance
matrix of the search distribution.
The default value is `0.2 * (3 + log(n)) / sqrt(n)`
where `n` is the length of a solution.
scale_learning_rate: For SNES, there is a default standard
deviation learning rate value which is computed as
`0.2 * (3 + log(n)) / sqrt(n)` (where `n` is the solution
length).
If scale_learning_rate is True (which is the default),
then the effective learning rate for the standard deviation
becomes the provided `stdev_learning_rate` multiplied by this
default value. If `scale_learning_rate` is False, then the
effective standard deviation learning rate becomes
equal to the provided `stdev_learning_rate` value.
num_interactions: When given as an integer n,
it is ensured that a population has interacted with
the GymProblem's environment n times. If this target
has not been reached yet, then the population is declared
too small, and gets extended with more samples,
until n amount of interactions is reached.
When given as None, popsize is the only configuration
affecting the size of a population.
popsize_max: Having `num_interactions` set as an integer
might cause the effective population size to jump to
unnecessarily large numbers. To prevent this,
one can set `popsize_max` to specify an upper
bound for the effective population size.
optimizer: The optimizer to be used while following the
estimated gradients.
Can be given as None if a momentum-based optimizer
is not required.
Otherwise, can be given as a str containing the name
of the optimizer (e.g. 'adam', 'clipup');
or as an instance of evotorch.optimizers.TorchOptimizer
or evotorch.optimizers.ClipUp.
In the case of SNES, the default is None,
meaning that no optimizer is used by default.
Note that, for ClipUp, the default maximum speed is set
as twice the given `center_learning_rate`.
This maximum speed can be configured by passing
`{"max_speed": ...}` to `optimizer_config`.
optimizer_config: Configuration which will be passed
to the optimizer as keyword arguments.
See `evotorch.optimizers` for details about
which optimizer accepts which keyword arguments.
ranking_method: Which ranking method will be used for
fitness shaping. See the documentation of
`evotorch.ranking.rank(...)` for details.
In the case of SNES, the default is 'nes'.
Can be given as None if no such ranking is required.
center_init: The initial center solution.
Can be left as None.
stdev_min: Minimum values for the standard deviation.
Expected as a 1-dimensional array to serve as a limiter
to the diagonals of the covariance matrix's square root.
stdev_max: Maximum values for the standard deviation.
Expected as a 1-dimensional array to serve as a limiter
to the diagonals of the covariance matrix's square root.
stdev_max_change: Maximum change allowed when updating
the square root of the covariance matrix.
obj_index: Index of the objective according to which the
gradient estimations will be done.
For single-objective problems, this can be left as None.
distributed: Whether or not the gradient computation will
be distributed. If `distributed` is given as False and
the problem is not parallelized, then everything will
be centralized (i.e. the entire computation will happen
in the main process).
If `distributed` is given as False, and the problem
is parallelized, then the population will be created
in the main process and then sent to remote workers
for parallelized evaluation, and then the remote fitnesses
will be collected by the main process again for computing
the search gradients.
If `distributed` is given as True, and the problem
is parallelized, then the search algorithm itself will
be distributed, in the sense that each remote actor will
generate its own population (such that the total population
size across all these actors becomes equal to `popsize`)
and will compute its own gradient, and then the main process
will collect these gradients, compute the averaged gradients
and update the main search distribution.
Non-distributed mode has the advantage of keeping the
population in the main process, which is good when one wishes
to do detailed monitoring during the evolutionary process,
but has the disadvantage of having to pass the solutions to
the remote actors and having to collect fitnesses, which
might result in increased interprocess communication traffic.
On the other hand, while it is not possible to monitor the
population in distributed mode, the distributed mode has the
advantage of significantly reducing the interprocess
communication traffic, since the only things communicated
with the remote actors are the search distributions (not the
solutions) and the gradients.
popsize_weighted_grad_avg: Only to be used in distributed mode.
(where being in distributed mode means `distributed` is given
as True). In distributed mode, each actor remotely samples
its own solution batches and computes its own gradients.
These gradients are then collected, and a final average
gradient is computed.
If `popsize_weighted_grad_avg` is True, then, while averaging
over the gradients, each gradient will have its own weight
that is computed according to how many solutions were sampled
by the actor that produced the gradient.
If `popsize_weighted_grad_avg` is False, then there will be
no weighted averaging (i.e. each gradient will have equal
weight).
If `popsize_weighted_grad_avg` is None, then the gradient
weights will be equal if a value for `num_interactions` is
given (because `num_interactions` affects the number of
solutions according to the episode lengths, and
popsize-weighting the gradients could be misleading); and
the gradients will be weighted according to the
sub-population (i.e. sub-batch) sizes if `num_interactions`
is left as None.
The default value for `popsize_weighted_grad_avg` is None.
When the distributed mode is disabled (i.e. when `distributed`
is False), then the argument `popsize_weighted_grad_avg` is
expected as None.
"""
if popsize is None:
popsize = int(4 + math.floor(3 * math.log(problem.solution_length)))
if center_learning_rate is None:
center_learning_rate = 1.0
def default_stdev_lr():
n = problem.solution_length
return 0.2 * (3 + math.log(n)) / math.sqrt(n)
if stdev_learning_rate is None:
stdev_learning_rate = default_stdev_lr()
else:
stdev_learning_rate = float(stdev_learning_rate)
if scale_learning_rate:
stdev_learning_rate *= default_stdev_lr()
super().__init__(
problem,
popsize=popsize,
center_learning_rate=center_learning_rate,
stdev_learning_rate=stdev_learning_rate,
stdev_init=stdev_init,
radius_init=radius_init,
popsize_max=popsize_max,
num_interactions=num_interactions,
optimizer=optimizer,
optimizer_config=optimizer_config,
ranking_method=ranking_method,
center_init=center_init,
stdev_min=stdev_min,
stdev_max=stdev_max,
stdev_max_change=stdev_max_change,
obj_index=obj_index,
distributed=distributed,
popsize_weighted_grad_avg=popsize_weighted_grad_avg,
)
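To make the `popsize_weighted_grad_avg` options described above concrete, here is a minimal, standalone sketch of the two averaging schemes. The per-actor gradients and sub-population sizes are made-up illustrative tensors, not evotorch API calls:
```python
import torch

# Hypothetical per-actor gradient estimates (one row per actor), and the
# number of solutions each actor sampled (its sub-population size).
actor_grads = torch.tensor([[0.2, -1.0], [0.6, -0.4], [0.1, 0.3]])
subpop_sizes = torch.tensor([10.0, 30.0, 10.0])

# popsize_weighted_grad_avg=True: weight each gradient by its sub-population size.
weighted_avg = (subpop_sizes.unsqueeze(1) * actor_grads).sum(dim=0) / subpop_sizes.sum()

# popsize_weighted_grad_avg=False: plain, equal-weight averaging.
plain_avg = actor_grads.mean(dim=0)

print(weighted_avg)  # pulled toward the gradient of the larger sub-population
print(plain_avg)
```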
XNES (GaussianSearchAlgorithm)
¶
Inspired by the implementations at:
http://schaul.site44.com/code/xnes.py
https://github.com/pybrain/pybrain/blob/master/pybrain/optimization/distributionbased/xnes.py
Reference
Glasmachers, Tobias, et al. Exponential natural evolution strategies. Proceedings of the 12th annual conference on Genetic and evolutionary computation (GECCO 2010).
Source code in evotorch/algorithms/distributed/gaussian.py
class XNES(GaussianSearchAlgorithm):
"""
XNES: Exponential Natural Evolution Strategies
Inspired by the implementation at:
http://schaul.site44.com/code/xnes.py
https://github.com/pybrain/pybrain/blob/master/pybrain/optimization/distributionbased/xnes.py
Reference:
Glasmachers, Tobias, et al.
Exponential natural evolution strategies.
Proceedings of the 12th annual conference on Genetic and evolutionary
computation (GECCO 2010).
"""
DISTRIBUTION_TYPE = ExpGaussian
DISTRIBUTION_PARAMS = None
def __init__(
self,
problem: Problem,
*,
stdev_init: Optional[RealOrVector] = None,
radius_init: Optional[RealOrVector] = None,
popsize: Optional[int] = None,
center_learning_rate: Optional[float] = None,
stdev_learning_rate: Optional[float] = None,
scale_learning_rate: bool = True,
num_interactions: Optional[int] = None,
popsize_max: Optional[int] = None,
optimizer=None,
optimizer_config: Optional[dict] = None,
ranking_method: Optional[str] = "nes",
center_init: Optional[RealOrVector] = None,
obj_index: Optional[int] = None,
distributed: bool = False,
popsize_weighted_grad_avg: Optional[bool] = None,
):
"""
`__init__(...)`: Initialize the XNES algorithm.
Args:
problem: The problem object which is being worked on.
stdev_init: The initial standard deviation of the search
distribution, expressed as a scalar or as an array.
Determines the initial coverage area of the search
distribution.
If one wishes to configure the coverage area via the
argument `radius_init` instead, then `stdev_init` is expected
as None.
radius_init: The initial radius of the search distribution,
expressed as a scalar.
Determines the initial coverage area of the search
distribution.
Here, "radius" is defined as the norm of the search
distribution.
If one wishes to configure the coverage area via the
argument `stdev_init` instead, then `radius_init` is expected
as None.
popsize: Population size. Can be specified as an int,
or can be left as None to let the solver decide.
In the case of XNES, `popsize` can be left as None,
in which case the default `popsize` will be computed
as `4 + floor(3 * log(n))` where `n` is the length
of a solution.
center_learning_rate: Learning rate for updating the mean
of the search distribution. The default value is 1.0.
stdev_learning_rate: Learning rate for updating the covariance
matrix of the search distribution.
The default value is `0.6 * (3 + log(n)) / (n * sqrt(n))`
where `n` is the length of a solution.
scale_learning_rate: For XNES, there is a default standard
deviation learning rate value which is computed as
`0.6 * (3 + log(n)) / (n * sqrt(n))` (where `n` is the solution
length).
If scale_learning_rate is True (which is the default),
then the effective learning rate for the standard deviation
becomes the provided `stdev_learning_rate` multiplied by this
default value. If `scale_learning_rate` is False, then the
effective standard deviation learning rate becomes
equal to the provided `stdev_learning_rate` value.
num_interactions: When given as an integer n,
it is ensured that a population has interacted with
the GymProblem's environment n times. If this target
has not been reached yet, then the population is declared
too small, and gets extended with more samples,
until n interactions are reached.
When given as None, popsize is the only configuration
affecting the size of a population.
popsize_max: Having `num_interactions` set as an integer
might cause the effective population size to jump to
unnecessarily large numbers. To prevent this,
one can set `popsize_max` to specify an upper
bound for the effective population size.
optimizer: The optimizer to be used while following the
estimated gradients.
Can be given as None if a momentum-based optimizer
is not required.
Otherwise, can be given as a str containing the name
of the optimizer (e.g. 'adam', 'clipup');
or as an instance of evotorch.optimizers.TorchOptimizer
or evotorch.optimizers.ClipUp.
The default is None; the study of Salimans et al. (2017)
used 'clipup'.
Note that, for ClipUp, the default maximum speed is set
as twice the given `center_learning_rate`.
This maximum speed can be configured by passing
`{"max_speed": ...}` to `optimizer_config`.
optimizer_config: Configuration which will be passed
to the optimizer as keyword arguments.
See `evotorch.optimizers` for details about
which optimizer accepts which keyword arguments.
ranking_method: Which ranking method will be used for
fitness shaping. See the documentation of
`evotorch.ranking.rank(...)` for details.
The default is 'nes'; the study of Salimans et al. (2017)
used 'centered'.
Can be given as None if no such ranking is required.
center_init: The initial center solution.
Can be left as None.
obj_index: Index of the objective according to which the
gradient estimations will be done.
For single-objective problems, this can be left as None.
distributed: Whether or not the gradient computation will
be distributed. If `distributed` is given as False and
the problem is not parallelized, then everything will
be centralized (i.e. the entire computation will happen
in the main process).
If `distributed` is given as False, and the problem
is parallelized, then the population will be created
in the main process and then sent to remote workers
for parallelized evaluation, and then the remote fitnesses
will be collected by the main process again for computing
the search gradients.
If `distributed` is given as True, and the problem
is parallelized, then the search algorithm itself will
be distributed, in the sense that each remote actor will
generate its own population (such that the total population
size across all these actors becomes equal to `popsize`)
and will compute its own gradient, and then the main process
will collect these gradients, compute the averaged gradients
and update the main search distribution.
Non-distributed mode has the advantage of keeping the
population in the main process, which is good when one wishes
to do detailed monitoring during the evolutionary process,
but has the disadvantage of having to pass the solutions to
the remote actors and having to collect fitnesses, which
might result in increased interprocess communication traffic.
On the other hand, while it is not possible to monitor the
population in distributed mode, the distributed mode has the
advantage of significantly reducing the interprocess
communication traffic, since the only things communicated
with the remote actors are the search distributions (not the
solutions) and the gradients.
popsize_weighted_grad_avg: Only to be used in distributed mode.
(where being in distributed mode means `distributed` is given
as True). In distributed mode, each actor remotely samples
its own solution batches and computes its own gradients.
These gradients are then collected, and a final average
gradient is computed.
If `popsize_weighted_grad_avg` is True, then, while averaging
over the gradients, each gradient will have its own weight
that is computed according to how many solutions were sampled
by the actor that produced the gradient.
If `popsize_weighted_grad_avg` is False, then there will be
no weighted averaging (i.e. each gradient will have equal
weight).
If `popsize_weighted_grad_avg` is None, then the gradient
weights will be equal if a value for `num_interactions` is
given (because `num_interactions` affects the number of
solutions according to the episode lengths, and
popsize-weighting the gradients could be misleading); and
the gradients will be weighted according to the
sub-population (i.e. sub-batch) sizes if `num_interactions`
is left as None.
The default value for `popsize_weighted_grad_avg` is None.
When the distributed mode is disabled (i.e. when `distributed`
is False), then the argument `popsize_weighted_grad_avg` is
expected as None.
"""
if popsize is None:
popsize = int(4 + math.floor(3 * math.log(problem.solution_length)))
if center_learning_rate is None:
center_learning_rate = 1.0
def default_stdev_lr():
n = problem.solution_length
return 0.6 * (3 + math.log(n)) / (n * math.sqrt(n))
if stdev_learning_rate is None:
stdev_learning_rate = default_stdev_lr()
else:
stdev_learning_rate = float(stdev_learning_rate)
if scale_learning_rate:
stdev_learning_rate *= default_stdev_lr()
super().__init__(
problem,
popsize=popsize,
center_learning_rate=center_learning_rate,
stdev_learning_rate=stdev_learning_rate,
stdev_init=stdev_init,
radius_init=radius_init,
popsize_max=popsize_max,
num_interactions=num_interactions,
optimizer=optimizer,
optimizer_config=optimizer_config,
ranking_method=ranking_method,
center_init=center_init,
stdev_min=None,
stdev_max=None,
stdev_max_change=None,
obj_index=obj_index,
distributed=distributed,
popsize_weighted_grad_avg=popsize_weighted_grad_avg,
)
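For orientation, a hedged usage sketch that runs XNES on a toy sphere function follows. It assumes the standard evotorch entry points (`Problem`, `run(...)`, `status`) documented elsewhere; the objective function, bounds, and generation count are illustrative choices, not recommendations:
```python
import torch
from evotorch import Problem
from evotorch.algorithms import XNES

# A toy minimization target (illustrative choice): the sphere function.
def sphere(x: torch.Tensor) -> torch.Tensor:
    return torch.sum(x**2)

problem = Problem("min", sphere, solution_length=10, initial_bounds=(-1.0, 1.0))

# stdev_init sets the initial coverage of the search distribution; popsize
# and the learning rates fall back to the defaults computed in __init__ above.
searcher = XNES(problem, stdev_init=0.5)
searcher.run(50)  # run for 50 generations
print(searcher.status)  # inspect the reported status variables
```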
DISTRIBUTION_TYPE (Distribution)
¶
Exponential multivariate Gaussian, as used by XNES
Source code in evotorch/algorithms/distributed/gaussian.py
class ExpGaussian(Distribution):
"""exponential Multivariate Gaussian, as used by XNES"""
# Corresponding to mu and A in symbols used in xNES paper
MANDATORY_PARAMETERS = {"mu", "sigma"}
# Inverse of sigma; numerically more stable to track this independently of sigma
OPTIONAL_PARAMETERS = {"sigma_inv"}
def __init__(
self,
parameters: dict,
*,
solution_length: Optional[int] = None,
device: Optional[Device] = None,
dtype: Optional[DType] = None,
):
[mu_length] = parameters["mu"].shape
# Make sigma 2D
if len(parameters["sigma"].shape) == 1:
parameters["sigma"] = torch.diag(parameters["sigma"])
# Automatically generate sigma_inv if not provided
if "sigma_inv" not in parameters:
parameters["sigma_inv"] = torch.inverse(parameters["sigma"])
[sigma_length, _] = parameters["sigma"].shape
if solution_length is None:
solution_length = mu_length
else:
if solution_length != mu_length:
raise ValueError(
f"The argument `solution_length` does not match the length of `mu` provided in `parameters`."
f" solution_length={solution_length},"
f' parameters["mu"]={mu_length}.'
)
if mu_length != sigma_length:
raise ValueError(
f"The tensors `mu` and `sigma` provided within `parameters` have mismatching lengths."
f' parameters["mu"]={mu_length},'
f' parameters["sigma"]={sigma_length}.'
)
super().__init__(
solution_length=solution_length,
parameters=parameters,
device=device,
dtype=dtype,
)
# Make identity matrix as this is used throughout in gradient computation
self.eye = self.make_zeros((solution_length, solution_length))
self.eye[range(self.solution_length), range(self.solution_length)] = 1.0
@property
def mu(self) -> torch.Tensor:
"""Getter for mu
Returns:
mu (torch.Tensor): The center of the search distribution
"""
return self.parameters["mu"]
@mu.setter
def mu(self, new_mu: Iterable):
"""Setter for mu
Args:
new_mu (torch.Tensor): The new value of mu
"""
self.parameters["mu"] = torch.as_tensor(new_mu, dtype=self.dtype, device=self.device)
@property
def cov(self) -> torch.Tensor:
"""The covariance matrix A^T A"""
return self.sigma.transpose(0, 1) @ self.sigma
@property
def sigma(self) -> torch.Tensor:
"""Getter for sigma
Returns:
sigma (torch.Tensor): The square root of the covariance matrix
"""
return self.parameters["sigma"]
@property
def sigma_inv(self) -> torch.Tensor:
"""Getter for sigma_inv
Returns:
sigma_inv (torch.Tensor): The inverse square root of the covariance matrix
"""
if "sigma_inv" in self.parameters:
return self.parameters["sigma_inv"]
else:
return torch.inverse(self.parameters["sigma"])
@property
def A(self) -> torch.Tensor:
"""Alias for self.sigma, for notational consistency with paper"""
return self.sigma
@property
def A_inv(self) -> torch.Tensor:
"""Alias for self.sigma_inv, for notational consistency with paper"""
return self.sigma_inv
@sigma.setter
def sigma(self, new_sigma: Iterable):
"""Setter for sigma
Args:
new_sigma (torch.Tensor): The new value of sigma, the square root of the covariance matrix
"""
self.parameters["sigma"] = torch.as_tensor(new_sigma, dtype=self.dtype, device=self.device)
def to_global_coordinates(self, local_coordinates: torch.Tensor) -> torch.Tensor:
"""Map samples from local coordinate space N(0, I_d) to global coordinate space N(mu, A^T A)
This function is the inverse of to_local_coordinates
Args:
local_coordinates (torch.Tensor): The local coordinates sampled from N(0, I_d)
Returns:
global_coordinates (torch.Tensor): The global coordinates sampled from N(mu, A^T A)
"""
# Global samples are constructed as x = mu + A z where z is local coordinate
# We use transpose here to simplify the batched application of A
return self.mu.unsqueeze(0) + (self.A @ local_coordinates.T).T
def to_local_coordinates(self, global_coordinates: torch.Tensor) -> torch.Tensor:
"""Map samples from global coordinate space N(mu, A^T A) to local coordinate space N(0, I_d)
This function is the inverse of to_global_coordinates
Args:
global_coordinates (torch.Tensor): The global coordinates sampled from N(mu, A^T A)
Returns:
local_coordinates (torch.Tensor): The local coordinates sampled from N(0, I_d)
"""
# Global samples are constructed as x = mu + A z where z is local coordinate
# Therefore, we can recover z according to z = A_inv (x - mu)
return (self.A_inv @ (global_coordinates - self.mu.unsqueeze(0)).T).T
def _fill(self, out: torch.Tensor, *, generator: Optional[torch.Generator] = None):
"""Fill a tensor with samples from N(mu, A^T A)
Args:
out (torch.Tensor): The tensor to fill
generator (Optional[torch.Generator]): A generator to use to generate random values
"""
# Fill with local coordinates from N(0, I_d)
self.make_gaussian(out=out, generator=generator)
# Map local coordinates to global coordinate system
out[:] = self.to_global_coordinates(out)
def _compute_gradients(self, samples: torch.Tensor, weights: torch.Tensor, ranking_used: Optional[str]) -> dict:
"""Compute the gradients with respect to a given set of samples and weights
Args:
samples (torch.Tensor): Samples drawn from N(mu, A^T A), ideally using self._fill
weights (torch.Tensor): Weights e.g. fitnesses or utilities assigned to samples
ranking_used (Optional[str]): The ranking method used to compute weights
Returns:
grads (dict): A dictionary containing the approximated natural gradient on d and M
"""
# Compute the local coordinates
local_coordinates = self.to_local_coordinates(samples)
# Make sure that the weights (utilities) are 0-centered
# (Otherwise the formulations would have to consider a bias term)
if ranking_used not in ("centered", "normalized"):
weights = weights - torch.mean(weights)
d_grad = total(dot(weights, local_coordinates))
local_coordinates_outer = local_coordinates.unsqueeze(1) * local_coordinates.unsqueeze(2)
M_grad = torch.sum(
weights.unsqueeze(-1).unsqueeze(-1) * (local_coordinates_outer - self.eye.unsqueeze(0)), dim=0
)
return {
"d": d_grad,
"M": M_grad,
}
def update_parameters(
self,
gradients: dict,
*,
learning_rates: Optional[dict] = None,
optimizers: Optional[dict] = None,
) -> "ExpGaussian":
d_grad = gradients["d"]
M_grad = gradients["M"]
if "d" not in learning_rates:
learning_rates["d"] = learning_rates["mu"]
if "M" not in learning_rates:
learning_rates["M"] = learning_rates["sigma"]
# Follow gradients for d, and M
update_d = self._follow_gradient("d", d_grad, learning_rates=learning_rates, optimizers=optimizers)
update_M = self._follow_gradient("M", M_grad, learning_rates=learning_rates, optimizers=optimizers)
# Fold into parameters mu, A and A inv
new_mu = self.mu + torch.mv(self.A, update_d)
new_A = self.A @ torch.matrix_exp(0.5 * update_M)
new_A_inv = torch.matrix_exp(-0.5 * update_M) @ self.A_inv
# Return modified distribution
return self.modified_copy(mu=new_mu, sigma=new_A, sigma_inv=new_A_inv)
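The `_compute_gradients` method above implements the xNES estimates: with zero-centered utilities `w_i` and local samples `z_i`, the mean direction is `d = sum_i w_i z_i` and the shape direction is `M = sum_i w_i (z_i z_i^T - I)`. A standalone sketch of the same formulas with made-up tensors (plain torch, not the evotorch helpers used in the source):
```python
import torch

torch.manual_seed(0)
n, d = 16, 3
z = torch.randn(n, d)  # local coordinates, as produced by to_local_coordinates
w = torch.randn(n)
w = w - w.mean()       # zero-centered utilities, as enforced in the code above

d_grad = (w.unsqueeze(1) * z).sum(dim=0)  # sum_i w_i z_i
outer = z.unsqueeze(2) * z.unsqueeze(1)   # batched outer products z_i z_i^T
M_grad = (w.view(-1, 1, 1) * (outer - torch.eye(d))).sum(dim=0)
print(d_grad.shape, M_grad.shape)         # torch.Size([3]) torch.Size([3, 3])
```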
A: Tensor
property
readonly
¶
Alias for self.sigma, for notational consistency with paper
A_inv: Tensor
property
readonly
¶
Alias for self.sigma_inv, for notational consistency with paper
cov: Tensor
property
readonly
¶
The covariance matrix A^T A
mu: Tensor
property
writable
¶
Getter for mu
Returns:
Type | Description
---|---
mu (torch.Tensor) | The center of the search distribution
sigma: Tensor
property
writable
¶
Getter for sigma
Returns:
Type | Description
---|---
sigma (torch.Tensor) | The square root of the covariance matrix
sigma_inv: Tensor
property
readonly
¶
Getter for sigma_inv
Returns:
Type | Description
---|---
sigma_inv (torch.Tensor) | The inverse square root of the covariance matrix
to_global_coordinates(self, local_coordinates)
¶
Map samples from local coordinate space N(0, I_d) to global coordinate space N(mu, A^T A). This function is the inverse of to_local_coordinates.
Parameters:
Name | Type | Description | Default
---|---|---|---
local_coordinates | torch.Tensor | The local coordinates sampled from N(0, I_d) | required
Returns:
Type | Description
---|---
global_coordinates (torch.Tensor) | The global coordinates sampled from N(mu, A^T A)
Source code in evotorch/algorithms/distributed/gaussian.py
def to_global_coordinates(self, local_coordinates: torch.Tensor) -> torch.Tensor:
"""Map samples from local coordinate space N(0, I_d) to global coordinate space N(mu, A^T A)
This function is the inverse of to_local_coordinates
Args:
local_coordinates (torch.Tensor): The local coordinates sampled from N(0, I_d)
Returns:
global_coordinates (torch.Tensor): The global coordinates sampled from N(mu, A^T A)
"""
# Global samples are constructed as x = mu + A z where z is local coordinate
# We use transpose here to simplify the batched application of A
return self.mu.unsqueeze(0) + (self.A @ local_coordinates.T).T
to_local_coordinates(self, global_coordinates)
¶
Map samples from global coordinate space N(mu, A^T A) to local coordinate space N(0, I_d). This function is the inverse of to_global_coordinates.
Parameters:
Name | Type | Description | Default
---|---|---|---
global_coordinates | torch.Tensor | The global coordinates sampled from N(mu, A^T A) | required
Returns:
Type | Description
---|---
local_coordinates (torch.Tensor) | The local coordinates sampled from N(0, I_d)
Source code in evotorch/algorithms/distributed/gaussian.py
def to_local_coordinates(self, global_coordinates: torch.Tensor) -> torch.Tensor:
"""Map samples from global coordinate space N(mu, A^T A) to local coordinate space N(0, I_d)
This function is the inverse of to_global_coordinates
Args:
global_coordinates (torch.Tensor): The global coordinates sampled from N(mu, A^T A)
Returns:
local_coordinates (torch.Tensor): The local coordinates sampled from N(0, I_d)
"""
# Global samples are constructed as x = mu + A z where z is local coordinate
# Therefore, we can recover z according to z = A_inv (x - mu)
return (self.A_inv @ (global_coordinates - self.mu.unsqueeze(0)).T).T
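Since the two maps above are exact inverses of each other, round-tripping a batch of samples recovers it up to floating-point error. A minimal, self-contained sketch of the same affine transforms with plain torch tensors (no ExpGaussian instance is constructed; `mu` and `A` are arbitrary illustrative values):
```python
import torch

torch.manual_seed(0)
d = 3
mu = torch.randn(d)
A = torch.randn(d, d) + 3.0 * torch.eye(d)    # a well-conditioned square-root factor
A_inv = torch.inverse(A)

z = torch.randn(8, d)                          # local coordinates ~ N(0, I_d)
x = mu.unsqueeze(0) + (A @ z.T).T              # to_global_coordinates
z_back = (A_inv @ (x - mu.unsqueeze(0)).T).T   # to_local_coordinates

print(torch.allclose(z, z_back, atol=1e-5))    # True: the maps are inverses
```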
update_parameters(self, gradients, *, learning_rates=None, optimizers=None)
¶
Do an update on the distribution by following the given gradients.
It is expected that the inheriting class has its own implementation for this method.
Parameters:
Name | Type | Description | Default
---|---|---|---
gradients | dict | Gradients, as a dictionary, which will be used for computing the necessary updates. | required
learning_rates | Optional[dict] | A dictionary which contains learning rates for parameters that will be updated using a learning rate coefficient. | None
optimizers | Optional[dict] | A dictionary which contains optimizer objects for parameters that will be updated using an adaptive optimizer. | None
Returns:
Type | Description
---|---
ExpGaussian | The updated copy of the distribution.
Source code in evotorch/algorithms/distributed/gaussian.py
def update_parameters(
self,
gradients: dict,
*,
learning_rates: Optional[dict] = None,
optimizers: Optional[dict] = None,
) -> "ExpGaussian":
d_grad = gradients["d"]
M_grad = gradients["M"]
if "d" not in learning_rates:
learning_rates["d"] = learning_rates["mu"]
if "M" not in learning_rates:
learning_rates["M"] = learning_rates["sigma"]
# Follow gradients for d, and M
update_d = self._follow_gradient("d", d_grad, learning_rates=learning_rates, optimizers=optimizers)
update_M = self._follow_gradient("M", M_grad, learning_rates=learning_rates, optimizers=optimizers)
# Fold into parameters mu, A and A inv
new_mu = self.mu + torch.mv(self.A, update_d)
new_A = self.A @ torch.matrix_exp(0.5 * update_M)
new_A_inv = torch.matrix_exp(-0.5 * update_M) @ self.A_inv
# Return modified distribution
return self.modified_copy(mu=new_mu, sigma=new_A, sigma_inv=new_A_inv)
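The multiplicative update `A @ matrix_exp(0.5 * update_M)` is what makes this exponential parameterization attractive: the matrix exponential of any square matrix is invertible, so `A` can never collapse to a singular matrix, and the tracked inverse factor is updated in lockstep without an explicit matrix inversion. A standalone sketch of this folding step with a made-up gradient (illustrative values only):
```python
import torch

torch.manual_seed(1)
d = 3
A = torch.eye(d)      # current square-root factor of the covariance
A_inv = torch.eye(d)  # its inverse, tracked separately for numerical stability

# A made-up symmetric natural-gradient step on M, scaled by a learning rate.
M_grad = torch.randn(d, d)
update_M = 0.1 * 0.5 * (M_grad + M_grad.T)

# The same folding as in update_parameters above.
new_A = A @ torch.matrix_exp(0.5 * update_M)
new_A_inv = torch.matrix_exp(-0.5 * update_M) @ A_inv

# new_A_inv remains the inverse of new_A (up to numerical error).
print(torch.allclose(new_A @ new_A_inv, torch.eye(d), atol=1e-5))
```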
__init__(self, problem, *, stdev_init=None, radius_init=None, popsize=None, center_learning_rate=None, stdev_learning_rate=None, scale_learning_rate=True, num_interactions=None, popsize_max=None, optimizer=None, optimizer_config=None, ranking_method='nes', center_init=None, obj_index=None, distributed=False, popsize_weighted_grad_avg=None)
special
¶
__init__(...)
: Initialize the XNES algorithm.
Parameters:
Name | Type | Description | Default
---|---|---|---
problem | Problem | The problem object which is being worked on. | required
stdev_init | Union[float, Iterable[float], torch.Tensor] | The initial standard deviation of the search distribution, expressed as a scalar or as an array. Determines the initial coverage area of the search distribution. If one wishes to configure the coverage area via the argument `radius_init` instead, then `stdev_init` is expected as None. | None
radius_init | Union[float, Iterable[float], torch.Tensor] | The initial radius of the search distribution, expressed as a scalar. Determines the initial coverage area of the search distribution. Here, "radius" is defined as the norm of the search distribution. If one wishes to configure the coverage area via the argument `stdev_init` instead, then `radius_init` is expected as None. | None
popsize | Optional[int] | Population size. Can be specified as an int, or can be left as None to let the solver decide. In the case of XNES, `popsize` can be left as None, in which case the default `popsize` will be computed as `4 + floor(3 * log(n))` where `n` is the length of a solution. | None
center_learning_rate | Optional[float] | Learning rate for updating the mean of the search distribution. The default value is 1.0. | None
stdev_learning_rate | Optional[float] | Learning rate for updating the covariance matrix of the search distribution. The default value is `0.6 * (3 + log(n)) / (n * sqrt(n))` where `n` is the length of a solution. | None
scale_learning_rate | bool | For XNES, there is a default standard deviation learning rate value, computed as `0.6 * (3 + log(n)) / (n * sqrt(n))` (where `n` is the solution length). If `scale_learning_rate` is True (which is the default), then the effective learning rate for the standard deviation becomes the provided `stdev_learning_rate` multiplied by this default value. If `scale_learning_rate` is False, then the effective standard deviation learning rate becomes equal to the provided `stdev_learning_rate` value. | True
num_interactions | Optional[int] | When given as an integer n, it is ensured that a population has interacted with the GymProblem's environment n times. If this target has not been reached yet, then the population is declared too small, and gets extended with more samples, until n interactions are reached. When given as None, popsize is the only configuration affecting the size of a population. | None
popsize_max | Optional[int] | Having `num_interactions` set as an integer might cause the effective population size to jump to unnecessarily large numbers. To prevent this, one can set `popsize_max` to specify an upper bound for the effective population size. | None
optimizer | | The optimizer to be used while following the estimated gradients. Can be given as None (the default) if a momentum-based optimizer is not required. Otherwise, can be given as a str containing the name of the optimizer (e.g. 'adam', 'clipup'), or as an instance of evotorch.optimizers.TorchOptimizer or evotorch.optimizers.ClipUp. Note that, for ClipUp, the default maximum speed is set as twice the given `center_learning_rate`; this maximum speed can be configured by passing `{"max_speed": ...}` to `optimizer_config`. | None
optimizer_config | Optional[dict] | Configuration which will be passed to the optimizer as keyword arguments. See `evotorch.optimizers` for details about which optimizer accepts which keyword arguments. | None
ranking_method | Optional[str] | Which ranking method will be used for fitness shaping. See the documentation of `evotorch.ranking.rank(...)` for details. The default is 'nes'; the study of Salimans et al. (2017) used 'centered'. Can be given as None if no such ranking is required. | 'nes'
center_init | Union[float, Iterable[float], torch.Tensor] | The initial center solution. Can be left as None. | None
obj_index | Optional[int] | Index of the objective according to which the gradient estimations will be done. For single-objective problems, this can be left as None. | None
distributed | bool | Whether or not the gradient computation will be distributed. If False, the population is created in the main process (and, for parallelized problems, only the evaluations are delegated to remote workers). If True, each remote actor generates its own sub-population and computes its own gradient, and the main process averages these gradients to update the search distribution. | False
popsize_weighted_grad_avg | Optional[bool] | Only to be used in distributed mode (i.e. when `distributed` is True). Determines whether the per-actor gradients are averaged with weights proportional to each actor's sub-population size (True), with equal weights (False), or with the weighting chosen automatically depending on `num_interactions` (None). | None
Source code in evotorch/algorithms/distributed/gaussian.py
def __init__(
self,
problem: Problem,
*,
stdev_init: Optional[RealOrVector] = None,
radius_init: Optional[RealOrVector] = None,
popsize: Optional[int] = None,
center_learning_rate: Optional[float] = None,
stdev_learning_rate: Optional[float] = None,
scale_learning_rate: bool = True,
num_interactions: Optional[int] = None,
popsize_max: Optional[int] = None,
optimizer=None,
optimizer_config: Optional[dict] = None,
ranking_method: Optional[str] = "nes",
center_init: Optional[RealOrVector] = None,
obj_index: Optional[int] = None,
distributed: bool = False,
popsize_weighted_grad_avg: Optional[bool] = None,
):
"""
`__init__(...)`: Initialize the XNES algorithm.
Args:
problem: The problem object which is being worked on.
stdev_init: The initial standard deviation of the search
distribution, expressed as a scalar or as an array.
Determines the initial coverage area of the search
distribution.
If one wishes to configure the coverage area via the
argument `radius_init` instead, then `stdev_init` is expected
as None.
radius_init: The initial radius of the search distribution,
expressed as a scalar.
Determines the initial coverage area of the search
distribution.
Here, "radius" is defined as the norm of the search
distribution.
If one wishes to configure the coverage area via the
argument `stdev_init` instead, then `radius_init` is expected
as None.
popsize: Population size. Can be specified as an int,
or can be left as None to let the solver decide.
In the case of XNES, `popsize` can be left as None,
in which case the default `popsize` will be computed
as `4 + floor(3 * log(n))` where `n` is the length
of a solution.
center_learning_rate: Learning rate for updating the mean
of the search distribution. The default value is 1.0.
stdev_learning_rate: Learning rate for updating the covariance
matrix of the search distribution.
The default value is `0.6 * (3 + log(n)) / (n * sqrt(n))`
where `n` is the length of a solution.
scale_learning_rate: For XNES, there is a default standard
deviation learning rate value which is computed as
`0.6 * (3 + log(n)) / (n * sqrt(n))` (where `n` is the solution
length).
If scale_learning_rate is True (which is the default),
then the effective learning rate for the standard deviation
becomes the provided `stdev_learning_rate` multiplied by this
default value. If `scale_learning_rate` is False, then the
effective standard deviation learning rate becomes
equal to the provided `stdev_learning_rate` value.
num_interactions: When given as an integer n,
it is ensured that a population has interacted with
the GymProblem's environment n times. If this target
has not been reached yet, then the population is declared
too small, and gets extended with more samples,
until n interactions are reached.
When given as None, popsize is the only configuration
affecting the size of a population.
popsize_max: Having `num_interactions` set as an integer
might cause the effective population size to jump to
unnecessarily large numbers. To prevent this,
one can set `popsize_max` to specify an upper
bound for the effective population size.
optimizer: The optimizer to be used while following the
estimated gradients.
Can be given as None if a momentum-based optimizer
is not required.
Otherwise, can be given as a str containing the name
of the optimizer (e.g. 'adam', 'clipup');
or as an instance of evotorch.optimizers.TorchOptimizer
or evotorch.optimizers.ClipUp.
The default is None; the study of Salimans et al. (2017)
used 'clipup'.
Note that, for ClipUp, the default maximum speed is set
as twice the given `center_learning_rate`.
This maximum speed can be configured by passing
`{"max_speed": ...}` to `optimizer_config`.
optimizer_config: Configuration which will be passed
to the optimizer as keyword arguments.
See `evotorch.optimizers` for details about
which optimizer accepts which keyword arguments.
ranking_method: Which ranking method will be used for
fitness shaping. See the documentation of
`evotorch.ranking.rank(...)` for details.
The default is 'nes'; the study of Salimans et al. (2017)
used 'centered'.
Can be given as None if no such ranking is required.
center_init: The initial center solution.
Can be left as None.
obj_index: Index of the objective according to which the
gradient estimations will be done.
For single-objective problems, this can be left as None.
distributed: Whether or not the gradient computation will
be distributed. If `distributed` is given as False and
the problem is not parallelized, then everything will
be centralized (i.e. the entire computation will happen
in the main process).
If `distributed` is given as False, and the problem
is parallelized, then the population will be created
in the main process and then sent to remote workers
for parallelized evaluation, and then the remote fitnesses
will be collected by the main process again for computing
the search gradients.
If `distributed` is given as True, and the problem
is parallelized, then the search algorithm itself will
be distributed, in the sense that each remote actor will
generate its own population (such that the total population
size across all these actors becomes equal to `popsize`)
and will compute its own gradient, and then the main process
will collect these gradients, compute the averaged gradients
and update the main search distribution.
Non-distributed mode has the advantage of keeping the
population in the main process, which is good when one wishes
to do detailed monitoring during the evolutionary process,
but has the disadvantage of having to pass the solutions to
the remote actors and having to collect fitnesses, which
might result in increased interprocess communication traffic.
On the other hand, while it is not possible to monitor the
population in distributed mode, the distributed mode has the
advantage of significantly reducing the interprocess
communication traffic, since the only things communicated
with the remote actors are the search distributions (not the
solutions) and the gradients.
popsize_weighted_grad_avg: Only to be used in distributed mode.
(where being in distributed mode means `distributed` is given
as True). In distributed mode, each actor remotely samples
its own solution batches and computes its own gradients.
These gradients are then collected, and a final average
gradient is computed.
If `popsize_weighted_grad_avg` is True, then, while averaging
over the gradients, each gradient will have its own weight
that is computed according to how many solutions were sampled
by the actor that produced the gradient.
If `popsize_weighted_grad_avg` is False, then there will be
no weighted averaging (i.e. each gradient will have equal
weight).
If `popsize_weighted_grad_avg` is None, then the gradient
weights will be equal if a value for `num_interactions` is
given (because `num_interactions` affects the number of
solutions according to the episode lengths, and
popsize-weighting the gradients could be misleading); and
the gradients will be weighted according to the
sub-population (i.e. sub-batch) sizes if `num_interactions`
is left as None.
The default value for `popsize_weighted_grad_avg` is None.
When the distributed mode is disabled (i.e. when `distributed`
is False), then the argument `popsize_weighted_grad_avg` is
expected as None.
"""
if popsize is None:
popsize = int(4 + math.floor(3 * math.log(problem.solution_length)))
if center_learning_rate is None:
center_learning_rate = 1.0
def default_stdev_lr():
n = problem.solution_length
return 0.6 * (3 + math.log(n)) / (n * math.sqrt(n))
if stdev_learning_rate is None:
stdev_learning_rate = default_stdev_lr()
else:
stdev_learning_rate = float(stdev_learning_rate)
if scale_learning_rate:
stdev_learning_rate *= default_stdev_lr()
super().__init__(
problem,
popsize=popsize,
center_learning_rate=center_learning_rate,
stdev_learning_rate=stdev_learning_rate,
stdev_init=stdev_init,
radius_init=radius_init,
popsize_max=popsize_max,
num_interactions=num_interactions,
optimizer=optimizer,
optimizer_config=optimizer_config,
ranking_method=ranking_method,
center_init=center_init,
stdev_min=None,
stdev_max=None,
stdev_max_change=None,
obj_index=obj_index,
distributed=distributed,
popsize_weighted_grad_avg=popsize_weighted_grad_avg,
)
ga
¶
Genetic algorithm variants: SteadyStateGA, Cosyne.
Cosyne (SearchAlgorithm, SinglePopulationAlgorithmMixin)
¶
Implementation of the CoSyNE algorithm.
References:
F. Gomez, J. Schmidhuber, R. Miikkulainen, M. Mitchell (2008).
Accelerated Neural Evolution through Cooperatively Coevolved Synapses.
Journal of Machine Learning Research 9 (5).
Source code in evotorch/algorithms/ga.py
class Cosyne(SearchAlgorithm, SinglePopulationAlgorithmMixin):
"""
Implementation of the CoSyNE algorithm.
References:
F. Gomez, J. Schmidhuber, R. Miikkulainen, M. Mitchell (2008).
Accelerated Neural Evolution through Cooperatively Coevolved Synapses.
Journal of Machine Learning Research 9 (5).
"""
def __init__(
self,
problem: Problem,
*,
popsize: int,
tournament_size: int,
mutation_stdev: Optional[float],
mutation_probability: Optional[float],
permute_all: bool = False,
num_elites: Optional[int] = None,
elitism_ratio: Optional[float] = None,
eta: Optional[float] = None,
num_children: Optional[int] = None,
):
"""
`__init__(...)`: Initialize the Cosyne instance.
Args:
problem: The problem object to work on.
popsize: Population size, as an integer.
tournament_size: Tournament size, for tournament selection.
mutation_stdev: Standard deviation of the Gaussian mutation.
mutation_probability: Elementwise Gaussian mutation probability.
permute_all: If given as True, all solutions are subject to
permutation. If given as False (which is the default),
there will be a selection procedure for each decision
variable.
num_elites: Optionally expected as an integer, specifying the
number of elites to pass to the next generation.
Cannot be used together with the argument `elitism_ratio`.
elitism_ratio: Optionally expected as a real number between
0 and 1, specifying the amount of elites to pass to the
next generation. For example, 0.1 means that the best 10%
of the population are accepted as elites and passed onto
the next generation.
Cannot be used together with the argument `num_elites`.
eta: Optionally expected as an integer, specifying the eta
hyperparameter for the simulated binary cross-over (SBX).
If left as None, one-point cross-over will be used instead.
num_children: Number of children to generate at each iteration.
If left as None, then this number is half of the population
size.
"""
problem.ensure_numeric()
SearchAlgorithm.__init__(self, problem)
if mutation_stdev is None and mutation_probability is None:
self.mutation_op = None
else:
self.mutation_op = GaussianMutation(
self._problem, mutation_probability=float(mutation_probability), stdev=float(mutation_stdev)
)
cross_over_kwargs = {"tournament_size": tournament_size}
if num_children is None:
cross_over_kwargs["cross_over_rate"] = 2.0
else:
cross_over_kwargs["num_children"] = num_children
if eta is None:
self._cross_over_op = OnePointCrossOver(self._problem, **cross_over_kwargs)
else:
self._cross_over_op = SimulatedBinaryCrossOver(self._problem, eta=eta, **cross_over_kwargs)
self._permutation_op = CosynePermutation(self._problem, permute_all=permute_all)
self._popsize = int(popsize)
if num_elites is not None and elitism_ratio is None:
self._num_elites = int(num_elites)
elif num_elites is None and elitism_ratio is not None:
self._num_elites = int(self._popsize * elitism_ratio)
elif num_elites is None and elitism_ratio is None:
self._num_elites = None
else:
raise ValueError(
"Received both `num_elites` and `elitism_ratio`. Please provide only one of them, or none of them."
)
self._population = SolutionBatch(problem, device=problem.device, popsize=self._popsize)
self._first_generation: bool = True
# GAStatusMixin.__init__(self)
SinglePopulationAlgorithmMixin.__init__(self)
@property
def population(self) -> SolutionBatch:
return self._population
def _step(self):
if self._first_generation:
self._first_generation = False
self._problem.evaluate(self._population)
to_merge = []
num_elites = self._num_elites
num_parents = int(self._popsize / 4)
num_relevant = max((0 if num_elites is None else num_elites), num_parents)
sorted_relevant = self._population.take_best(num_relevant)
if self._num_elites is not None and self._num_elites >= 1:
to_merge.append(sorted_relevant[:num_elites].clone())
parents = sorted_relevant[:num_parents]
children = self._cross_over_op(parents)
if self.mutation_op is not None:
children = self.mutation_op(children)
permuted = self._permutation_op(self._population)
to_merge.extend([children, permuted])
extended_population = SolutionBatch(merging_of=to_merge)
self._problem.evaluate(extended_population)
self._population = extended_population.take_best(self._popsize)
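A hedged usage sketch for Cosyne on a toy sphere problem follows; as with the XNES example earlier, the objective function and hyperparameter values are illustrative choices, not recommendations:
```python
import torch
from evotorch import Problem
from evotorch.algorithms import Cosyne

def sphere(x: torch.Tensor) -> torch.Tensor:
    return torch.sum(x**2)

problem = Problem("min", sphere, solution_length=20, initial_bounds=(-1.0, 1.0))

# mutation_stdev and mutation_probability are given together here, since the
# GaussianMutation operator in __init__ above is constructed from both.
searcher = Cosyne(
    problem,
    popsize=50,
    tournament_size=4,
    mutation_stdev=0.3,
    mutation_probability=0.5,
)
searcher.run(30)
print(searcher.status)
```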
__init__(self, problem, *, popsize, tournament_size, mutation_stdev, mutation_probability, permute_all=False, num_elites=None, elitism_ratio=None, eta=None, num_children=None)
special
¶
__init__(...)
: Initialize the Cosyne instance.
Parameters:
Name | Type | Description | Default
---|---|---|---
problem | Problem | The problem object to work on. | required
popsize | int | Population size, as an integer. | required
tournament_size | int | Tournament size, for tournament selection. | required
mutation_stdev | Optional[float] | Standard deviation of the Gaussian mutation. | required
mutation_probability | Optional[float] | Elementwise Gaussian mutation probability. | required
permute_all | bool | If given as True, all solutions are subject to permutation. If given as False (which is the default), there will be a selection procedure for each decision variable. | False
num_elites | Optional[int] | Optionally expected as an integer, specifying the number of elites to pass to the next generation. Cannot be used together with the argument `elitism_ratio`. | None
elitism_ratio | Optional[float] | Optionally expected as a real number between 0 and 1, specifying the amount of elites to pass to the next generation. For example, 0.1 means that the best 10% of the population are accepted as elites and passed onto the next generation. Cannot be used together with the argument `num_elites`. | None
eta | Optional[float] | Optionally expected as a number, specifying the eta hyperparameter for the simulated binary cross-over (SBX). If left as None, one-point cross-over will be used instead. | None
num_children | Optional[int] | Number of children to generate at each iteration. If left as None, then this number is half of the population size. | None
Source code in evotorch/algorithms/ga.py
def __init__(
self,
problem: Problem,
*,
popsize: int,
tournament_size: int,
mutation_stdev: Optional[float],
mutation_probability: Optional[float],
permute_all: bool = False,
num_elites: Optional[int] = None,
elitism_ratio: Optional[float] = None,
eta: Optional[float] = None,
num_children: Optional[int] = None,
):
"""
`__init__(...)`: Initialize the Cosyne instance.
Args:
problem: The problem object to work on.
popsize: Population size, as an integer.
tournament_size: Tournament size, for tournament selection.
mutation_stdev: Standard deviation of the Gaussian mutation.
mutation_probability: Elementwise Gaussian mutation probability.
permute_all: If given as True, all solutions are subject to
permutation. If given as False (which is the default),
there will be a selection procedure for each decision
variable.
num_elites: Optionally expected as an integer, specifying the
number of elites to pass to the next generation.
Cannot be used together with the argument `elitism_ratio`.
elitism_ratio: Optionally expected as a real number between
0 and 1, specifying the amount of elites to pass to the
next generation. For example, 0.1 means that the best 10%
of the population are accepted as elites and passed onto
the next generation.
Cannot be used together with the argument `num_elites`.
eta: Optionally expected as an integer, specifying the eta
hyperparameter for the simulated binary cross-over (SBX).
If left as None, one-point cross-over will be used instead.
num_children: Number of children to generate at each iteration.
If left as None, then this number is half of the population
size.
"""
problem.ensure_numeric()
SearchAlgorithm.__init__(self, problem)
if mutation_stdev is None and mutation_probability is None:
self.mutation_op = None
else:
self.mutation_op = GaussianMutation(
self._problem, mutation_probability=float(mutation_probability), stdev=float(mutation_stdev)
)
cross_over_kwargs = {"tournament_size": tournament_size}
if num_children is None:
cross_over_kwargs["cross_over_rate"] = 2.0
else:
cross_over_kwargs["num_children"] = num_children
if eta is None:
self._cross_over_op = OnePointCrossOver(self._problem, **cross_over_kwargs)
else:
self._cross_over_op = SimulatedBinaryCrossOver(self._problem, eta=eta, **cross_over_kwargs)
self._permutation_op = CosynePermutation(self._problem, permute_all=permute_all)
self._popsize = int(popsize)
if num_elites is not None and elitism_ratio is None:
self._num_elites = int(num_elites)
elif num_elites is None and elitism_ratio is not None:
self._num_elites = int(self._popsize * elitism_ratio)
elif num_elites is None and elitism_ratio is None:
self._num_elites = None
else:
raise ValueError(
"Received both `num_elites` and `elitism_ratio`. Please provide only one of them, or none of them."
)
self._population = SolutionBatch(problem, device=problem.device, popsize=self._popsize)
self._first_generation: bool = True
# GAStatusMixin.__init__(self)
SinglePopulationAlgorithmMixin.__init__(self)
SteadyStateGA (SearchAlgorithm, SinglePopulationAlgorithmMixin)
¶
A fully elitist genetic algorithm implementation.
For multi-objective problems, the instances of this class organize their populations into pareto-fronts, and do pareto-rank-based selections among the solutions, in a compatible way with the NSGA-II algorithm.
References:
Sean Luke, 2013, Essentials of Metaheuristics, Lulu, second edition
available for free at http://cs.gmu.edu/~sean/book/metaheuristics/
Kalyanmoy Deb, Amrit Pratap, Sameer Agarwal, T. Meyarivan (2002).
A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II.
Source code in evotorch/algorithms/ga.py
class SteadyStateGA(SearchAlgorithm, SinglePopulationAlgorithmMixin):
"""
A fully elitist genetic algorithm implementation.
For multi-objective problems, the instances of this class
organize their populations into pareto-fronts, and
do pareto-rank-based selections among the solutions,
in a compatible way with the NSGA-II algorithm.
References:
Sean Luke, 2013, Essentials of Metaheuristics, Lulu, second edition
available for free at http://cs.gmu.edu/~sean/book/metaheuristics/
Kalyanmoy Deb, Amrit Pratap, Sameer Agarwal, T. Meyarivan (2002).
A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II.
"""
def __init__(self, problem: Problem, *, popsize: int, re_evaluate: bool = True):
"""
`__init__(...)`: Initialize the SteadyStateGA.
Args:
problem: The problem to optimize.
popsize: Population size.
re_evaluate: Whether or not to evaluate the solutions
that were already evaluated in the previous generations.
By default, this is set as True.
The reason behind this default setting is that,
in problems where the evaluation procedure is noisy,
by re-evaluating the already-evaluated solutions,
we prevent the bad solutions that were luckily evaluated
from hanging onto the population.
Instead, at every generation, each solution must go through
the evaluation procedure again and prove their worth.
For problems whose evaluation procedures are NOT noisy,
the user might consider turning re_evaluate to False
for saving computational cycles.
"""
SearchAlgorithm.__init__(self, problem)
self._mutation_op: Optional[Callable] = None
self._cross_over_op: Optional[Callable] = None
self._popsize = int(popsize)
self._first_iter: bool = True
self._re_eval = bool(re_evaluate)
self._population = problem.generate_batch(self._popsize)
# GAStatusMixin.__init__(self)
SinglePopulationAlgorithmMixin.__init__(self)
@property
def population(self) -> SolutionBatch:
return self._population
def use(self, operator: Callable):
"""
Use the specified operator.
If the specified operator is a CrossOver instance, then that operator
is registered as the cross-over operator. Otherwise, the operator
is registered as the mutation operator.
Args:
operator: The operator to use.
"""
if isinstance(operator, CrossOver):
self._cross_over_op = operator
else:
self._mutation_op = operator
def _step(self):
if self._first_iter or self._re_eval:
self.problem.evaluate(self._population)
self._first_iter = False
children = self._cross_over_op(self._population)
if self._mutation_op is None:
mutated = children
else:
mutated = self._mutation_op(children)
if mutated is None:
mutated = children
self.problem.evaluate(mutated)
extended = self._population.concat(mutated)
self._population = extended.take_best(self._popsize)
__init__(self, problem, *, popsize, re_evaluate=True)
special
¶
__init__(...)
: Initialize the SteadyStateGA.
Parameters:
Name | Type | Description | Default
---|---|---|---
problem | Problem | The problem to optimize. | required
popsize | int | Population size. | required
re_evaluate | bool | Whether or not to evaluate the solutions that were already evaluated in the previous generations. By default, this is set as True. The reason behind this default setting is that, in problems where the evaluation procedure is noisy, by re-evaluating the already-evaluated solutions, we prevent the bad solutions that were luckily evaluated from hanging onto the population. Instead, at every generation, each solution must go through the evaluation procedure again and prove their worth. For problems whose evaluation procedures are NOT noisy, the user might consider turning re_evaluate to False for saving computational cycles. | True
Source code in evotorch/algorithms/ga.py
def __init__(self, problem: Problem, *, popsize: int, re_evaluate: bool = True):
"""
`__init__(...)`: Initialize the SteadyStateGA.
Args:
problem: The problem to optimize.
popsize: Population size.
re_evaluate: Whether or not to evaluate the solutions
that were already evaluated in the previous generations.
By default, this is set as True.
The reason behind this default setting is that,
in problems where the evaluation procedure is noisy,
by re-evaluating the already-evaluated solutions,
we prevent the bad solutions that were luckily evaluated
from hanging onto the population.
Instead, at every generation, each solution must go through
the evaluation procedure again and prove their worth.
For problems whose evaluation procedures are NOT noisy,
the user might consider turning re_evaluate to False
for saving computational cycles.
"""
SearchAlgorithm.__init__(self, problem)
self._mutation_op: Optional[Callable] = None
self._cross_over_op: Optional[Callable] = None
self._popsize = int(popsize)
self._first_iter: bool = True
self._re_eval = bool(re_evaluate)
self._population = problem.generate_batch(self._popsize)
# GAStatusMixin.__init__(self)
SinglePopulationAlgorithmMixin.__init__(self)
use(self, operator)
¶
Use the specified operator.
If the specified operator is a CrossOver instance, then that operator is registered as the cross-over operator. Otherwise, the operator is registered as the mutation operator.
Parameters:
Name | Type | Description | Default
---|---|---|---
operator | Callable | The operator to use. | required
Source code in evotorch/algorithms/ga.py
def use(self, operator: Callable):
"""
Use the specified operator.
If the specified operator is a CrossOver instance, then that operator
is registered as the cross-over operator. Otherwise, the operator
is registered as the mutation operator.
Args:
operator: The operator to use.
"""
if isinstance(operator, CrossOver):
self._cross_over_op = operator
else:
self._mutation_op = operator
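Putting `use(...)` together with the operator classes referenced in the Cosyne source earlier, here is a hedged end-to-end sketch; the operator hyperparameters are illustrative, and the standard evotorch entry points (`Problem`, `run(...)`, `status`) are assumed:
```python
import torch
from evotorch import Problem
from evotorch.algorithms import SteadyStateGA
from evotorch.operators import GaussianMutation, OnePointCrossOver

def sphere(x: torch.Tensor) -> torch.Tensor:
    return torch.sum(x**2)

problem = Problem("min", sphere, solution_length=10, initial_bounds=(-1.0, 1.0))

ga = SteadyStateGA(problem, popsize=100)
# A CrossOver instance registers as the cross-over operator...
ga.use(OnePointCrossOver(problem, tournament_size=4))
# ...and any other callable registers as the mutation operator.
ga.use(GaussianMutation(problem, stdev=0.1))
ga.run(50)
print(ga.status)
```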
searchalgorithm
¶
This namespace contains SearchAlgorithm, the base class for all
evolutionary algorithms.
LazyReporter
¶
This class provides an interface for storing and reporting status, and is designed to be inherited by other classes.
Let us assume that we have the following class inheriting from LazyReporter:
class Example(LazyReporter):
    def __init__(self):
        LazyReporter.__init__(self, a=self._get_a, b=self._get_b)

    def _get_a(self):
        return ...  # return the status 'a'

    def _get_b(self):
        return ...  # return the status 'b'
At its initialization phase, this Example class registers its methods
_get_a and _get_b as its status providers.
Having the LazyReporter interface, the Example class gains a status
property:
ex = Example()
print(ex.status["a"])  # Get the status 'a'
print(ex.status["b"])  # Get the status 'b'
Once a status is queried, its computation result is stored to be re-used later. After running the code above, if we query the status 'a' again:
print(ex.status["a"])  # Getting the status 'a' again
then the status 'a' is not computed again (i.e. _get_a is not
called again). Instead, the stored status value of 'a' is re-used.
To force re-computation of the status values, one can execute:
ex.clear_status()
Or the Example instance can clear its status from within one of its methods:
class Example(LazyReporter):
    ...

    def some_method(self):
        ...
        self.clear_status()
Source code in evotorch/algorithms/searchalgorithm.py
class LazyReporter:
"""
This class provides an interface of storing and reporting status.
This class is designed to be inherited by other classes.
Let us assume that we have the following class inheriting from
LazyReporter:
```python
class Example(LazyReporter):
def __init__(self):
LazyReporter.__init__(self, a=self._get_a, b=self._get_b)
def _get_a(self):
return ... # return the status 'a'
def _get_b(self):
return ... # return the status 'b'
```
At its initialization phase, this Example class registers its methods
``_get_a`` and ``_get_b`` as its status providers.
Having the LazyReporter interface, the Example class gains a ``status``
property:
```python
ex = Example()
print(ex.status["a"]) # Get the status 'a'
print(ex.status["b"]) # Get the status 'b'
```
Once a status is queried, its computation result is stored to be re-used
later. After running the code above, if we query the status 'a' again:
```python
print(ex.status["a"]) # Getting the status 'a' again
```
then the status 'a' is not computed again (i.e. ``_get_a`` is not
called again). Instead, the stored status value of 'a' is re-used.
To force re-computation of the status values, one can execute:
```python
ex.clear_status()
```
Or the Example instance can clear its status from within one of its
methods:
```python
class Example(LazyReporter):
...
def some_method(self):
...
self.clear_status()
```
"""
@staticmethod
def _missing_status_producer():
return None
def __init__(self, **kwargs):
"""
`__init__(...)`: Initialize the LazyReporter instance.
Args:
kwargs: Keyword arguments, mapping the status keys to the
methods or functions providing the status values.
"""
self.__getters = kwargs
self.__computed = {}
def get_status_value(self, key: Any) -> Any:
"""
Get the specified status value.
Args:
key: The key (i.e. the name) of the status variable.
"""
if key not in self.__computed:
self.__computed[key] = self.__getters[key]()
return self.__computed[key]
def has_status_key(self, key: Any) -> bool:
"""
Return True if there is a status variable with the specified key.
Otherwise, return False.
Args:
key: The key (i.e. the name) of the status variable whose
existence is to be checked.
Returns:
True if there is such a key; False otherwise.
"""
return key in self.__getters
def iter_status_keys(self):
"""Iterate over the status keys."""
return self.__getters.keys()
def clear_status(self):
"""Clear all the stored values of the status variables."""
self.__computed.clear()
def is_status_computed(self, key) -> bool:
"""
Return True if the specified status is computed yet.
Return False otherwise.
Args:
key: The key (i.e. the name) of the status variable.
Returns:
True if the status of the given key is computed; False otherwise.
"""
return key in self.__computed
def update_status(self, additional_status: Mapping):
"""
Update the stored status with an external dict-like object.
The given dict-like object can override existing status keys
with new values, and also bring new keys to the status.
Args:
additional_status: A dict-like object storing the status update.
"""
for k, v in additional_status.items():
if k not in self.__getters:
self.__getters[k] = LazyReporter._missing_status_producer
self.__computed[k] = v
def add_status_getters(self, getters: Mapping):
"""
Register additional status-getting functions.
Args:
getters: A dictionary-like object where the keys are the
additional status variable names, and values are functions
which are expected to compute/retrieve the values for those
status variables.
"""
self.__getters.update(getters)
@property
def status(self) -> "LazyStatusDict":
"""Get a LazyStatusDict which is bound to this LazyReporter."""
return LazyStatusDict(self)
status: LazyStatusDict
property
readonly
¶
Get a LazyStatusDict which is bound to this LazyReporter.
__init__(self, **kwargs)
special
¶
`__init__(...)`: Initialize the LazyReporter instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
kwargs | | Keyword arguments, mapping the status keys to the methods or functions providing the status values. | {} |
Source code in evotorch/algorithms/searchalgorithm.py
def __init__(self, **kwargs):
"""
`__init__(...)`: Initialize the LazyReporter instance.
Args:
kwargs: Keyword arguments, mapping the status keys to the
methods or functions providing the status values.
"""
self.__getters = kwargs
self.__computed = {}
add_status_getters(self, getters)
¶
Register additional status-getting functions.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
getters | Mapping | A dictionary-like object where the keys are the additional status variable names, and values are functions which are expected to compute/retrieve the values for those status variables. | required |
Source code in evotorch/algorithms/searchalgorithm.py
def add_status_getters(self, getters: Mapping):
"""
Register additional status-getting functions.
Args:
getters: A dictionary-like object where the keys are the
additional status variable names, and values are functions
which are expected to compute/retrieve the values for those
status variables.
"""
self.__getters.update(getters)
clear_status(self)
¶
Clear all the stored values of the status variables.
Source code in evotorch/algorithms/searchalgorithm.py
def clear_status(self):
"""Clear all the stored values of the status variables."""
self.__computed.clear()
get_status_value(self, key)
¶
Get the specified status value.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
key | Any | The key (i.e. the name) of the status variable. | required |
Source code in evotorch/algorithms/searchalgorithm.py
def get_status_value(self, key: Any) -> Any:
"""
Get the specified status value.
Args:
key: The key (i.e. the name) of the status variable.
"""
if key not in self.__computed:
self.__computed[key] = self.__getters[key]()
return self.__computed[key]
has_status_key(self, key)
¶
Return True if there is a status variable with the specified key. Otherwise, return False.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
key | Any | The key (i.e. the name) of the status variable whose existence is to be checked. | required |
Returns:
Type | Description |
---|---|
bool | True if there is such a key; False otherwise. |
Source code in evotorch/algorithms/searchalgorithm.py
def has_status_key(self, key: Any) -> bool:
"""
Return True if there is a status variable with the specified key.
Otherwise, return False.
Args:
key: The key (i.e. the name) of the status variable whose
existence is to be checked.
Returns:
True if there is such a key; False otherwise.
"""
return key in self.__getters
is_status_computed(self, key)
¶
Return True if the specified status has already been computed; return False otherwise.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
key | | The key (i.e. the name) of the status variable. | required |
Returns:
Type | Description |
---|---|
bool | True if the status of the given key is computed; False otherwise. |
Source code in evotorch/algorithms/searchalgorithm.py
def is_status_computed(self, key) -> bool:
"""
Return True if the specified status has already been computed.
Return False otherwise.
Args:
key: The key (i.e. the name) of the status variable.
Returns:
True if the status of the given key is computed; False otherwise.
"""
return key in self.__computed
iter_status_keys(self)
¶
Iterate over the status keys.
Source code in evotorch/algorithms/searchalgorithm.py
def iter_status_keys(self):
"""Iterate over the status keys."""
return self.__getters.keys()
update_status(self, additional_status)
¶
Update the stored status with an external dict-like object. The given dict-like object can override existing status keys with new values, and also bring new keys to the status.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
additional_status | Mapping | A dict-like object storing the status update. | required |
Source code in evotorch/algorithms/searchalgorithm.py
def update_status(self, additional_status: Mapping):
"""
Update the stored status with an external dict-like object.
The given dict-like object can override existing status keys
with new values, and also bring new keys to the status.
Args:
additional_status: A dict-like object storing the status update.
"""
for k, v in additional_status.items():
if k not in self.__getters:
self.__getters[k] = LazyReporter._missing_status_producer
self.__computed[k] = v
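As a quick illustration of the behavior above, the following sketch (re-using the illustrative Example class from earlier) shows `update_status` overriding a cached value and introducing a brand-new key:
```python
ex = Example()
ex.update_status({"a": 42, "extra": "hello"})
print(ex.status["a"])  # 42: the externally stored value overrides the getter
print(ex.status["extra"])  # hello: a new key brought in by update_status
ex.clear_status()
print(ex.status["extra"])  # None: the key now maps to the missing-status producer
```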
LazyStatusDict (Mapping)
¶
A Mapping subclass used by the `status` property of a `LazyReporter`.
The interface of this object is similar to a read-only dictionary.
Source code in evotorch/algorithms/searchalgorithm.py
class LazyStatusDict(Mapping):
"""
A Mapping subclass used by the `status` property of a `LazyReporter`.
The interface of this object is similar to a read-only dictionary.
"""
def __init__(self, lazy_reporter: LazyReporter):
"""
`__init__(...)`: Initialize the LazyStatusDict object.
Args:
lazy_reporter: The LazyReporter object whose status is to be
accessed.
"""
super().__init__()
self.__lazy_reporter = lazy_reporter
def __getitem__(self, key: Any) -> Any:
result = self.__lazy_reporter.get_status_value(key)
if isinstance(result, (torch.Tensor, ObjectArray)):
result = as_read_only_tensor(result)
return result
def __len__(self) -> int:
return len(list(self.__lazy_reporter.iter_status_keys()))
def __iter__(self):
for k in self.__lazy_reporter.iter_status_keys():
yield k
def __contains__(self, key: Any) -> bool:
return self.__lazy_reporter.has_status_key(key)
def _to_string(self) -> str:
with io.StringIO() as f:
print("<" + type(self).__name__, file=f)
for k in self.__lazy_reporter.iter_status_keys():
if self.__lazy_reporter.is_status_computed(k):
r = repr(self.__lazy_reporter.get_status_value(k))
else:
r = "<not yet computed>"
print(" ", k, "=", r, file=f)
print(">", end="", file=f)
f.seek(0)
entire_str = f.read()
return entire_str
def __str__(self) -> str:
return self._to_string()
def __repr__(self) -> str:
return self._to_string()
__init__(self, lazy_reporter)
special
¶
`__init__(...)`: Initialize the LazyStatusDict object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
lazy_reporter | LazyReporter | The LazyReporter object whose status is to be accessed. | required |
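Because a LazyStatusDict defers computation to its bound LazyReporter, printing it reveals which statuses have been computed so far. A small sketch, again using the illustrative Example class from above (output format approximated):
```python
ex = Example()
print(ex.status)
# <LazyStatusDict
#    a = <not yet computed>
# >
_ = ex.status["a"]
print(ex.status)
# <LazyStatusDict
#    a = 'value of a'
# >
```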
SearchAlgorithm (LazyReporter)
¶
Base class for all evolutionary search algorithms.
An algorithm developer is expected to inherit from this base class, and override the method named `_step()` to define how a single step of this new algorithm is performed.
For each core status dictionary element, a new method is expected to exist within the inheriting class. These status reporting methods are then registered via the keyword arguments of the `__init__(...)` method of `SearchAlgorithm`.
To sum up, a newly developed algorithm inheriting from this base class is expected to have the following structure:
from evotorch import Problem
class MyNewAlgorithm(SearchAlgorithm):
def __init__(self, problem: Problem):
SearchAlgorithm.__init__(
self, problem, status1=self._get_status1, status2=self._get_status2, ...
)
def _step(self):
# Code that defines how a step of this algorithm
# should work goes here.
...
def _get_status1(self):
# The value returned by this function will be shown
# in the status dictionary, associated with the key
# 'status1'.
return ...
def _get_status2(self):
# The value returned by this function will be shown
# in the status dictionary, associated with the key
# 'status2'.
return ...
Source code in evotorch/algorithms/searchalgorithm.py
class SearchAlgorithm(LazyReporter):
"""
Base class for all evolutionary search algorithms.
An algorithm developer is expected to inherit from this base class,
and override the method named `_step()` to define how a single
step of this new algorithm is performed.
For each core status dictionary element, a new method is expected
to exist within the inheriting class. These status reporting
methods are then registered via the keyword arguments of the
`__init__(...)` method of `SearchAlgorithm`.
To sum up, a newly developed algorithm inheriting from this base
class is expected to have the following structure:
```python
from evotorch import Problem
class MyNewAlgorithm(SearchAlgorithm):
def __init__(self, problem: Problem):
SearchAlgorithm.__init__(
self, problem, status1=self._get_status1, status2=self._get_status2, ...
)
def _step(self):
# Code that defines how a step of this algorithm
# should work goes here.
...
def _get_status1(self):
# The value returned by this function will be shown
# in the status dictionary, associated with the key
# 'status1'.
return ...
def _get_status2(self):
# The value returned by this function will be shown
# in the status dictionary, associated with the key
# 'status2'.
return ...
```
"""
def __init__(self, problem: Problem, **kwargs):
"""
Initialize the SearchAlgorithm instance.
Args:
problem: Problem to work with.
kwargs: Any additional keyword argument, in the form of `k=f`,
is accepted in this manner: for each pair of `k` and `f`,
`k` is accepted as the status key (i.e. a status variable
name), and `f` is accepted as a function (probably a method
of the inheriting class) that will generate the value of that
status variable.
"""
super().__init__(**kwargs)
self._problem = problem
self._before_step_hook = Hook()
self._after_step_hook = Hook()
self._log_hook = Hook()
self._steps_count: int = 0
@property
def problem(self) -> Problem:
"""
The problem object which is being worked on.
"""
return self._problem
@property
def before_step_hook(self) -> Hook:
"""
Use this Hook to add more behavior to the search algorithm
to be performed just before executing a step.
"""
return self._before_step_hook
@property
def after_step_hook(self) -> Hook:
"""
Use this Hook to add more behavior to the search algorithm
to be performed just after executing a step.
The dictionaries returned by the functions registered into
this Hook will be accumulated and added into the status
dictionary of the search algorithm.
"""
return self._after_step_hook
@property
def log_hook(self) -> Hook:
"""
Use this Hook to add more behavior to the search algorithm
at the moment of logging the constructed status dictionary.
This Hook is executed after the execution of `after_step_hook`
is complete.
The functions in this Hook are assumed to expect a single
argument, that is the status dictionary of the search algorithm.
"""
return self._log_hook
@property
def steps_count(self) -> int:
"""
Number of search steps performed.
This is equivalent to the number of generations, or to the
number of iterations.
"""
return self._steps_count
def step(self):
"""
Perform a step of the search algorithm.
"""
self._before_step_hook()
self.clear_status()
self._step()
self._steps_count += 1
self.update_status({"iter": self._steps_count})
self.update_status(self._problem.status)
extra_status = self._after_step_hook.accumulate_dict()
self.update_status(extra_status)
if len(self._log_hook) >= 1:
self._log_hook(dict(self.status))
def _step(self):
"""
Algorithm developers are expected to override this method
in an inheriting subclass.
The code which defines how a step of the evolutionary algorithm
is performed goes here.
"""
raise NotImplementedError
def run(self, num_generations: int):
"""
Run the algorithm for the given number of generations
(i.e. iterations).
Args:
num_generations: Number of generations.
"""
for _ in range(int(num_generations)):
self.step()
after_step_hook: Hook
property
readonly
¶
Use this Hook to add more behavior to the search algorithm to be performed just after executing a step.
The dictionaries returned by the functions registered into this Hook will be accumulated and added into the status dictionary of the search algorithm.
before_step_hook: Hook
property
readonly
¶
Use this Hook to add more behavior to the search algorithm to be performed just before executing a step.
log_hook: Hook
property
readonly
¶
Use this Hook to add more behavior to the search algorithm at the moment of logging the constructed status dictionary.
This Hook is executed after the execution of `after_step_hook` is complete.
The functions in this Hook are assumed to expect a single argument, that is the status dictionary of the search algorithm.
problem: Problem
property
readonly
¶
The problem object which is being worked on.
steps_count: int
property
readonly
¶
Number of search steps performed.
This is equivalent to the number of generations, or to the number of iterations.
__init__(self, problem, **kwargs)
special
¶
Initialize the SearchAlgorithm instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
problem | Problem | Problem to work with. | required |
kwargs | | Any additional keyword argument, in the form of `k=f`, is accepted in this manner: for each pair of `k` and `f`, `k` is accepted as the status key (i.e. a status variable name), and `f` is accepted as a function (probably a method of the inheriting class) that will generate the value of that status variable. | {} |
Source code in evotorch/algorithms/searchalgorithm.py
def __init__(self, problem: Problem, **kwargs):
"""
Initialize the SearchAlgorithm instance.
Args:
problem: Problem to work with.
kwargs: Any additional keyword argument, in the form of `k=f`,
is accepted in this manner: for each pair of `k` and `f`,
`k` is accepted as the status key (i.e. a status variable
name), and `f` is accepted as a function (probably a method
of the inheriting class) that will generate the value of that
status variable.
"""
super().__init__(**kwargs)
self._problem = problem
self._before_step_hook = Hook()
self._after_step_hook = Hook()
self._log_hook = Hook()
self._steps_count: int = 0
run(self, num_generations)
¶
Run the algorithm for the given number of generations (i.e. iterations).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_generations | int | Number of generations. | required |
step(self)
¶
Perform a step of the search algorithm.
Source code in evotorch/algorithms/searchalgorithm.py
def step(self):
"""
Perform a step of the search algorithm.
"""
self._before_step_hook()
self.clear_status()
self._step()
self._steps_count += 1
self.update_status({"iter": self._steps_count})
self.update_status(self._problem.status)
extra_status = self._after_step_hook.accumulate_dict()
self.update_status(extra_status)
if len(self._log_hook) >= 1:
self._log_hook(dict(self.status))
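To illustrate how these hooks come together, here is a hedged sketch. It assumes the SNES algorithm from `evotorch.algorithms` and a simple function-based problem; any SearchAlgorithm subclass would behave the same way, since the hooks are provided by this base class:
```python
import torch
from evotorch import Problem
from evotorch.algorithms import SNES  # any SearchAlgorithm subclass works here


def sphere(x: torch.Tensor) -> torch.Tensor:
    # A simple cost function: sum of squares of the decision values
    return torch.sum(x**2.0)


problem = Problem("min", sphere, solution_length=10, initial_bounds=(-5.0, 5.0))
searcher = SNES(problem, stdev_init=5.0)

# Runs just before each step:
searcher.before_step_hook.append(lambda: print("starting a step..."))

# Runs at logging time, receiving the status dictionary as its argument:
searcher.log_hook.append(lambda status: print("iter", status["iter"]))

searcher.run(3)  # equivalent to calling searcher.step() three times
```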
SinglePopulationAlgorithmMixin
¶
A mixin class that can be inherited by a SearchAlgorithm subclass.
This mixin class assumes that the inheriting class has the following members:
- `problem`: The problem object that is associated with the search algorithm. This attribute is already provided by the SearchAlgorithm base class.
- `population`: An attribute or a (possibly read-only) property which stores the population of the search algorithm as a `SolutionBatch` instance.
This mixin class also assumes that the inheriting class might contain an attribute (or a property) named `obj_index`. If there is such an attribute and its value is not None, then this mixin class assumes that `obj_index` represents the index of the objective that is being focused on.
Upon initialization, this mixin class first determines whether or not the algorithm is a single-objective one. In more detail, if there is an attribute named `obj_index` (and its value is not None), or if the associated problem has only one objective, then this mixin class assumes that the inheriting SearchAlgorithm is a single-objective algorithm. Otherwise, it is assumed that the underlying algorithm works (or might work) on multiple objectives.
In the single-objective case, this mixin class brings the inheriting SearchAlgorithm the ability to report the following:
- `pop_best` (best solution of the population),
- `pop_best_eval` (evaluation result of the population's best solution),
- `mean_eval` (mean evaluation result of the population),
- `median_eval` (median evaluation result of the population).
In the multi-objective case, for each objective `i`, this mixin class brings the inheriting SearchAlgorithm the ability to report the following:
- `obj<i>_pop_best` (best solution of the population according to objective `i`),
- `obj<i>_pop_best_eval` (evaluation result of the population's best solution according to objective `i`),
- `obj<i>_mean_eval` (mean evaluation result of the population according to objective `i`),
- `obj<i>_median_eval` (median evaluation result of the population according to objective `i`).
Source code in evotorch/algorithms/searchalgorithm.py
class SinglePopulationAlgorithmMixin:
"""
A mixin class that can be inherited by a SearchAlgorithm subclass.
This mixin class assumes that the inheriting class has the following
members:
- `problem`: The problem object that is associated with the search
algorithm. This attribute is already provided by the SearchAlgorithm
base class.
- `population`: An attribute or a (possibly read-only) property which
stores the population of the search algorithm as a `SolutionBatch`
instance.
This mixin class also assumes that the inheriting class _might_
contain an attribute (or a property) named `obj_index`.
If there is such an attribute and its value is not None, then this
mixin class assumes that `obj_index` represents the index of the
objective that is being focused on.
Upon initialization, this mixin class first determines whether or not
the algorithm is a single-objective one.
In more details, if there is an attribute named `obj_index` (and its
value is not None), or if the associated problem has only one objective,
then this mixin class assumes that the inheriting SearchAlgorithm is a
single objective algorithm.
Otherwise, it is assumed that the underlying algorithm works (or might
work) on multiple objectives.
In the single-objective case, this mixin class brings the inheriting
SearchAlgorithm the ability to report the following:
`pop_best` (best solution of the population),
`pop_best_eval` (evaluation result of the population's best solution),
`mean_eval` (mean evaluation result of the population),
`median_eval` (median evaluation result of the population).
In the multi-objective case, for each objective `i`, this mixin class
brings the inheriting SearchAlgorithm the ability to report the following:
`obj<i>_pop_best` (best solution of the population according to
objective `i`),
`obj<i>_pop_best_eval` (evaluation result of the population's best
solution),
`obj<i>_mean_eval` (mean evaluation result of the population),
`obj<i>_median_eval` (median evaluation result of the population).
"""
class ObjectiveStatusReporter:
REPORTABLES = {"pop_best", "pop_best_eval", "mean_eval", "median_eval"}
def __init__(
self,
algorithm: SearchAlgorithm,
*,
obj_index: int,
to_report: str,
):
self.__algorithm = algorithm
self.__obj_index = int(obj_index)
if to_report not in self.REPORTABLES:
raise ValueError(f"Unrecognized report request: {to_report}")
self.__to_report = to_report
@property
def population(self) -> SolutionBatch:
return self.__algorithm.population
@property
def obj_index(self) -> int:
return self.__obj_index
def get_status_value(self, status_key: str) -> Any:
return self.__algorithm.get_status_value(status_key)
def has_status_key(self, status_key: str) -> bool:
return self.__algorithm.has_status_key(status_key)
def _get_pop_best(self):
i = self.population.argbest(self.obj_index)
return clone(self.population[i])
def _get_pop_best_eval(self):
pop_best = None
pop_best_keys = ("pop_best", f"obj{self.obj_index}_pop_best")
for pop_best_key in pop_best_keys:
if self.has_status_key(pop_best_key):
pop_best = self.get_status_value(pop_best_key)
break
if (pop_best is not None) and pop_best.is_evaluated:
return float(pop_best.evals[self.obj_index])
else:
return None
@torch.no_grad()
def _get_mean_eval(self):
return float(torch.mean(self.population.access_evals(self.obj_index)))
@torch.no_grad()
def _get_median_eval(self):
return float(torch.median(self.population.access_evals(self.obj_index)))
def __call__(self):
return getattr(self, "_get_" + self.__to_report)()
def __init__(self, *, exclude: Optional[Iterable] = None, enable: bool = True):
if not enable:
return
ObjectiveStatusReporter = self.ObjectiveStatusReporter
reportables = ObjectiveStatusReporter.REPORTABLES
single_obj: Optional[int] = None
self.__exclude = set() if exclude is None else set(exclude)
if hasattr(self, "obj_index") and (self.obj_index is not None):
single_obj = self.obj_index
elif len(self.problem.senses) == 1:
single_obj = 0
if single_obj is not None:
for reportable in reportables:
if reportable not in self.__exclude:
self.add_status_getters(
{reportable: ObjectiveStatusReporter(self, obj_index=single_obj, to_report=reportable)}
)
else:
for i_obj in range(len(self.problem.senses)):
for reportable in reportables:
if reportable not in self.__exclude:
self.add_status_getters(
{
f"obj{i_obj}_{reportable}": ObjectiveStatusReporter(
self, obj_index=i_obj, to_report=reportable
),
}
)
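Continuing the hedged sketch from the SearchAlgorithm section above (SNES is assumed to use this mixin and to be single-objective here, so the unprefixed status keys are available):
```python
searcher.run(10)
print(searcher.status["pop_best_eval"])  # evaluation of the population's best solution
print(searcher.status["mean_eval"])  # mean evaluation of the population
print(searcher.status["median_eval"])  # median evaluation of the population
best_solution = searcher.status["pop_best"]  # a clone of the population's best solution
```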
core
¶
Definitions of the core classes: Problem, Solution, and SolutionBatch.
ActorSeeds (tuple)
¶
AllRemoteEnvs
¶
Representation of all remote reinforcement learning instances stored by the ray actors.
An instance of this class is to be obtained from a main (i.e. non-remote) Problem object, as follows:
remote_envs = my_problem.all_remote_envs()
A remote method f() on all remote environments can then be executed as follows:
results = remote_envs.f()
Given that there are `n` actors, `results` contains `n` objects, the i-th object being the method's result from the i-th actor.
The example above can equivalently be written as:
results = my_problem.all_remote_envs().f()
Source code in evotorch/core.py
class AllRemoteEnvs:
"""
Representation of all remote reinforcement learning instances
stored by the ray actors.
An instance of this class is to be obtained from a main
(i.e. non-remote) Problem object, as follows:
remote_envs = my_problem.all_remote_envs()
A remote method f() on all remote environments can then
be executed as follows:
results = remote_envs.f()
Given that there are `n` actors, `results` contains `n` objects,
the i-th object being the method's result from the i-th actor.
An alternative to the example above is like this:
results = my_problem.all_remote_envs().f()
"""
def __init__(self, actors: list):
self._actors = actors
def __getattr__(self, attr_name: str) -> Any:
return RemoteMethod(attr_name, self._actors, on_env=True)
AllRemoteProblems
¶
Representation of all remote problem instances stored by the ray actors.
An instance of this class is to be obtained from a main (i.e. non-remote) Problem object, as follows:
remote_probs = my_problem.all_remote_problems()
A remote method f() on all remote Problem instances can then be executed as follows:
results = remote_probs.f()
Given that there are `n` actors, `results` contains `n` objects, the i-th object being the method's result from the i-th actor.
The example above can equivalently be written as:
results = my_problem.all_remote_problems().f()
Source code in evotorch/core.py
class AllRemoteProblems:
"""
Representation of all remote problem instances stored by the ray actors.
An instance of this class is to be obtained from a main
(i.e. non-remote) Problem object, as follows:
remote_probs = my_problem.all_remote_problems()
A remote method f() on all remote Problem instances can then
be executed as follows:
results = remote_probs.f()
Given that there are `n` actors, `results` contains `n` objects,
the i-th object being the method's result from the i-th actor.
An alternative to the example above is like this:
results = my_problem.all_remote_problems().f()
"""
def __init__(self, actors: list):
self._actors = actors
def __getattr__(self, attr_name: str) -> Any:
return RemoteMethod(attr_name, self._actors)
BoundsPair (tuple)
¶
ParetoInfo (tuple)
¶
Problem (TensorMakerMixin)
¶
Representation of a problem to be optimized.
A problem can be defined via inheritance.
For example, let us consider a problem of minimizing the L2 norm of a vector of length 10.
This problem can be defined as follows:
import torch
from evotorch import Problem, SolutionBatch
class MinNorm(Problem):
def __init__(self):
# Main characteristics of the problem are specified
# with the help of the `__init__` method of the Problem
# class
super().__init__(
# This is a minimization problem
objective_sense="min",
# Length of a solution is 10
solution_length=10,
# Each element of a new solution is to be initialized
# between -10.0 and 10.0
initial_bounds=(-10.0, 10.0)
)
def _evaluate_batch(self, batch: SolutionBatch):
# We override _evaluate_batch(...) to define how a solution
# is evaluated.
# Get the decision values as a ReadOnlyTensor
values = batch.values
# Compute the costs of the solutions, which, in the case of
# this example, is the L2 norm
costs = torch.linalg.norm(values, dim=-1)
# Register the computed costs as the evaluation results of
# the solutions
batch.set_evals(costs)
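A hedged usage sketch for the class defined above (`generate_batch`, `evaluate`, and the SolutionBatch members used here are documented elsewhere in this reference):
```python
problem = MinNorm()

# Sample a batch of 20 random solutions within the initial bounds:
batch = problem.generate_batch(20)

# Evaluate the batch; this triggers the _evaluate_batch(...) defined above:
problem.evaluate(batch)

print(batch.evals)  # the computed L2 norms, one row per solution
print(batch.argbest())  # index of the solution with the smallest norm
```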
Source code in evotorch/core.py
class Problem(TensorMakerMixin):
"""
Representation of a problem to be optimized.
A problem can be defined via inheritance.
For example, let us consider a problem of minimizing the L2 norm
of a vector of length 10.
This problem can be defined as follows:
import torch
from evotorch import Problem, SolutionBatch
class MinNorm(Problem):
def __init__(self):
# Main characteristics of the problem are specified
# with the help of the `__init__` method of the Problem
# class
super().__init__(
# This is a minimization problem
objective_sense="min",
# Length of a solution is 10
solution_length=10,
# Each element of a new solution is to be initialized
# between -10.0 and 10.0
initial_bounds=(-10.0, 10.0)
)
def _evaluate_batch(self, batch: SolutionBatch):
# We override _evaluate_batch(...) to define how a solution
# is evaluated.
# Get the decision values as a ReadOnlyTensor
values = batch.values
# Compute the costs of the solutions, which, in the case of
# this example, is the L2 norm
costs = torch.linalg.norm(values, dim=-1)
# Register the computed costs as the evaluation results of
# the solutions
batch.set_evals(costs)
"""
def __init__(
self,
objective_sense: ObjectiveSense,
objective_func: Optional[Callable] = None,
*,
initial_bounds: Optional[BoundsPairLike] = None,
bounds: Optional[BoundsPairLike] = None,
solution_length: Optional[int] = None,
dtype: Optional[DType] = None,
eval_dtype: Optional[DType] = None,
device: Optional[Device] = None,
eval_data_length: Optional[int] = None,
seed: Optional[int] = None,
num_actors: Optional[Union[int, str]] = None,
actor_config: Optional[dict] = None,
num_gpus_per_actor: Optional[Union[int, float, str]] = None,
num_subbatches: Optional[int] = None,
subbatch_size: Optional[int] = None,
store_solution_stats: Optional[bool] = None,
vectorized: bool = False,
):
"""
`__init__(...)`: Initialize the Problem object.
Args:
objective_sense: A string, or a sequence of strings.
For a single-objective problem, a single string
("min" or "max", for minimization or maximization)
is enough.
For a problem with `n` objectives, a sequence
of strings, of length `n`, is required, each string
in the sequence being "min" or "max".
This argument specifies the goal of the optimization.
initial_bounds: The interval in which the values of a
new solution will be initialized.
Expected as a tuple, each element being either a
scalar, or a vector of length `n`, `n` being the
length of a solution.
If a manual solution initialization is preferred
(instead of an interval-based initialization),
one can leave `initial_bounds` as None, and override
the `generate_values(...)` method instead in the
inheriting subclass.
bounds: Interval in which all the solutions must always
reside.
Expected as a tuple, each element being either a
scalar, or a vector of length `n`, `n` being the
length of a solution.
This argument is optional, and can be left as None
if one does not wish to declare hard bounds on the
decision values of the problem.
If `bounds` is specified, `initial_bounds` is missing,
and `generate_values(...)` is not overridden, then
`bounds` will also serve as the `initial_bounds`.
solution_length: Length of a solution.
Required for all fixed-length numeric optimization
problems.
For variable-length problems (which might or might not
be numeric), one is expected to leave `solution_length`
as None, and declare `dtype` as `object`.
dtype: dtype (data type) of the data stored by a solution.
Can be given as a string (e.g. "float32"),
or as a numpy dtype (e.g. `numpy.dtype("float32")`),
or as a PyTorch dtype (e.g. `torch.float32`).
Alternatively, if the problem is variable-length
and/or non-numeric, one is expected to declare `dtype`
as `object`.
eval_dtype: dtype to be used for storing the evaluations
(or fitnesses, or scores, or costs, or losses)
of the solutions.
Can be given as a string (e.g. "float32"),
or as a numpy dtype (e.g. `numpy.dtype("float32")`),
or as a PyTorch dtype (e.g. `torch.float32`).
`eval_dtype` must always refer to a "float" data type,
therefore, `object` is not accepted as a valid `eval_dtype`.
If `eval_dtype` is not specified (i.e. left as None),
then the following actions are taken to determine the
`eval_dtype`:
if `dtype` is "float16", `eval_dtype` becomes "float16";
if `dtype` is "bfloat16", `eval_dtype` becomes "bfloat16";
if `dtype` is "float32", `eval_dtype` becomes "float32";
if `dtype` is "float64", `eval_dtype` becomes "float64";
and for any other `dtype`, `eval_dtype` becomes "float32".
device: Default device in which a new population will be
generated. For non-numeric problems, this must be "cpu".
For numeric problems, this can be any device supported
by PyTorch (e.g. "cuda").
eval_data_length: In addition to evaluation results
(which are (un)fitnesses, or scores, or costs, or losses),
each solution can store extra evaluation data.
If storage of such extra evaluation data is required,
one can set this argument to an integer bigger than 0.
seed: Random seed to be used by the random number generator
attached to the problem object.
If left as None, no random number generator will be
attached, and the global random number generator of
PyTorch will be used instead.
num_actors: Number of actors to create for parallelized
evaluation of the solutions.
Certain string values are also accepted.
When given as "max" or as "num_cpus", the number of actors
will be equal to the number of all available CPUs in the ray
cluster.
When given as "num_gpus", the number of actors will be
equal to the number of all available GPUs in the ray
cluster, and each actor will be assigned a GPU.
There is also an option, "num_devices", which means that
both the numbers of CPUs and GPUs will be analyzed, and
new actors and GPUs for them will be allocated,
in a one-to-one mapping manner, if possible.
In more details, with `num_actors="num_devices"`, if
`device` is given as a GPU device, then it will be inferred
that the user wishes to put everything (including the
population) on a single GPU, and therefore there won't be
any allocation of actors nor GPUs.
With `num_actors="num_devices"` and with `device` set as
"cpu" (or as left as None), if there are multiple CPUs
and multiple GPUs, then `n` actors will be allocated
where `n` is the minimum among the number of CPUs
and the number of GPUs, so that there can be one-to-one
mapping between CPUs and GPUs (i.e. such that each actor
can be assigned an entire GPU).
If `num_actors` is given as "num_gpus" or "num_devices",
the argument `num_gpus_per_actor` must not be used,
and the `actor_config` dictionary must not contain the
key "num_gpus".
If `num_actors` is given as something other than "num_gpus"
or "num_devices", and if you wish to assign GPUs to each
actor, then please see the argument `num_gpus_per_actor`.
actor_config: A dictionary, representing the keyword arguments
to be passed to the options(...) used when creating the
ray actor objects. To be used for explicitly allocating
resources per each actor.
For example, for declaring that each actor is to use a GPU,
one can pass `actor_config=dict(num_gpus=1)`.
Can also be given as None (which is the default),
if no such options are to be passed.
num_gpus_per_actor: Number of GPUs to be allocated by each
remote actor.
The default behavior is to NOT allocate any GPU at all
(which is the default behavior of the ray library as well).
When given as a number `n`, each actor will be given
`n` GPUs (where `n` can be an integer, or can be a `float`
for fractional allocation).
When given as a string "max", then the available GPUs
across the entire ray cluster (or within the local computer
in the simplest cases) will be equally distributed among
the actors.
When given as a string "all", then each actor will have
access to all the GPUs (this will be achieved by suppressing
the environment variable `CUDA_VISIBLE_DEVICES` for each
actor).
When the problem is not distributed (i.e. when there are
no actors), this argument is expected to be left as None.
num_subbatches: If `num_subbatches` is None (assuming that
`subbatch_size` is also None), then, when evaluating a
population, the population will be split into `n` pieces, `n`
being the number of actors, and each actor will evaluate
its assigned piece. If `num_subbatches` is an integer `m`,
then the population will be split into `m` pieces,
and actors will continually accept the next unevaluated
piece as they finish their current tasks.
The arguments `num_subbatches` and `subbatch_size` cannot
be given values other than None at the same time.
While using a distributed algorithm, this argument determines
how many sub-batches will be generated, and therefore,
how many gradients will be computed by the remote actors.
subbatch_size: If `subbatch_size` is None (assuming that
`num_subbatches` is also None), then, when evaluating a
population, the population will be split into `n` pieces, `n`
being the number of actors, and each actor will evaluate its
assigned piece. If `subbatch_size` is an integer `m`,
then the population will be split into pieces of size `m`,
and actors will continually accept the next unevaluated
piece as they finish their current tasks.
When there can be significant difference across the solutions
in terms of computational requirements, specifying a
`subbatch_size` can be beneficial, because, while one
actor is busy with a subbatch containing computationally
challenging solutions, other actors can accept more
tasks and save time.
The arguments `num_subbatches` and `subbatch_size` cannot
be given values other than None at the same time.
While using a distributed algorithm, this argument determines
the size of a sub-batch (or sub-population) sampled by a
remote actor for computing a gradient.
In distributed mode, it is expected that the population size
is divisible by `subbatch_size`.
store_solution_stats: Whether or not the problem object should
keep track of the best and worst solutions.
Can also be left as None (which is the default behavior),
in which case, it will store the best and worst solutions
only when the first solution batch it encounters is on the
cpu. This default behavior is to ensure that there is no
transfer between the cpu and a foreign computation device
(like the gpu) just for the sake of keeping the best and
the worst solutions.
"""
# Set the dtype for the decision variables of the Problem
if dtype is None:
self._dtype = torch.float32
elif is_dtype_object(dtype):
self._dtype = object
else:
self._dtype = to_torch_dtype(dtype)
# Set the dtype for the solution evaluations (i.e. fitnesses and evaluation data)
if eval_dtype is not None:
# If an `eval_dtype` is explicitly stated, then accept it as the `_eval_dtype` of the Problem
self._eval_dtype = to_torch_dtype(eval_dtype)
else:
# This is the case where an `eval_dtype` is not explicitly stated by the user.
# We need to choose a default.
if self._dtype in (torch.float16, torch.bfloat16, torch.float64):
# If the `dtype` of the problem is a non-32-bit float type (i.e. float16, bfloat16, float64)
# then we use that as our `_eval_dtype` as well.
self._eval_dtype = self._dtype
else:
# For any other `dtype`, we use float32 as our `_eval_dtype`.
self._eval_dtype = torch.float32
# Set the main device of the Problem object
self._device = torch.device("cpu") if device is None else torch.device(device)
# Declare the internal variable that might store the random number generator
self._generator: Optional[torch.Generator] = None
# Set the seed of the Problem object, if a seed is provided
self.manual_seed(seed)
# Declare the internal variables that will store the bounds and the solution length
self._initial_lower_bounds: Optional[torch.Tensor] = None
self._initial_upper_bounds: Optional[torch.Tensor] = None
self._lower_bounds: Optional[torch.Tensor] = None
self._upper_bounds: Optional[torch.Tensor] = None
self._solution_length: Optional[int] = None
if self._dtype is object:
# If dtype is given as `object`, then there are some runtime sanity checks to perform
if bounds is not None or initial_bounds is not None:
# With dtype as object, if bounds are given then we raise an error.
# This is because the `object` dtype implies that the decision values are not necessarily numeric,
# and therefore, we cannot have the guarantee of satisfying numeric bounds.
raise ValueError(
f"With dtype as {repr(dtype)}, expected to receive `initial_bounds` and/or `bounds` as None."
f" However, one or both of them is/are set as value(s) other than None."
)
if solution_length is not None:
# With dtype as object, if `solution_length` is provided, then we raise an error.
# This is because the `object` dtype implies that the solutions can be expressed via various
# containers, each with its own length, and therefore, a fixed solution length cannot be guaranteed.
raise ValueError(
f"With dtype as {repr(dtype)}, expected to receive `solution_length` as None."
f" However, received `solution_length` as {repr(solution_length)}."
)
if str(self._device) != "cpu":
# With dtype as object, if `device` is something other than "cpu", then we raise an error.
# This is because the `object` dtype implies that the decision values are stored by an ObjectArray,
# whose device is always "cpu".
raise ValueError(
f"With dtype as {repr(dtype)}, expected to receive `device` as 'cpu'."
f" However, received `device` as {repr(device)}."
)
else:
# If dtype is something other than `object`, then we need to properly store the given numeric bounds,
# and also perform some sanity checks.
initbnd_tuple_name = "initial_bounds"
bnd_tuple_name = "bounds"
if (bounds is None) and (initial_bounds is None):
# With a numeric dtype, if no boundary is provided at all, then we cannot know how to initialize
# the solutions. With such a lack of information, we raise an error.
raise ValueError(
f"Together with a numeric dtype ({repr(dtype)}),"
f" expected to receive `initial_bounds` and/or `bounds` as something other than None."
f" However, both `initial_bounds` and `bounds` are None."
)
elif (bounds is not None) and (initial_bounds is None):
# With a numeric dtype, if strict bounds are given but initial bounds are not given, then we assume
# that the strict bounds also serve as the initial bounds.
# Therefore, we take clones of the strict bounds and use this clones as the initial bounds.
initial_bounds = clone(bounds)
initbnd_tuple_name = "bounds"
if solution_length is None:
# With a numeric dtype, if solution length is missing, then we raise an error.
raise ValueError(
f"Together with a numeric dtype ({repr(dtype)}),"
f" expected to receive `solution_length` as an integer."
f" However, `solution_length` is None."
)
else:
# With a numeric dtype, if a solution length is provided, we make sure that it is integer.
solution_length = int(solution_length)
# Store the solution length
self._solution_length = solution_length
# Below is an internal helper function for some common operations for the (strict) bounds
# and for the initial bounds.
def process_bounds(bounds_tuple: BoundsPairLike, tuple_name: str) -> BoundsPair:
# This function receives the bounds_tuple (a tuple containing lower and upper bounds),
# and the string name of the bounds argument ("bounds" or "initial_bounds").
# What is returned is the bounds expressed as PyTorch tensors in the correct dtype and device.
nonlocal solution_length
# Extract the lower and upper bounds from the received bounds tuple.
lb, ub = bounds_tuple
# Make sure that the lower and upper bounds are expressed as tensors of correct dtype and device.
lb = self.make_tensor(lb)
ub = self.make_tensor(ub)
for bound_array in (lb, ub): # For each boundary tensor (lb and ub)
if bound_array.ndim not in (0, 1):
# If the boundary tensor is not as scalar and is not a 1-dimensional vector, then raise an
# error.
raise ValueError(
f"Lower and upper bounds are expected as scalars or as 1-dimensional vectors."
f" However, these given boundaries have incompatible shape:"
f" {bound_array} (of shape {bound_array.shape})."
)
if bound_array.ndim == 1:
if len(bound_array) != solution_length:
# In the case where the boundary tensor is a 1-dimensional vector, if this vector's length
# is not equal to the solution length, then we raise an error.
raise ValueError(
f"When boundaries are expressed as 1-dimensional vectors, their length are"
f" expected as the solution length of the Problem object."
f" However, while the problem's solution length is {solution_length},"
f" these given boundaries have incompatible length:"
f" {bound_array} (of length {len(bound_array)})."
)
# Return the processed forms of the lower and upper boundary tensors.
return lb, ub
# Process the initial bounds with the help of the internal function `process_bounds(...)`
init_lb, init_ub = process_bounds(initial_bounds, initbnd_tuple_name)
# Store the processed initial bounds
self._initial_lower_bounds = init_lb
self._initial_upper_bounds = init_ub
if bounds is not None:
# If there are strict bounds, then process those bounds with the help of `process_bounds(...)`.
lb, ub = process_bounds(bounds, bnd_tuple_name)
# Store the processed bounds
self._lower_bounds = lb
self._upper_bounds = ub
# Annotate the variable that will store the objective sense(s) of the problem
self._objective_sense: ObjectiveSense
# Below is an internal function which makes sure that a provided objective sense has a valid value
# (where valid values are "min" or "max")
def validate_sense(s: str):
if s not in ("min", "max"):
raise ValueError(
f"Invalid objective sense: {repr(s)}."
f"Instead, please provide the objective sense as 'min' or 'max'."
)
if not is_sequence(objective_sense):
# If the provided objective sense is not a sequence, then convert it to a single-element list
senses = [objective_sense]
num_senses = 1
else:
# If the provided objective sense is a sequence, then take a list copy of it
senses = list(objective_sense)
num_senses = len(objective_sense)
# Ensure that each provided objective sense is valid
for sense in senses:
validate_sense(sense)
if num_senses == 0:
# If the given sequence of objective senses is empty, then we raise an error.
raise ValueError(
"Encountered an empty sequence via `objective_sense`."
" For a single-objective problem, please set `objective_sense` as 'min' or 'max'."
" For a multi-objective problem, please set `objective_sense` as a sequence,"
" each element being 'min' or 'max'."
)
# Store the objective senses
self._senses: Iterable[str] = senses
# Store the provided objective function (which can be None)
self._objective_func: Optional[Callable] = objective_func
# Store the information which indicates whether or not the given objective function is vectorized
self._vectorized: bool = bool(vectorized)
# If the evaluation data length is explicitly stated, then convert it to an integer and store it.
# Otherwise, store the evaluation data length as 0.
self._eval_data_length = 0 if eval_data_length is None else int(eval_data_length)
# Initialize the actor index.
# If the problem is configured to be parallelized and the parallelization is triggered, then each remote
# copy will have a different integer value for `_actor_index`.
self._actor_index: Optional[int] = None
# Initialize the variable that might store the list of actors as None.
# If the problem is configured to be parallelized and the parallelization is triggered, then this variable
# will store references to the remote actors (each remote actor storing its own copy of this Problem
# instance).
self._actors: Optional[list] = None
# Initialize the variable that might store the ray ActorPool.
# If the problem is configured to be parallelized and the parallelization is triggered, then this variable
# will store the ray ActorPool that is generated out of the remote actors.
self._actor_pool: Optional[ActorPool] = None
# Store the ray actor configuration dictionary provided by the user (if any).
# When (or if) the parallelization is triggered, each actor will be created with this given configuration.
self._actor_config: Optional[dict] = None if actor_config is None else deepcopy(dict(actor_config))
# If given, store the sub-batch size or number of sub-batches.
# When the problem is parallelized, a sub-batch size determines the maximum size for a SolutionBatch
# that will be sent to a remote actor for parallel solution evaluation.
# Alternatively, num_subbatches determines into how many pieces will a SolutionBatch be split
# for parallelization.
# If both are None, then the main SolutionBatch will be split among the actors.
if (num_subbatches is not None) and (subbatch_size is not None):
raise ValueError(
f"Encountered both `num_subbatches` and `subbatch_size` as values other than None."
f" num_subbatches={num_subbatches}, subbatch_size={subbatch_size}."
f" Having both of them as values other than None cannot be accepted."
)
self._num_subbatches: Optional[int] = None if num_subbatches is None else int(num_subbatches)
self._subbatch_size: Optional[int] = None if subbatch_size is None else int(subbatch_size)
# Initialize the additional states to be loaded by the remote actor as None.
# If there are such additional states for remote actors, the inheriting class can fill this as a list
# of dictionaries.
self._remote_states: Optional[Iterable[dict]] = None
# Initialize a temporary internal variable which stores the resources available in the ray cluster.
# Most probably, we are interested in the resources "CPU" and "GPU".
ray_resources: Optional[dict] = None
# The following is an internal helper function which returns the amount of availability for a given
# resource in the ray cluster.
# If the requested resource is not available at all, None will be returned.
def get_ray_resource(resource_name: str) -> Any:
# Ensure that the ray cluster is initialized
ensure_ray()
nonlocal ray_resources
if ray_resources is None:
# If the ray resource information was not fetched, then fetch them and store them.
ray_resources = ray.available_resources()
# Return the information regarding the requested resource from the fetched resource information.
# If it turns out that the requested resource is not available at all, the result will be None.
return ray_resources.get(resource_name, None)
# Annotate the variable that will store the number of actors (to be created when the parallelization
# is triggered).
self._num_actors: int
if num_actors is None:
# If the argument `num_actors` is left as None, then we set `_num_actors` as 0, which means that
# there will be no parallelization.
self._num_actors = 0
elif isinstance(num_actors, str):
# This is the case where `num_actors` has a string value
if num_actors in ("max", "num_cpus"):
# If the `num_actors` argument was given as "max" or as "num_cpus", then we first read how many CPUs
# are available in the ray cluster, then convert it to integer (via computing its ceil value), and
# finally set `_num_actors` as this integer.
self._num_actors = math.ceil(get_ray_resource("CPU"))
elif num_actors == "num_gpus":
# If the `num_actors` argument was given as "num_gpus", then we first read how many GPUs are
# available in the ray cluster.
num_gpus = get_ray_resource("GPU")
if num_gpus is None:
# If there are no GPUs at all, then we raise an error
raise ValueError(
"The argument `num_actors` was encountered as 'num_gpus'."
" However, there does not seem to be any GPU available."
)
if num_gpus < 1e-4:
# If the number of available GPUs is 0 or close to 0, then we raise an error
raise ValueError(
f"The argument `num_actors` was encountered as 'num_gpus'."
f" However, the number of available GPUs is either 0 or close to 0 (= {num_gpus})."
)
if (actor_config is not None) and ("num_gpus" in actor_config):
# With `num_actors` argument given as "num_gpus", we will also allocate each GPU to an actor.
# If `actor_config` contains an item with key "num_gpus", then that configuration item would
# conflict with the GPU allocation we are about to do here.
# So, we raise an error.
raise ValueError(
"The argument `num_actors` was encountered as 'num_gpus'."
" With this configuration, the number of GPUs assigned to an actor is automatically determined."
" However, at the same time, the `actor_config` argument was received with the key 'num_gpus',"
" which causes a conflict."
)
if num_gpus_per_actor is not None:
# With `num_actors` argument given as "num_gpus", we will also allocate each GPU to an actor.
# If the argument `num_gpus_per_actor` is also stated, then such a configuration item would
# conflict with the GPU allocation we are about to do here.
# So, we raise an error.
raise ValueError(
f"The argument `num_actors` was encountered as 'num_gpus'."
f" With this configuration, the number of GPUs assigned to an actor is automatically determined."
f" However, at the same time, the `num_gpus_per_actor` argument was received with a value other"
f" than None ({repr(num_gpus_per_actor)}), which causes a conflict."
)
# Set the number of actors as the ceiled integer counterpart of the number of available GPUs
self._num_actors = math.ceil(num_gpus)
# We assign a GPU for each actor (by overriding the value for the argument `num_gpus_per_actor`).
num_gpus_per_actor = num_gpus / self._num_actors
elif num_actors == "num_devices":
# This is the case where `num_actors` has the string value "num_devices".
# With `num_actors` set as "num_devices", if there are any GPUs, the behavior is to assign a GPU
# to each actor. If there are conflicting configurations regarding how many GPUs are to be assigned
# to each actor, then we raise an error.
if (actor_config is not None) and ("num_gpus" in actor_config):
raise ValueError(
"The argument `num_actors` was encountered as 'num_devices'."
" With this configuration, the number of GPUs assigned to an actor is automatically determined."
" However, at the same time, the `actor_config` argument was received with the key 'num_gpus',"
" which causes a conflict."
)
if num_gpus_per_actor is not None:
raise ValueError(
f"The argument `num_actors` was encountered as 'num_devices'."
f" With this configuration, the number of GPUs assigned to an actor is automatically determined."
f" However, at the same time, the `num_gpus_per_actor` argument was received with a value other"
f" than None ({repr(num_gpus_per_actor)}), which causes a conflict."
)
if self._device != torch.device("cpu"):
# If the main device is not CPU, then the user most probably wishes to put all the
# computations (both evaluations and the population) on the GPU, without allocating
# any actor.
# So, we set `_num_actors` as None, and overwrite `num_gpus_per_actor` with None.
self._num_actors = None
num_gpus_per_actor = None
else:
# If the device argument is "cpu" or left as None, then we assume that actor allocations
# might be desired.
# Read how many CPUs and GPUs are available in the ray cluster.
num_cpus = get_ray_resource("CPU")
num_gpus = get_ray_resource("GPU")
# If we have multiple CPUs, then we continue with the actor allocation procedures.
if (num_gpus is None) or (num_gpus < 1e-4):
# If there are no GPUs, then we set the number of actors as the number of CPUs, and we
# set the number of GPUs per actor as None (which means that there will be no GPU
# assignment)
self._num_actors = math.ceil(num_cpus)
num_gpus_per_actor = None
else:
# If there are GPUs available, then we compute the minimum among the number of CPUs and
# GPUs, and this minimum value becomes the number of actors (so that there can be
# one-to-one mapping between actors and GPUs).
self._num_actors = math.ceil(min(num_cpus, num_gpus))
# We assign a GPU for each actor (by overriding the value for the argument
# `num_gpus_per_actor`).
if self._num_actors <= num_gpus:
num_gpus_per_actor = 1
else:
num_gpus_per_actor = num_gpus / self._num_actors
else:
# This is the case where `num_actors` is given as an unexpected string. We raise an error here.
raise ValueError(
f"Invalid string value for `num_actors`: {repr(num_actors)}."
f" The acceptable string values for `num_actors` are 'max', 'num_cpus', 'num_gpus', 'num_devices'."
)
else:
# This is the case where `num_actors` has a value which is not a string.
# In this case, we make sure that the given value is an integer, and then use this integer as our
# number of actors.
self._num_actors = int(num_actors)
if self._num_actors == 1:
# Creating a single actor does not bring any benefit of parallelization.
# Therefore, at the end of all the computations above regarding the number of actors, if it turns out
# that the target number of actors is 1, we reduce it to 0 (meaning that no actor will be initialized).
self._num_actors = 0
# Since we are to allocate no actor, the value of the argument `num_gpus_per_actor` is meaningless.
# We therefore overwrite the value of that argument with None.
num_gpus_per_actor = None
# Annotate the variable which will determine how many GPUs are to be assigned to each actor.
self._num_gpus_per_actor: Optional[Union[str, int, float]]
if (actor_config is not None) and ("num_gpus" in actor_config) and (num_gpus_per_actor is not None):
# If `actor_config` dictionary has the item "num_gpus" and also `num_gpus_per_actor` is not None,
# then there is a conflicting (or redundant) configuration. We raise an error here.
raise ValueError(
'The `actor_config` dictionary contains the key "num_gpus".'
" At the same time, `num_gpus_per_actor` has a value other than None."
" These two configurations are conflicting."
" Please specify the number of GPUs per actor either via the `actor_config` dictionary,"
" or via the `num_gpus_per_actor` argument, but not via both."
)
if num_gpus_per_actor is None:
# If the argument `num_gpus_per_actor` is not specified, then we set the attribute
# `_num_gpus_per_actor` as None, which means that no GPUs will be assigned to the actors.
self._num_gpus_per_actor = None
elif isinstance(num_gpus_per_actor, str):
# This is the case where `num_gpus_per_actor` is given as a string.
if num_gpus_per_actor == "max":
# This is the case where `num_gpus_per_actor` is given as "max".
num_gpus = get_ray_resource("GPU")
if num_gpus is None:
# With `num_gpus_per_actor` as "max", if there is no GPU available, then we set the attribute
# `_num_gpus_per_actor` as None, which means there will be no GPU assignment to the actors.
self._num_gpus_per_actor = None
else:
# With `num_gpus_per_actor` as "max", if there are GPUs available, then the available GPUs will
# be shared among the actors.
self._num_gpus_per_actor = num_gpus / self._num_actors
elif num_gpus_per_actor == "all":
# When `num_gpus_per_actor` is "all", we also set the attribute `_num_gpus_per_actor` as "all".
# When a remote actor is initialized, the remote actor will see that the Problem instance has its
# `_num_gpus_per_actor` set as "all", and it will remove the environment variable named
# "CUDA_VISIBLE_DEVICES" in its own environment.
# With "CUDA_VISIBLE_DEVICES" removed, an actor will see all the GPUs available in its own
# environment.
self._num_gpus_per_actor = "all"
else:
# This is the case where `num_gpus_per_actor` argument has an unexpected string value.
# We raise an error.
raise ValueError(
f"Invalid string value for `num_gpus_per_actor`: {repr(num_gpus_per_actor)}."
f' Acceptable string values for `num_gpus_per_actor` are: "max", "all".'
)
elif isinstance(num_gpus_per_actor, int):
# When the argument `num_gpus_per_actor` is set as an integer we just set the attribute
# `_num_gpus_per_actor` as this integer.
self._num_gpus_per_actor = num_gpus_per_actor
else:
# For anything else, we assume that `num_gpus_per_actor` is an object that is convertible to float.
# Therefore, we convert it to float and store it in the attribute `_num_gpus_per_actor`.
# Also, remember that, when `num_actors` is given as "num_gpus" or as "num_devices",
# the code above overrides the value for the argument `num_gpus_per_actor`, which means,
# this is the case that is activated when `num_actors` is "num_gpus" or "num_devices".
self._num_gpus_per_actor = float(num_gpus_per_actor)
# Initialize the Hook instances (and the related status dictionary for the `_after_eval_hook`)
self._before_eval_hook: Hook = Hook()
self._after_eval_hook: Hook = Hook([self._get_best_and_worst])
self._after_eval_status: dict = {}
self._remote_hook: Hook = Hook()
self._before_grad_hook: Hook = Hook()
self._after_grad_hook: Hook = Hook()
# Initialize various stats regarding the solutions encountered by this Problem instance.
self._store_solution_stats = None if store_solution_stats is None else bool(store_solution_stats)
self._best: Optional[list] = None
self._worst: Optional[list] = None
self._best_evals: Optional[torch.Tensor] = None
self._worst_evals: Optional[torch.Tensor] = None
# Initialize the boolean attribute which indicates whether or not this Problem instance (which can be
# the main instance or a remote instance on an actor) is "prepared" via the `_prepare` method.
self._prepared: bool = False
def manual_seed(self, seed: Optional[int] = None):
"""
Provide a manual seed for the Problem object.
If the given seed is None, then the Problem object will remove
its own stored generator, and start using the global generator
of PyTorch instead.
If the given seed is an integer, then the Problem object will
instantiate its own generator with the given seed.
Args:
seed: None for using the global PyTorch generator; an integer
for instantiating a new PyTorch generator with this given
integer seed, specific to this Problem object.
"""
if seed is None:
self._generator = None
else:
if self._generator is None:
self._generator = torch.Generator(device=self.device)
self._generator.manual_seed(seed)
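# Usage sketch (illustrative; `prob` is assumed to be a constructed Problem instance):
#
#     prob.manual_seed(42)    # `prob` now uses its own generator, seeded with 42
#     prob.manual_seed(None)  # `prob` falls back to PyTorch's global generator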
@property
def dtype(self) -> DType:
"""
dtype of the Problem object.
The decision variables of the optimization problem are of this dtype.
"""
return self._dtype
@property
def device(self) -> Device:
"""
device of the Problem object.
New solutions and populations will be generated in this device.
"""
return self._device
@property
def aux_device(self) -> Device:
"""
Auxiliary device to help with the computations, most commonly for
speeding up the solution evaluations.
An auxiliary device is different from the main device of the Problem
object (the main device being expressed by the `device` property).
While the main device of the Problem object determines where the
solutions and the populations are stored (and also which device a
SearchAlgorithm instance should use to communicate with the problem),
an auxiliary device is a device that might be used by the Problem
instance itself for its own computations (e.g. computations defined
within the methods `_evaluate(...)` or `_evaluate_batch(...)`).
If the problem's main device is something other than "cpu", that main
device is also seen as the auxiliary device, and therefore returned
by this property.
If the problem's main device is "cpu", then the auxiliary device
is decided as follows. If `num_gpus_per_actor` of the Problem object
was set as "all" and if this instance is a remote instance, then the
auxiliary device is guessed as "cuda:N" where N is the actor index.
In all other cases, the auxiliary device is "cuda" if cuda is
available, and "cpu" otherwise.
"""
cpu_device = torch.device("cpu")
if torch.device(self.device) == cpu_device:
if torch.cuda.is_available():
if isinstance(self._num_gpus_per_actor, str) and (self._num_gpus_per_actor == "all") and self.is_remote:
return torch.device("cuda", self.actor_index)
else:
return torch.device("cuda")
else:
return cpu_device
else:
return self.device
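# Usage sketch (illustrative): a subclass may use `aux_device` to run its
# fitness computations on a GPU while keeping the population on the main
# device. `my_fitness_fn` below is a hypothetical vectorized fitness function.
#
#     def _evaluate_batch(self, batch: "SolutionBatch"):
#         x = batch.values.to(self.aux_device)  # move decision values to the auxiliary device
#         f = my_fitness_fn(x)                  # evaluate there (hypothetical function)
#         batch.set_evals(f.to(batch.device))   # store the results back on the batch's device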
@property
def eval_dtype(self) -> DType:
"""
evaluation dtype of the Problem object.
The evaluation results of the solutions are stored according to this
dtype.
"""
return self._eval_dtype
@property
def generator(self) -> Optional[torch.Generator]:
"""
Random generator used by this Problem object.
Can also be None, which means that the Problem object will use the
global random generator of PyTorch.
"""
return self._generator
@property
def has_own_generator(self) -> bool:
"""
Whether or not the Problem object has its own random generator.
If this is True, then the Problem object will use its own
random generator when creating random values or tensors.
If this is False, then the Problem object will use the global
random generator when creating random values or tensors.
"""
return self.generator is not None
@property
def objective_sense(self) -> ObjectiveSense:
"""
Get the objective sense.
If the problem is single-objective, then a single string is returned.
If the problem is multi-objective, then the objective senses will be
returned in a list.
The returned string in the single-objective case, or each returned
string in the multi-objective case, is "min" or "max".
"""
if len(self.senses) == 1:
return self.senses[0]
else:
return self.senses
@property
def senses(self) -> Iterable[str]:
"""
Get the objective senses.
The return value is a list of strings, each string being
"min" or "max".
"""
return self._senses
@property
def is_single_objective(self) -> bool:
"""Whether or not the problem is single-objective"""
return len(self.senses) == 1
@property
def is_multi_objective(self) -> bool:
"""Whether or not the problem is multi-objective"""
return len(self.senses) > 1
def get_obj_order_descending(self) -> Iterable[bool]:
"""When sorting the solutions from best to worst according to each objective i, is the ordering descending?"""
result = []
for s in self.senses:
if s == "min":
result.append(False)
elif s == "max":
result.append(True)
else:
raise ValueError(f"Invalid sense: {repr(s)}")
return result
@property
def solution_length(self) -> Optional[int]:
"""
Get the solution length.
Problems with `dtype=None` do not have solution lengths.
For such problems, this property returns None.
"""
return self._solution_length
@property
def eval_data_length(self) -> int:
"""
Length of the extra evaluation data vector for each solution.
"""
return self._eval_data_length
@property
def initial_lower_bounds(self) -> Optional[torch.Tensor]:
"""
Initial lower bounds, for when initializing a new solution.
If such a bound was declared during the initialization phase,
the returned value is a torch tensor (in the form of a vector
or in the form of a scalar).
If no such bound was declared, the returned value is None.
"""
return self._initial_lower_bounds
@property
def initial_upper_bounds(self) -> Optional[torch.Tensor]:
"""
Initial upper bounds, for when initializing a new solution.
If such a bound was declared during the initialization phase,
the returned value is a torch tensor (in the form of a vector
or in the form of a scalar).
If no such bound was declared, the returned value is None.
"""
return self._initial_upper_bounds
@property
def lower_bounds(self) -> Optional[torch.Tensor]:
"""
Lower bounds for the allowed values of a solution.
If such a bound was declared during the initialization phase,
the returned value is a torch tensor (in the form of a vector
or in the form of a scalar).
If no such bound was declared, the returned value is None.
"""
return self._lower_bounds
@property
def upper_bounds(self) -> Optional[torch.Tensor]:
"""
Upper bounds for the allowed values of a solution.
If such a bound was declared during the initialization phase,
the returned value is a torch tensor (in the form of a vector
or in the form of a scalar).
If no such bound was declared, the returned value is None.
"""
return self._upper_bounds
def generate_values(self, num_solutions: int) -> Union[torch.Tensor, ObjectArray]:
"""
Generate decision values.
This function returns a tensor containing the decision values
for `n` new solutions, `n` being the integer passed as the
`num_solutions` argument.
For numeric problems, this function generates the decision values
which respect `initial_bounds` (or `bounds`, if `initial_bounds`
was not provided).
If this type of initialization is not desired, one can override
this function and define a manual initialization scheme in the
inheriting subclass.
For non-numeric problems, it is expected that the inheriting subclass
will override the method `_fill(...)`.
Args:
num_solutions: For how many solutions will new decision values be
generated.
Returns:
A PyTorch tensor for numeric problems, an ObjectArray for
non-numeric problems.
"""
result = self.make_empty(num_solutions=num_solutions)
self._fill(result)
return result
def _fill(self, values: Iterable):
"""
Fill the provided `values` tensor with new decision values.
Inheriting subclasses can override this method to specialize how
new solutions are generated.
For numeric problems, this method already has an implementation
which samples the initial decision values uniformly from the
interval expressed by `initial_bounds` attribute.
For non-numeric problems, overriding this method is mandatory.
Args:
values: The tensor which is to be filled with the new decision
values.
"""
if self.dtype is object:
raise NotImplementedError(
"The dtype of this problem is object, therefore a manual implementation of the"
" method `_fill(...)` needs to be provided by the inheriting class."
)
else:
return self.make_uniform(
out=values,
lb=self.initial_lower_bounds,
ub=self.initial_upper_bounds,
)
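# Override sketch (illustrative): a custom initialization scheme can be
# defined by overriding `_fill` in a subclass. For example, to draw the
# initial decision values from a standard Gaussian instead of a uniform
# distribution:
#
#     class MyProblem(Problem):  # hypothetical subclass
#         def _fill(self, values):
#             self.make_gaussian(out=values, center=0.0, stdev=1.0)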
def generate_batch(
self,
popsize: Optional[int] = None,
*,
empty: bool = False,
center: Optional[RealOrVector] = None,
stdev: Optional[RealOrVector] = None,
symmetric: bool = False,
) -> "SolutionBatch":
"""
Generate a new SolutionBatch.
Args:
popsize: Number of solutions that will be contained in the new
batch.
empty: Set this as True if you would like to receive the solutions
un-initialized.
center: Center point of the Gaussian distribution from which
the decision values will be sampled, as a scalar or as a
1-dimensional vector.
Can also be left as None.
If `center` is None and `stdev` is None, all the decision
values will be sampled from the interval specified by
`initial_bounds` (or by `bounds` if `initial_bounds` was not
specified).
If `center` is None and `stdev` is not None, a center point
will be sampled from within the interval specified by
`initial_bounds` or `bounds`, and the decision values will be
sampled from a Gaussian distribution around this center point.
stdev: Can be None (default) if the SolutionBatch is to contain
decision values sampled from the interval specified by
`initial_bounds` (or by `bounds` if `initial_bounds` was not
provided during the initialization phase).
Alternatively, a scalar or a 1-dimensional vector specifying
the standard deviation of the Gaussian distribution from which
the decision values will be sampled.
symmetric: To be used only when `stdev` is not None.
If `symmetric` is True, decision values will be sampled from
the Gaussian distribution in a symmetric (i.e. antithetic)
manner.
Otherwise, the decision values will be sampled in the
non-antithetic manner.
"""
if (center is None) and (stdev is None):
if symmetric:
raise ValueError(
f"The argument `symmetric` can be set as True only when `center` and `stdev` are provided."
f" Although `center` and `stdev` are None, `symmetric` was received as {symmetric}."
)
return SolutionBatch(self, popsize, empty=empty, device=self.device)
elif (center is not None) and (stdev is not None):
if empty:
raise ValueError(
f"When `center` and `stdev` are provided, the argument `empty` must be False."
f" However, the received value for `empty` is {empty}."
)
result = SolutionBatch(self, popsize, device=self.device, empty=True)
self.make_gaussian(out=result.access_values(), center=center, stdev=stdev, symmetric=symmetric)
return result
else:
raise ValueError(
f"The arguments `center` and `stdev` were expected to be None or non-None at the same time."
f" Received `center`: {center}."
f" Received `stdev`: {stdev}."
)
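# Usage sketch (illustrative; `prob` is assumed to be a constructed Problem):
#
#     batch = prob.generate_batch(20)  # values drawn uniformly within the initial bounds
#     batch = prob.generate_batch(20, center=0.0, stdev=0.1, symmetric=True)
#     # ^ 20 solutions drawn antithetically from a Gaussian with stdev 0.1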
def _parallelize(self):
"""Create ray actors for parallelizing the solution evaluations."""
# If the problem was explicitly configured for
# NOT having parallelization, leave this function.
if (not isinstance(self._num_actors, str)) and (self._num_actors <= 0):
return
# If this problem object is a remote one,
# leave this function
# (because we do not want the remote worker
# to parallelize itself further)
if self._actor_index is not None:
return
# If the actors list is not None, then this means
# that the initialization of the parallelization mechanism
# was already completed. So, leave this function.
if self._actors is not None:
return
# Make sure that ray is initialized
ensure_ray()
number_of_actors = self._num_actors
# numpy's RandomState uses 32-bit unsigned integers
# for random seeds.
# So, the following value is the exclusive upper bound
# for a random seed.
supremum_seed = 2**32
# Generate an integer from the main problem object's
# random_state. From this integer, further seed integers
# will be computed, and these generated seeds will be
# used by the remote actors.
base_seed = int(self.make_randint(tuple(), n=supremum_seed))
# The following function returns a seed number for the actor
# number i.
def generate_actor_seed(i):
nonlocal base_seed, supremum_seed
return (base_seed + (i + 1)) % supremum_seed
all_seeds = []
j = 0
for i in range(number_of_actors):
actor_seeds = []
for _ in range(4):
actor_seeds.append(generate_actor_seed(j))
j += 1
all_seeds.append(tuple(actor_seeds))
if self._remote_states is None:
remote_states = [{} for _ in range(number_of_actors)]
else:
remote_states = self._remote_states
# Prepare the necessary actor config
config_per_actor = {}
if self._actor_config is not None:
config_per_actor.update(self._actor_config)
if isinstance(self._num_gpus_per_actor, (int, float)):
config_per_actor["num_gpus"] = self._num_gpus_per_actor
# Generate the actors, each with a unique seed.
if not config_per_actor:
actors = [EvaluationActor.remote(self, i, all_seeds[i], remote_states[i]) for i in range(number_of_actors)]
else:
actors = [
EvaluationActor.options(**config_per_actor).remote(self, i, all_seeds[i], remote_states[i])
for i in range(number_of_actors)
]
self._actors = actors
self._actor_pool = ActorPool(self._actors)
self._remote_states = None
def all_remote_problems(self) -> AllRemoteProblems:
"""
Get an accessor which is used for running a method
on all remote clones of this Problem object.
For example, given a Problem object named `my_problem`,
also assuming that this Problem object is parallelized,
and therefore has `n` remote actors, a method `f()`
can be executed on all the remote instances as follows:
results = my_problem.all_remote_problems().f()
The variable `results` is a list of length `n`, the i-th
item of the list being the result of the method f
from the i-th actor.
Returns:
A method accessor for all the remote Problem objects.
"""
self._parallelize()
if self.is_remote:
raise RuntimeError(
"The method `all_remote_problems()` can only be used on the main (i.e. non-remote)"
" Problem instance."
" However, this Problem instance is on a remote actor."
)
return AllRemoteProblems(self._actors)
def all_remote_envs(self) -> AllRemoteEnvs:
"""
Get an accessor which is used for running a method
on all remote reinforcement learning environments.
This method can only be used on parallelized Problem
objects which have their `get_env()` methods defined.
For example, one can use this feature on a parallelized
GymProblem.
As an example, let us consider a parallelized GymProblem
object named `my_problem`. Given that `my_problem` has
`n` remote actors, a method `f()` can be executed
on all remote reinforcement learning environments as
follows:
results = my_problem.all_remote_envs().f()
The variable `results` is a list of length `n`, the i-th
item of the list being the result of the method f
from the i-th actor.
Returns:
A method accessor for all the remote reinforcement
learning environments.
"""
self._parallelize()
if self.is_remote:
raise RuntimeError(
"The method `all_remote_envs()` can only be used on the main (i.e. non-remote)"
" Problem instance."
" However, this Problem instance is on a remote actor."
)
return AllRemoteEnvs(self._actors)
def kill_actors(self):
"""
Kill all the remote actors used by the Problem instance.
One might use this method to release the resources used by the
remote actors.
"""
if not self.is_main:
raise RuntimeError(
"The method `kill_actors()` can only be used on the main (i.e. non-remote)"
" Problem instance."
" However, this Problem instance is on a remote actor."
)
for actor in self._actors:
ray.kill(actor)
self._actors = None
self._actor_pool = None
@property
def num_actors(self) -> int:
"""
Number of actors (to be) used for parallelization.
If the problem is configured for no parallelization,
the result will be 0.
"""
return self._num_actors
@property
def actors(self) -> Optional[list]:
"""
Get the ray actors, if the Problem object is distributed.
If the Problem object is not distributed and therefore
has no actors, then, the result will be None.
"""
return self._actors
@property
def actor_index(self) -> Optional[int]:
"""Return the actor index if this is a remote worker.
If this is not a remote worker, return None.
"""
return self._actor_index
@property
def is_remote(self) -> bool:
"""Returns True if this problem object lives in a remote ray actor.
Otherwise, returns False.
"""
return self._actor_index is not None
@property
def is_main(self) -> bool:
"""Returns True if this problem object lives in the main process
and not in a remote actor.
Otherwise, returns False.
"""
return self._actor_index is None
@property
def before_eval_hook(self) -> Hook:
"""
Get the Hook which stores the functions to call just before
evaluating a SolutionBatch.
The functions to be stored in this hook are expected to
accept one positional argument, that one argument being the
SolutionBatch which is about to be evaluated.
"""
return self._before_eval_hook
@property
def after_eval_hook(self) -> Hook:
"""
Get the Hook which stores the functions to call just after
evaluating a SolutionBatch.
The functions to be stored in this hook are expected to
accept one argument, that one argument being the SolutionBatch
whose evaluation has just been completed.
The dictionaries returned by the functions in this hook
are accumulated, and reported in the status dictionary of this
problem object.
"""
return self._after_eval_hook
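# Usage sketch (illustrative): a Hook behaves like a list of callables, so a
# reporting function can be appended to it. A dictionary returned by such a
# function ends up in the problem's `status` dictionary after each evaluation.
#
#     def report_pop_mean(batch: "SolutionBatch") -> dict:
#         return {"pop_mean_eval": float(batch.access_evals(0).mean())}
#
#     prob.after_eval_hook.append(report_pop_mean)  # `prob` being a Problem instance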
@property
def before_grad_hook(self) -> Hook:
"""
Get the Hook which stores the functions to call just before
its `sample_and_compute_gradients(...)` operation.
"""
return self._before_grad_hook
@property
def after_grad_hook(self) -> Hook:
"""
Get the Hook which stores the functions to call just after
its `sample_and_compute_gradients(...)` operation.
The functions to be stored in this hook are expected to
accept one argument, that one argument being the gradients
dictionary (which was produced by the Problem object,
but not yet followed by the search algorithm).
The dictionaries returned by the functions in this hook
are accumulated, and reported in the status dictionary of this
problem object.
"""
return self._after_grad_hook
@property
def remote_hook(self) -> Hook:
"""
Get the Hook which stores the functions to call when this
Problem object is (re)created on a remote actor.
The functions in this hook should expect one positional
argument, that is the Problem object itself.
"""
return self._remote_hook
def _make_sync_data_for_actors(self) -> Any:
"""
Override this function for providing synchronization between
the main process and the remote actors.
The responsibility of this function is to prepare and return the
data to be sent to the remote actors for synchronization.
If this function returns NotImplemented, then there will be no
syncing.
If this function returns None, there will be no data sent to the
actors for syncing, however, syncing will still be enabled, and
the main actor will ask for sync data from the remote actors
after their jobs are finished.
"""
return NotImplemented
def _use_sync_data_from_main(self, received: Any):
"""
Override this function for providing synchronization between
the main process and the remote actors.
The responsibility of this function is to update the state
of the remote Problem object according to the synchronization
data received by the main process.
"""
pass
def _make_sync_data_for_main(self) -> Any:
"""
Override this function for providing synchronization between
the main process and the remote actors.
The responsibility of this function is to prepare and return the
data to be sent to the main Problem object by a remote actor.
"""
return NotImplemented
def _use_sync_data_from_actors(self, received: list):
"""
Override this function for providing synchronization between
the main process and the remote actors.
The responsibility of this function is to update the state
of the main Problem object according to the synchronization
data received by the remote actors.
"""
pass
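# Override sketch (illustrative): a hypothetical stateful problem (e.g. one
# maintaining observation normalization statistics) could synchronize its
# state between the main process and the actors as follows:
#
#     class MyStatefulProblem(Problem):  # hypothetical subclass
#         def _make_sync_data_for_actors(self):
#             return self._obs_stats            # sent from the main process to each actor
#         def _use_sync_data_from_main(self, received):
#             self._obs_stats = received        # runs on each actor
#         def _make_sync_data_for_main(self):
#             return self._new_obs_stats        # sent from each actor to the main process
#         def _use_sync_data_from_actors(self, received):
#             for stats in received:            # `received` is a list, one item per actor
#                 self._obs_stats.update(stats)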
def _make_pickle_data_for_main(self) -> dict:
"""
Override this function for preserving the state of a remote
actor in the main state dictionary when pickling a parallelized
problem.
The responsibility of this function is to return the state
of a problem object which lives in a remote actor.
If the remote clones of this problem do not need to be stateful
then you probably do not need to override this method.
"""
return {}
def _use_pickle_data_from_main(self, state: dict):
"""
Override this function for re-creating the internal state of
a problem instance living in a remote actor, by using the
given state dictionary.
If the remote clones of this problem do not need to be stateful
then you probably do not need to override this method.
"""
pass
def _sync_before(self) -> bool:
if self._actors is None:
return False
to_send = self._make_sync_data_for_actors()
if to_send is NotImplemented:
return False
if to_send is not None:
ray.get([actor.call.remote("_use_sync_data_from_main", [to_send], {}) for actor in self._actors])
return True
def _sync_after(self):
if self._actors is None:
return
received = ray.get([actor.call.remote("_make_sync_data_for_main", [], {}) for actor in self._actors])
self._use_sync_data_from_actors(received)
@torch.no_grad()
def _get_best_and_worst(self, batch: "SolutionBatch") -> Optional[dict]:
if self._store_solution_stats is None:
self._store_solution_stats = str(batch.device) == "cpu"
if not self._store_solution_stats:
return {}
senses = self.senses
nobjs = len(senses)
if self._best is None:
self._best_evals = self.make_empty(nobjs, device=batch.device, use_eval_dtype=True)
self._worst_evals = self.make_empty(nobjs, device=batch.device, use_eval_dtype=True)
for i_obj in range(nobjs):
if senses[i_obj] == "min":
self._best_evals[i_obj] = float("inf")
self._worst_evals[i_obj] = float("-inf")
elif senses[i_obj] == "max":
self._best_evals[i_obj] = float("-inf")
self._worst_evals[i_obj] = float("inf")
else:
raise ValueError(f"Invalid sense: {senses[i_obj]}")
self._best = [None] * nobjs
self._worst = [None] * nobjs
def first_is_better(a, b, i_obj):
if senses[i_obj] == "min":
return a < b
elif senses[i_obj] == "max":
return a > b
else:
raise ValueError(f"Invalid sense: {senses[i_obj]}")
def first_is_worse(a, b, i_obj):
if senses[i_obj] == "min":
return a > b
elif senses[i_obj] == "max":
return a < b
else:
raise ValueError(f"Invalid sense: {senses[i_obj]}")
best_sln_indices = [batch.argbest(i) for i in range(nobjs)]
worst_sln_indices = [batch.argworst(i) for i in range(nobjs)]
for i_obj in range(nobjs):
best_sln_index = best_sln_indices[i_obj]
worst_sln_index = worst_sln_indices[i_obj]
scores = batch.access_evals(i_obj)
best_score = scores[best_sln_index]
worst_score = scores[worst_sln_index]
if first_is_better(best_score, self._best_evals[i_obj], i_obj):
self._best_evals[i_obj] = best_score
self._best[i_obj] = batch[best_sln_index].clone()
if first_is_worse(worst_score, self._worst_evals[i_obj], i_obj):
self._worst_evals[i_obj] = worst_score
self._worst[i_obj] = batch[worst_sln_index].clone()
if len(senses) == 1:
return dict(
best=self._best[0],
worst=self._worst[0],
best_eval=float(self._best[0].evaluation),
worst_eval=float(self._worst[0].evaluation),
)
else:
return {"best": self._best, "worst": self._worst}
def compare_solutions(self, a: "Solution", b: "Solution", obj_index: Optional[int] = None) -> float:
"""
Compare two solutions.
It is assumed that both solutions are already evaluated.
Args:
a: The first solution.
b: The second solution.
obj_index: The objective index according to which the comparison
will be made.
Can be left as None if the problem is single-objective.
Returns:
A positive number if `a` is better;
a negative number if `b` is better;
0 if there is a tie.
"""
senses = self.senses
obj_index = self.normalize_obj_index(obj_index)
sense = senses[obj_index]
def score(s: Solution):
return s.evals[obj_index]
if sense == "max":
return score(a) - score(b)
elif sense == "min":
return score(b) - score(a)
else:
raise ValueError("Unrecognized sense: " + repr(sense))
def is_better(self, a: "Solution", b: "Solution", obj_index: Optional[int] = None) -> bool:
"""
Check whether or not the first solution is better.
It is assumed that both solutions are already evaluated.
Args:
a: The first solution.
b: The second solution.
obj_index: The objective index according to which the comparison
will be made.
Can be left as None if the problem is single-objective.
Returns:
True if `a` is better; False otherwise.
"""
return self.compare_solutions(a, b, obj_index) > 0
def is_worse(self, a: "Solution", b: "Solution", obj_index: Optional[int] = None) -> bool:
"""
Check whether or not the first solution is worse.
It is assumed that both solutions are already evaluated.
Args:
a: The first solution.
b: The second solution.
obj_index: The objective index according to which the comparison
will be made.
Can be left as None if the problem is single-objective.
Returns:
True if `a` is worse; False otherwise.
"""
return self.compare_solutions(a, b, obj_index) < 0
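# Usage sketch (illustrative; `batch` is assumed to be an already-evaluated
# SolutionBatch of a single-objective problem `prob`):
#
#     a, b = batch[0], batch[1]
#     prob.compare_solutions(a, b)  # positive if `a` is better, negative if `b` is
#     prob.is_better(a, b)          # True if and only if `a` is strictly better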
def _prepare(self) -> None:
"""Prepare a worker instance of the problem for evaluation. To be overridden by the user"""
pass
def _prepare_main(self) -> None:
"""Prepare the main instance of the problem for evaluation."""
self._share_attributes()
def _start_preparations(self) -> None:
"""Prepare the problem for evaluation. Calls self._prepare() if the self._prepared flag is not True."""
if not self._prepared:
if self.actors is None or self._num_actors == 0:
# Call prepare method for any problem class that is expected to do work
self._prepare()
if self.is_main:
# Call share method to distribute shared attributes to actors
self._prepare_main()
self._prepared = True
@property
def _nonserialized_attribs(self) -> List[str]:
return []
def _share_attributes(self) -> None:
if (self._actors is not None) and (len(self._actors) > 0):
for attrib_name in self._shared_attribs:
obj_ref = ray.put(getattr(self, attrib_name))
for actor in self.actors:
actor.call.remote("put_ray_object", [], {"obj_ref": obj_ref, "attrib_name": attrib_name})
def put_ray_object(self, obj_ref: ray.ObjectRef, attrib_name: str) -> None:
setattr(self, attrib_name, ray.get(obj_ref))
@property
def _shared_attribs(self) -> List[str]:
return []
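# Override sketch (illustrative): declaring an attribute as shared makes the
# main process `ray.put` it into the object store once, after which each
# actor fetches it by reference instead of receiving its own pickled copy.
#
#     class DatasetProblem(Problem):  # hypothetical subclass
#         @property
#         def _shared_attribs(self):
#             return ["_dataset"]  # `_dataset` being a hypothetical large attribute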
def evaluate(self, x: Union["SolutionBatch", "Solution"]):
"""
Evaluate the given Solution or SolutionBatch.
Args:
x: The Solution or SolutionBatch to be evaluated.
"""
if isinstance(x, Solution):
batch = x.to_batch()
elif isinstance(x, SolutionBatch):
batch = x
else:
raise TypeError(
f"The method `evaluate(...)` expected a Solution or a SolutionBatch as its argument."
f" However, the received object is {repr(x)}, which is of type {repr(type(x))}."
)
self._parallelize()
if self.is_main:
self.before_eval_hook(batch)
must_sync_after = self._sync_before()
self._start_preparations()
self._evaluate_all(batch)
if must_sync_after:
self._sync_after()
if self.is_main:
self._after_eval_status = self.after_eval_hook.accumulate_dict(batch)
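# Usage sketch (illustrative; `prob` is assumed to be a constructed Problem):
#
#     batch = prob.generate_batch(10)
#     prob.evaluate(batch)
#     print(prob.status)  # accumulated results of the functions in `after_eval_hook`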
def _evaluate_all(self, batch: "SolutionBatch"):
if self._actors is None:
self._evaluate_batch(batch)
else:
if self._num_subbatches is not None:
pieces = batch.split(self._num_subbatches)
elif self._subbatch_size is not None:
pieces = batch.split(max_size=self._subbatch_size)
else:
pieces = batch.split(len(self._actors))
mapresult = self._actor_pool.map_unordered(
lambda a, v: a.evaluate_batch_piece.remote(v[0], v[1]), list(enumerate(pieces))
)
for i, evals in mapresult:
row_begin, row_end = pieces.indices_of(i)
batch._evdata[row_begin:row_end, :] = evals
def _evaluate_batch(self, batch: "SolutionBatch"):
if self._vectorized and (self._objective_func is not None):
result = self._objective_func(batch.values)
if isinstance(result, tuple):
batch.set_evals(*result)
else:
batch.set_evals(result)
else:
for sln in batch:
self._evaluate(sln)
def _evaluate(self, solution: "Solution"):
if self._objective_func is not None:
result = self._objective_func(solution.values)
if isinstance(result, tuple):
solution.set_evals(*result)
else:
solution.set_evals(result)
else:
raise NotImplementedError
@property
def stores_solution_stats(self) -> Optional[bool]:
"""
Whether or not the best and worst solutions are kept.
"""
return self._store_solution_stats
@property
def status(self) -> dict:
"""
Status dictionary of the problem object, updated after the last
evaluation operation.
The dictionaries returned by the functions in `after_eval_hook`
are accumulated, and reported in this status dictionary.
"""
return self._after_eval_status
def ensure_numeric(self):
"""
Ensure that the problem has a numeric dtype.
Raises:
ValueError: if the problem has a non-numeric dtype.
"""
if is_dtype_object(self.dtype):
raise ValueError("Expected a problem with numeric dtype, but the dtype is object.")
def ensure_unbounded(self):
"""
Ensure that the problem has no strict lower and upper bounds.
Raises:
ValueError: if the problem has strict lower and upper bounds.
"""
if not (self.lower_bounds is None and self.upper_bounds is None):
raise ValueError("Expected an unbounded problem, but this problem has lower and/or upper bounds.")
def ensure_single_objective(self):
"""
Ensure that the problem has only one objective.
Raises:
ValueError: if the problem is multi-objective.
"""
n = len(self.senses)
if n > 1:
raise ValueError(f"Expected a single-objective problem, but this problem has {n} objectives.")
def normalize_obj_index(self, obj_index: Optional[int] = None) -> int:
"""
Normalize the objective index.
If the provided index is non-negative, it is ensured that the index
is valid.
If the provided index is negative, the objectives are counted in the
reverse order, and the corresponding non-negative index is returned.
For example, -1 is converted to a non-negative integer corresponding to
the last objective.
If the provided index is None and if the problem is single-objective,
the returned value is 0, which represents the only objective.
If the provided index is None and if the problem is multi-objective,
an error is raised.
Args:
obj_index: The non-normalized objective index.
Returns:
The normalized objective index, as a non-negative integer.
"""
if obj_index is None:
if len(self.senses) == 1:
return 0
else:
raise ValueError(
"This problem is multi-objective, therefore, an explicit objective index was expected."
" However, `obj_index` was found to be None."
)
else:
obj_index = int(obj_index)
if obj_index < 0:
obj_index = len(self.senses) + obj_index
if obj_index < 0 or obj_index >= len(self.senses):
raise IndexError("Objective index out of range.")
return obj_index
def __getstate__(self):
# Collect the inner states of the remote Problem clones
if self._actors is not None:
self._remote_states = ray.get(
[actor.call.remote("_make_pickle_data_for_main", [], {}) for actor in self._actors]
)
# Prepare the main state dictionary
result = {}
for k, v in self.__dict__.items():
if k in ("_actors", "_actor_pool") or k in self._nonserialized_attribs:
result[k] = None
else:
result[k] = v
return result
def clone(self, memo: Optional[dict] = None) -> "Problem":
"""
Get a clone of the Problem object.
"""
print("Clone")
cloned = object.__new__(type(self))
cloned.__dict__.update(deepcopy(self.__getstate__(), memo))
return cloned
def _get_local_interaction_count(self) -> int:
"""
Get the number of simulator interactions this Problem encountered.
For problems focused on reinforcement learning, it is expected
that the subclass overrides this method to describe its own way
of getting the local interaction count.
When working on parallelized problems, what is returned here is
not necessarily synchronized with the other parallelized instances.
"""
raise NotImplementedError
def _get_local_episode_count(self) -> int:
"""
Get the number of episodes this Problem encountered.
For problems focused on reinforcement learning, it is expected
that the subclass overrides this method to describe its own way
of getting the local episode count.
When working on parallelized problems, what is returned here is
not necessarily synchronized with the other parallelized instances.
"""
raise NotImplementedError
def sample_and_compute_gradients(
self,
distribution,
popsize: int,
*,
num_interactions: Optional[int] = None,
popsize_max: Optional[int] = None,
obj_index: Optional[int] = None,
ranking_method: Optional[str] = None,
with_stats: bool = True,
ensure_even_popsize: bool = False,
) -> Union[list, dict]:
"""
Sample new solutions from the distribution and compute gradients.
The distribution can then be updated according to the computed
gradients.
If the problem is not parallelized, and `with_stats` is False,
then the result will be a single dictionary of gradients.
For example, in the case of a Gaussian distribution, the returned
gradients dictionary would look like this:
{
"mu": ..., # the gradient for the mean
"sigma": ..., # the gradient for the standard deviation
}
If the problem is not parallelized, and `with_stats` is True,
then the result will be a dictionary which contains in itself
the gradients dictionary, and additional elements for providing
further information. In the case of a Gaussian distribution,
the returned dictionary with additional stats would look like
this:
{
"gradients": {
"mu": ..., # the gradient for the mean
"sigma": ..., # the gradient for the standard deviation
},
"num_solutions": ..., # how many solutions were sampled
"mean_eval": ..., # Mean of all evaluations
}
If the problem is parallelized, then the gradient computation will
be distributed among the remote actors. In more detail, each actor
will sample its own solutions (such that the total population size
across all remote actors will be near the provided `popsize`)
and will compute its own gradients, and will produce its own
additional stats (if `with_stats` is given as True).
These remote results will then be collected by the main process,
and the final result of this method will be a list of dictionaries,
each dictionary being the result of a remote gradient computation.
The sampled solutions are temporary, and will not be kept
(and will not be returned).
To customize how solutions are sampled and how gradients are
computed, one is encouraged to override
`_sample_and_compute_gradients(...)` (instead of overriding this
method directly).
Args:
distribution: The search distribution from which the solutions
will be sampled, and according to which the gradients will
be computed.
popsize: The number of solutions which will be sampled.
num_interactions: Number of simulator interactions that must
be completed (more solutions will be sampled until this
threshold is reached). This argument is to be used when
the problem has characteristics similar to reinforcement
learning, and an adaptive population size, depending on
the interactions made, is desired.
Otherwise, one can leave this argument as None, in which
case, there will not be any threshold based on number
of interactions.
popsize_max: To be used when `num_interactions` is provided,
as an additional criterion for ending the solution sampling
phase. This argument can be used to prevent the population
size from growing too much while trying to satisfy the
`num_interactions`. If not needed, `popsize_max` can be left
as None.
obj_index: Index of the objective according to which the gradients
will be computed. Can be left as None if the problem has only
one objective.
ranking_method: The solution ranking method to be used when
computing the gradients.
If not specified, the raw fitnesses will be used.
with_stats: If given as False, then the results dictionary will
only contain the gradients information. If given as True,
then the results dictionary will contain within itself
the gradients dictionary, and also additional elements for
providing further information.
The default is True.
ensure_even_popsize: If `ensure_even_popsize` is True and the
problem is not parallelized, then a `popsize` given as an odd
number will cause an error. If `ensure_even_popsize` is True
and the problem is parallelized, then the remote actors will
sample their own sub-populations in such a way that their
sizes are even.
If `ensure_even_popsize` is False, whether or not the
`popsize` is even will not be checked.
When the provided `distribution` is a symmetric (i.e.
"mirrored", or "antithetic") one, this argument must be
given as True.
Returns:
A results dictionary when the problem is not parallelized,
or a list of results dictionaries when the problem is parallelized.
"""
# For problems which are configured for parallelization, make sure that the actors are created.
self._parallelize()
# Below we check if there is an inconsistency in arguments.
if (num_interactions is None) and (popsize_max is not None):
# If `num_interactions` is None, then we assume that the user does not wish an adaptive population size.
# However, at the same time, if `popsize_max` is not None, then there is an inconsistency,
# because, `popsize_max` without `num_interactions` (therefore without adaptive population size)
# does not make sense.
# This is probably a configuration error, so, we inform the user by raising an error.
raise ValueError(
f"`popsize_max` was expected as None, because `num_interactions` is None."
f" However, `popsize_max` was found as {popsize_max}."
)
# The problem instance in the main process should trigger the `before_grad_hook`.
if self.is_main:
self._before_grad_hook()
if self.is_main and (self._actors is not None) and (len(self._actors) > 0):
# If this is the main process and the problem is parallelized, then we need to split the request
# into multiple tasks, and then execute those tasks in parallel using the problem's actor pool.
if self._subbatch_size is not None:
# If `subbatch_size` is provided, then we first make sure that `popsize` is divisible by
# `subbatch_size`
if (popsize % self._subbatch_size) != 0:
raise ValueError(
f"This Problem was created with `subbatch_size` as {self._subbatch_size}."
f" When doing remote gradient computation, the requested population size must be divisible by"
f" the `subbatch_size`."
f" However, the requested population size is {popsize}, and the remainder after dividing it"
f" by `subbatch_size` is not 0 (it is {popsize % self._subbatch_size})."
)
# After making sure that `popsize` and `subbatch_size` configurations are compatible, we declare that
# we are going to have n tasks, each task imposing a sample size of `subbatch_size`.
n = int(popsize // self._subbatch_size)
popsize_per_task = [self._subbatch_size for _ in range(n)]
elif self._num_subbatches is not None:
# If `num_subbatches` is provided, then we are going to have n tasks where n is equal to the given
# `num_subbatches`.
popsize_per_task = split_workload(popsize, self._num_subbatches)
else:
# If neither `subbatch_size` nor `num_subbatches` is given, then we will split the workload in such
# a way that each actor will have its share.
popsize_per_task = split_workload(popsize, len(self._actors))
if ensure_even_popsize:
# If `ensure_even_popsize` argument is True, then we need to make sure that each tasks's popsize is
# an even number.
for i in range(len(popsize_per_task)):
if (popsize_per_task[i] % 2) != 0:
# If the i-th actor's assigned popsize is not even, increase its assigned popsize by 1.
popsize_per_task[i] += 1
# The number of tasks is finally determined by the length of `popsize_per_task` list we created above.
num_tasks = len(popsize_per_task)
if num_interactions is None:
# If the argument `num_interactions` is not given, then, for each task, we declare that
# `num_interactions` is None.
num_inter_per_task = [None for _ in range(num_tasks)]
else:
# If the argument `num_interactions` is given, then we compute each task's target number of
# interactions from its sample size.
num_inter_per_task = [
math.ceil((popsize_per_task[i] / popsize) * num_interactions) for i in range(num_tasks)
]
if popsize_max is None:
# If the argument `popsize_max` is not given, then, for each task, we declare that
# `popsize_max` is None.
popsize_max_per_task = [None for _ in range(num_tasks)]
else:
# If the argument `popsize_max` is given, then we compute each task's target maximum population size
# from its sample size.
popsize_max_per_task = [
math.ceil((popsize_per_task[i] / popsize) * popsize_max) for i in range(num_tasks)
]
# We trigger the synchronization between the main process and the remote actors.
# If this problem instance has nothing to synchronize, then `must_sync_after` will be False.
must_sync_after = self._sync_before()
# Because we want to send the distribution to remote actors, we first copy the distribution to cpu
# (unless it is already on cpu)
dist_on_cpu = distribution.to("cpu")
# Here, we use our actor pool to execute our tasks in parallel.
result = list(
self._actor_pool.map_unordered(
(
lambda a, v: a.call.remote(
"_sample_and_compute_gradients",
[dist_on_cpu, v[0]],
{
"obj_index": obj_index,
"num_interactions": v[1],
"popsize_max": v[2],
"ranking_method": ranking_method,
},
)
),
list(zip(popsize_per_task, num_inter_per_task, popsize_max_per_task)),
)
)
# At this point, all the tensors within our collected results are on the CPU.
if torch.device(self.device) != torch.device("cpu"):
# If the main device of this problem instance is not CPU, then we move the tensors to the main device.
result = cast_tensors_in_container(result, device=self.device)
if must_sync_after:
# If a post-gradient synchronization is required, we trigger the synchronization operations.
self._sync_after()
else:
# If the problem is not parallelized, then we request this instance itself to compute the gradients.
result = self._gradient_computation_helper(
distribution,
popsize,
popsize_max=popsize_max,
obj_index=obj_index,
ranking_method=ranking_method,
num_interactions=num_interactions,
with_stats=with_stats,
)
# The problem instance in the main process should trigger the `after_grad_hook`.
if self.is_main:
self._after_eval_status = self._after_grad_hook.accumulate_dict(result)
# We finally return the results
return result
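# Usage sketch (illustrative): assuming `dist` is a Gaussian-like search
# distribution (with parameters "mu" and "sigma") already matching the
# problem's dtype and device, and the problem is not parallelized:
#
#     result = prob.sample_and_compute_gradients(dist, 100)
#     grads = result["gradients"]  # e.g. grads["mu"] and grads["sigma"]
#     print(result["num_solutions"], result["mean_eval"])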
def _gradient_computation_helper(
self,
distribution,
popsize: int,
*,
num_interactions: Optional[int] = None,
popsize_max: Optional[int] = None,
obj_index: Optional[int] = None,
ranking_method: Optional[str] = None,
with_stats: bool = True,
move_results_to_device: Optional[Device] = None,
) -> dict:
# This is a helper method which makes sure that the provided distribution is in the correct dtype and device.
# This method also makes sure that the results are moved to the desired device.
# At first, we make sure that the objective index is normalized
# (for example, the objective -1 is converted to the index of the last objective).
obj_index = self.normalize_obj_index(obj_index)
if (distribution.dtype != self.dtype) or (distribution.device != self.device):
# Make sure that the distribution is in the correct dtype and device
distribution = distribution.modified_copy(dtype=self.dtype, device=self.device)
# Call the protected method responsible for sampling solutions and computing the gradients
result = self._sample_and_compute_gradients(
distribution,
popsize,
popsize_max=popsize_max,
obj_index=obj_index,
num_interactions=num_interactions,
ranking_method=ranking_method,
)
if move_results_to_device is not None:
# If `move_results_to_device` is provided, move the results to the desired device
result = cast_tensors_in_container(result, device=move_results_to_device)
# Finally, return the result
if with_stats:
return result
else:
return result["gradients"]
@property
def _grad_device(self) -> Device:
"""
Get the device in which new solutions will be made in distributed mode.
In more detail, in distributed mode, each actor creates its own
sub-populations, evaluates them, and computes its own gradient
(all such actor gradients eventually being collected by the
distribution-based search algorithm in the main process).
For some problem types, it can make sense for the remote actors to
create their temporary sub-populations on another device
(e.g. on the GPU that is allocated specifically for them).
For such situations, one is encouraged to override this property
and make it return whatever device is to be used.
Note that this property is used by the default implementation of the
method named `_sample_and_compute_gradients(...)`. If that method
is overridden, this property might not be called at all.
This default (not-yet-overridden) implementation in the Problem class
returns the main device.
"""
return self.device
def _sample_and_compute_gradients(
self,
distribution,
popsize: int,
*,
obj_index: int,
num_interactions: Optional[int] = None,
popsize_max: Optional[int] = None,
ranking_method: Optional[str] = None,
) -> dict:
"""
This method contains the description of how the solutions are sampled
and the gradients are computed according to the given distribution.
One might override this method for customizing the procedure of
sampling solutions and the gradient computation, but this method does
have a default implementation.
This returns a dictionary which contains the gradients for the given
distribution, and also further information. For example, considering
a Gaussian distribution with parameters 'mu' and 'sigma', the result
is expected to look like this:
{
"gradients": {
"mu": ..., # the gradient for the mean (tensor)
"sigma": ..., # the gradient for the std.dev. (tensor)
},
"num_solutions": ..., # how many solutions were sampled (int)
"mean_eval": ..., # Mean of all evaluations (float)
}
A customized version of this method can add more items to the outer
dictionary.
Args:
distribution: The search distribution from which the solutions
will be sampled and according to which the gradients will
be computed. This method assumes that `distribution` is
given with this problem instance's dtype, and in this problem
instance's device.
popsize: Number of solutions to sample.
obj_index: Objective index, expected as an integer.
num_interactions: Number of simulator interactions that must be
reached before computing the gradients.
Having this argument as an integer implies that adaptive
population is requested: more solutions are to be sampled
until this number of simulator interactions are made.
Can also be None if this threshold is not needed.
popsize_max: Maximum population size for when the population
size is adaptive (where the adaptiveness is enabled when
`num_interactions` is not None).
Can be left as None if a maximum population size limit
is not needed.
ranking_method: Ranking method to be used when computing the
gradients. Can be left as None, in which case the raw
fitnesses will be used.
Returns:
A dictionary which contains the gradients, number of solutions,
mean of all the evaluation results, and optionally further
items (if customized to do so).
"""
# Annotate the variable which will store the temporary SolutionBatch for computing the local gradient.
resulting_batch: SolutionBatch
# Get the device in which the new solutions will be made.
grad_device = torch.device(self._grad_device)
distribution = distribution.to(grad_device)
# Below we define an inner utility function which samples and evaluates a new SolutionBatch.
# This newly evaluated SolutionBatch is returned.
def sample_evaluated_batch() -> SolutionBatch:
batch = SolutionBatch(self, popsize, device=grad_device)
distribution.sample(out=batch.access_values(), generator=self.generator)
self.evaluate(batch)
return batch
if num_interactions is None:
# If a `num_interactions` threshold is not given (i.e. is left as None), then we assume that an adaptive
# population is not desired.
# We therefore simply sample and evaluate a single SolutionBatch, and declare it as our main batch.
resulting_batch = sample_evaluated_batch()
else:
# If we have a `num_interactions` threshold, then we might have to sample more than one SolutionBatch
# (until `num_interactions` is reached).
# We start by defining a list (`batches`) which is to store all the batches we will sample.
batches = []
# We will have to count the number of all simulator interactions that we have encountered during the
# execution of this method. So, to count it correctly, we first get the interaction count that we already
# have before sampling and evaluating our new solutions.
interaction_count_at_first = self._get_local_interaction_count()
# Below is an inner function which returns how many simulator interactions we have done so far.
# It makes use of the variable `interaction_count_at_first` defined above.
def current_num_interactions() -> int:
return self._get_local_interaction_count() - interaction_count_at_first
# We also keep track of the total number of solutions.
# We might need this if there is a `popsize_max` threshold.
current_popsize = 0
# The main loop of the adaptive sampling.
while True:
# Sample and evaluate a new SolutionBatch, and add it to our batches list.
batches.append(sample_evaluated_batch())
# Increase our total population size by the size of the most recent batch.
current_popsize += popsize
if current_num_interactions() > num_interactions:
# If the number of interactions has exceeded the `num_interactions` threshold,
# we exit the loop.
break
if (popsize_max is not None) and (current_popsize >= popsize_max):
# If we have `popsize_max` threshold and our total population size have reached or exceeded
# the `popsize_max` threshold, we exit the loop.
break
if len(batches) == 1:
# If we have only one batch in our batches list, that batch can be declared as our main batch.
resulting_batch = batches[0]
else:
# If we have multiple batches in our batches list, we concatenate all those batches and
# declare the result of the concatenation as our main batch.
resulting_batch = SolutionBatch.cat(batches)
# We take the solutions (`samples`) and the fitnesses from our main batch.
samples = resulting_batch.access_values(keep_evals=True)
fitnesses = resulting_batch.access_evals(obj_index)
# With the help of `samples` and `fitnesses`, we now compute our gradients.
grads = distribution.compute_gradients(
samples, fitnesses, objective_sense=self.senses[obj_index], ranking_method=ranking_method
)
if grad_device != torch.device(self.device):
grads = cast_tensors_in_container(grads, device=self.device)
# Finally, we return the result, which is a dictionary containing the gradients and further information.
return {
"gradients": grads,
"num_solutions": len(resulting_batch),
"mean_eval": float(torch.mean(resulting_batch.access_evals(obj_index))),
}
def __copy__(self):
return self.clone()
def __deepcopy__(self, memo: Optional[dict]):
return self.clone(memo)
def is_on_cpu(self) -> bool:
"""
Whether or not the Problem object has its device set as "cpu".
"""
return str(self.device) == "cpu"
actor_index: Optional[int]
property
readonly
¶
Return the actor index if this is a remote worker. If this is not a remote worker, return None.
actors: Optional[list]
property
readonly
¶
Get the ray actors, if the Problem object is distributed. If the Problem object is not distributed and therefore has no actors, then, the result will be None.
after_eval_hook: Hook
property
readonly
¶
Get the Hook which stores the functions to call just after evaluating a SolutionBatch.
The functions to be stored in this hook are expected to accept one argument, that one argument being the SolutionBatch whose evaluation has just been completed.
The dictionaries returned by the functions in this hook are accumulated, and reported in the status dictionary of this problem object.
after_grad_hook: Hook
property
readonly
¶
Get the Hook which stores the functions to call just after its sample_and_compute_gradients(...) operation.
The functions to be stored in this hook are expected to accept one argument, that one argument being the gradients dictionary (which was produced by the Problem object, but not yet followed by the search algorithm).
The dictionaries returned by the functions in this hook are accumulated, and reported in the status dictionary of this problem object.
aux_device: Union[str, torch.device]
property
readonly
¶
Auxiliary device to help with the computations, most commonly for speeding up the solution evaluations.
An auxiliary device is different from the main device of the Problem object (the main device being expressed by the device property).
While the main device of the Problem object determines where the solutions and the populations are stored (and also which device a SearchAlgorithm instance should use to communicate with the problem), an auxiliary device is a device that might be used by the Problem instance itself for its own computations (e.g. computations defined within the methods _evaluate(...) or _evaluate_batch(...)).
If the problem's main device is something other than "cpu", that main device is also seen as the auxiliary device, and therefore returned by this property.
If the problem's main device is "cpu", then the auxiliary device is decided as follows. If num_gpus_per_actor of the Problem object was set as "all" and if this instance is a remote instance, then the auxiliary device is guessed as "cuda:N" where N is the actor index.
In all other cases, the auxiliary device is "cuda" if cuda is available, and "cpu" otherwise.
before_eval_hook: Hook
property
readonly
¶
Get the Hook which stores the functions to call just before evaluating a SolutionBatch.
The functions to be stored in this hook are expected to accept one positional argument, that one argument being the SolutionBatch which is about to be evaluated.
before_grad_hook: Hook
property
readonly
¶
Get the Hook which stores the functions to call just before its sample_and_compute_gradients(...) operation.
device: Union[str, torch.device]
property
readonly
¶
device of the Problem object.
New solutions and populations will be generated in this device.
dtype: Union[str, torch.dtype, numpy.dtype, Type]
property
readonly
¶
dtype of the Problem object.
The decision variables of the optimization problem are of this dtype.
eval_data_length: int
property
readonly
¶
Length of the extra evaluation data vector for each solution.
eval_dtype: Union[str, torch.dtype, numpy.dtype, Type]
property
readonly
¶
evaluation dtype of the Problem object.
The evaluation results of the solutions are stored according to this dtype.
generator: Optional[torch._C.Generator]
property
readonly
¶
Random generator used by this Problem object.
Can also be None, which means that the Problem object will use the global random generator of PyTorch.
has_own_generator: bool
property
readonly
¶
Whether or not the Problem object has its own random generator.
If this is True, then the Problem object will use its own random generator when creating random values or tensors. If this is False, then the Problem object will use the global random generator when creating random values or tensors.
initial_lower_bounds: Optional[torch.Tensor]
property
readonly
¶
Initial lower bounds, for when initializing a new solution.
If such a bound was declared during the initialization phase, the returned value is a torch tensor (in the form of a vector or in the form of a scalar). If no such bound was declared, the returned value is None.
initial_upper_bounds: Optional[torch.Tensor]
property
readonly
¶
Initial upper bounds, for when initializing a new solution.
If such a bound was declared during the initialization phase, the returned value is a torch tensor (in the form of a vector or in the form of a scalar). If no such bound was declared, the returned value is None.
is_main: bool
property
readonly
¶
Returns True if this problem object lives in the main process and not in a remote actor. Otherwise, returns False.
is_multi_objective: bool
property
readonly
¶
Whether or not the problem is multi-objective
is_remote: bool
property
readonly
¶
Returns True if this problem object lives in a remote ray actor. Otherwise, returns False.
is_single_objective: bool
property
readonly
¶
Whether or not the problem is single-objective
lower_bounds: Optional[torch.Tensor]
property
readonly
¶
Lower bounds for the allowed values of a solution.
If such a bound was declared during the initialization phase, the returned value is a torch tensor (in the form of a vector or in the form of a scalar). If no such bound was declared, the returned value is None.
num_actors: int
property
readonly
¶
Number of actors (to be) used for parallelization. If the problem is configured for no parallelization, the result will be 0.
objective_sense: Union[str, Iterable[str]]
property
readonly
¶
Get the objective sense.
If the problem is single-objective, then a single string is returned. If the problem is multi-objective, then the objective senses will be returned in a list.
The returned string in the single-objective case, or each returned string in the multi-objective case, is "min" or "max".
remote_hook: Hook
property
readonly
¶
Get the Hook which stores the functions to call when this Problem object is (re)created on a remote actor.
The functions in this hook should expect one positional argument, that is the Problem object itself.
senses: Iterable[str]
property
readonly
¶
Get the objective senses.
The return value is a list of strings, each string being "min" or "max".
solution_length: Optional[int]
property
readonly
¶
Get the solution length.
Problems with dtype=object do not have solution lengths. For such problems, this property returns None.
status: dict
property
readonly
¶
Status dictionary of the problem object, updated after the last evaluation operation.
The dictionaries returned by the functions in after_eval_hook
are accumulated, and reported in this status dictionary.
stores_solution_stats: Optional[bool]
property
readonly
¶
Whether or not the best and worst solutions are kept.
upper_bounds: Optional[torch.Tensor]
property
readonly
¶
Upper bounds for the allowed values of a solution.
If such a bound was declared during the initialization phase, the returned value is a torch tensor (in the form of a vector or in the form of a scalar). If no such bound was declared, the returned value is None.
__init__(self, objective_sense, objective_func=None, *, initial_bounds=None, bounds=None, solution_length=None, dtype=None, eval_dtype=None, device=None, eval_data_length=None, seed=None, num_actors=None, actor_config=None, num_gpus_per_actor=None, num_subbatches=None, subbatch_size=None, store_solution_stats=None, vectorized=False)
special
¶
__init__(...)
: Initialize the Problem object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
objective_sense | Union[str, Iterable[str]] | A string, or a sequence of strings. For a single-objective problem, a single string ("min" or "max", for minimization or maximization) is enough. For a problem with `n` objectives, a sequence of strings, of length `n`, is required, each string in the sequence being "min" or "max". This argument specifies the goal of the optimization. | required |
initial_bounds | Union[Iterable[Union[float, Iterable[float], torch.Tensor]], evotorch.core.BoundsPair] | The interval within which the values of a new solution will be initialized. Expected as a tuple, each element being either a scalar, or a vector of length `n`, `n` being the length of a solution. If a manual solution initialization is preferred, one can leave `initial_bounds` as None and override the `generate_values(...)` method in the inheriting subclass. | None |
bounds | Union[Iterable[Union[float, Iterable[float], torch.Tensor]], evotorch.core.BoundsPair] | Interval in which all the solutions must always reside. Expected as a tuple, each element being either a scalar, or a vector of length `n`, `n` being the length of a solution. Optional; can be left as None if no hard bounds are desired. If `bounds` is specified, `initial_bounds` is missing, and `generate_values(...)` is not overridden, then `bounds` will also serve as the `initial_bounds`. | None |
solution_length | Optional[int] | Length of a solution. Required for all fixed-length numeric optimization problems. For variable-length problems (which might or might not be numeric), one is expected to leave `solution_length` as None and declare `dtype` as `object`. | None |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | dtype (data type) of the data stored by a solution. Can be given as a string (e.g. "float32"), as a numpy dtype (e.g. `numpy.dtype("float32")`), or as a PyTorch dtype (e.g. `torch.float32`). Alternatively, if the problem is variable-length and/or non-numeric, one is expected to declare `dtype` as `object`. | None |
eval_dtype | Union[str, torch.dtype, numpy.dtype, Type] | dtype to be used for storing the evaluations (or fitnesses, or scores, or costs, or losses) of the solutions. Can be given as a string, a numpy dtype, or a PyTorch dtype, and must always refer to a "float" data type (`object` is not accepted). If left as None, the dtypes "float16", "bfloat16", and "float64" are mirrored, and any other `dtype` results in an `eval_dtype` of "float32". | None |
device | Union[str, torch.device] | Default device in which a new population will be generated. For non-numeric problems, this must be "cpu". For numeric problems, this can be any device supported by PyTorch (e.g. "cuda"). | None |
eval_data_length | Optional[int] | In addition to evaluation results (which are (un)fitnesses, or scores, or costs, or losses), each solution can store extra evaluation data. If storage of such extra evaluation data is required, one can set this argument to an integer bigger than 0. | None |
seed | Optional[int] | Random seed to be used by the random number generator attached to the problem object. If left as None, no random number generator will be attached, and the global random number generator of PyTorch will be used instead. | None |
num_actors | Union[int, str] | Number of actors to create for parallelized evaluation of the solutions. Certain string values are also accepted: "max" or "num_cpus" allocates as many actors as there are CPUs in the ray cluster; "num_gpus" allocates as many actors as there are GPUs in the ray cluster, assigning a GPU to each actor; "num_devices" analyzes both the CPU and GPU counts and allocates actors with a one-to-one actor-to-GPU mapping where possible (and allocates no actors at all if `device` is a GPU device, since everything is then assumed to live on that single GPU). With "num_gpus" or "num_devices", the argument `num_gpus_per_actor` must not be used, and the `actor_config` dictionary must not contain the key "num_gpus". | None |
actor_config | Optional[dict] | A dictionary, representing the keyword arguments to be passed to the options(...) used when creating the ray actor objects. To be used for explicitly allocating resources per each actor. For example, for declaring that each actor is to use a GPU, one can pass `actor_config=dict(num_gpus=1)`. Can also be left as None if no such options are to be passed. | None |
num_gpus_per_actor | Union[int, float, str] | Number of GPUs to be allocated by each remote actor. The default behavior is to NOT allocate any GPU at all (which is the default behavior of the ray library as well). When given as a number `n`, each actor will be given `n` GPUs (where `n` can be an integer, or a float for fractional allocation). When given as "max", the available GPUs across the ray cluster will be equally distributed among the actors. When given as "all", each actor will have access to all the GPUs (achieved by suppressing the environment variable `CUDA_VISIBLE_DEVICES` for each actor). When the problem is not distributed, this argument is expected to be left as None. | None |
num_subbatches | Optional[int] | If `num_subbatches` is None (assuming that `subbatch_size` is also None), then, when evaluating a population, the population will be split into `n` pieces, `n` being the number of actors, and each actor will evaluate its assigned piece. If `num_subbatches` is an integer `m`, then the population will be split into `m` pieces, and actors will continually accept the next unevaluated piece as they finish their current tasks. Cannot be combined with a non-None `subbatch_size`. While using a distributed algorithm, this argument determines how many sub-batches (and therefore how many gradients) will be computed by the remote actors. | None |
subbatch_size | Optional[int] | If `subbatch_size` is None (assuming that `num_subbatches` is also None), then, when evaluating a population, the population will be split into `n` pieces, `n` being the number of actors. If `subbatch_size` is an integer `m`, then the population will be split into pieces of size `m`, and actors will continually accept the next unevaluated piece as they finish their current tasks. Specifying a `subbatch_size` can be beneficial when the solutions differ significantly in their computational requirements. Cannot be combined with a non-None `num_subbatches`. While using a distributed algorithm, this argument determines the size of the sub-batch sampled by a remote actor for computing a gradient; in distributed mode, the population size is expected to be divisible by `subbatch_size`. | None |
store_solution_stats | Optional[bool] | Whether or not the problem object should keep track of the best and worst solutions. Can also be left as None (which is the default behavior), in which case, it will store the best and worst solutions only when the first solution batch it encounters is on the cpu. This default behavior is to ensure that there is no transfer between the cpu and a foreign computation device (like the gpu) just for the sake of keeping the best and the worst solutions. | None |
Source code in evotorch/core.py
def __init__(
self,
objective_sense: ObjectiveSense,
objective_func: Optional[Callable] = None,
*,
initial_bounds: Optional[BoundsPairLike] = None,
bounds: Optional[BoundsPairLike] = None,
solution_length: Optional[int] = None,
dtype: Optional[DType] = None,
eval_dtype: Optional[DType] = None,
device: Optional[Device] = None,
eval_data_length: Optional[int] = None,
seed: Optional[int] = None,
num_actors: Optional[Union[int, str]] = None,
actor_config: Optional[dict] = None,
num_gpus_per_actor: Optional[Union[int, float, str]] = None,
num_subbatches: Optional[int] = None,
subbatch_size: Optional[int] = None,
store_solution_stats: Optional[bool] = None,
vectorized: bool = False,
):
"""
`__init__(...)`: Initialize the Problem object.
Args:
objective_sense: A string, or a sequence of strings.
For a single-objective problem, a single string
("min" or "max", for minimization or maximization)
is enough.
For a problem with `n` objectives, a sequence
of strings, of length `n`, is required, each string
in the sequence being "min" or "max".
This argument specifies the goal of the optimization.
initial_bounds: The interval within which the values of a
new solution will be initialized.
Expected as a tuple, each element being either a
scalar, or a vector of length `n`, `n` being the
length of a solution.
If a manual solution initialization is preferred
(instead of an interval-based initialization),
one can leave `initial_bounds` as None, and override
the `generate_values(...)` method instead in the
inheriting subclass.
bounds: Interval in which all the solutions must always
reside.
Expected as a tuple, each element being either a
scalar, or a vector of length `n`, `n` being the
length of a solution.
This argument is optional, and can be left as None
if one does not wish to declare hard bounds on the
decision values of the problem.
If `bounds` is specified, `initial_bounds` is missing,
and `generate_values(...)` is not overridden, then
`bounds` will also serve as the `initial_bounds`.
solution_length: Length of a solution.
Required for all fixed-length numeric optimization
problems.
For variable-length problems (which might or might not
be numeric), one is expected to leave `solution_length`
as None, and declare `dtype` as `object`.
dtype: dtype (data type) of the data stored by a solution.
Can be given as a string (e.g. "float32"),
or as a numpy dtype (e.g. `numpy.dtype("float32")`),
or as a PyTorch dtype (e.g. `torch.float32`).
Alternatively, if the problem is variable-length
and/or non-numeric, one is expected to declare `dtype`
as `object`.
eval_dtype: dtype to be used for storing the evaluations
(or fitnesses, or scores, or costs, or losses)
of the solutions.
Can be given as a string (e.g. "float32"),
or as a numpy dtype (e.g. `numpy.dtype("float32")`),
or as a PyTorch dtype (e.g. `torch.float32`).
`eval_dtype` must always refer to a "float" data type,
therefore, `object` is not accepted as a valid `eval_dtype`.
If `eval_dtype` is not specified (i.e. left as None),
then the following actions are taken to determine the
`eval_dtype`:
if `dtype` is "float16", `eval_dtype` becomes "float16";
if `dtype` is "bfloat16", `eval_dtype` becomes "bfloat16";
if `dtype` is "float32", `eval_dtype` becomes "float32";
if `dtype` is "float64", `eval_dtype` becomes "float64";
and for any other `dtype`, `eval_dtype` becomes "float32".
device: Default device in which a new population will be
generated. For non-numeric problems, this must be "cpu".
For numeric problems, this can be any device supported
by PyTorch (e.g. "cuda").
eval_data_length: In addition to evaluation results
(which are (un)fitnesses, or scores, or costs, or losses),
each solution can store extra evaluation data.
If storage of such extra evaluation data is required,
one can set this argument to an integer bigger than 0.
seed: Random seed to be used by the random number generator
attached to the problem object.
If left as None, no random number generator will be
attached, and the global random number generator of
PyTorch will be used instead.
num_actors: Number of actors to create for parallelized
evaluation of the solutions.
Certain string values are also accepted.
When given as "max" or as "num_cpus", the number of actors
will be equal to the number of all available CPUs in the ray
cluster.
When given as "num_gpus", the number of actors will be
equal to the number of all available GPUs in the ray
cluster, and each actor will be assigned a GPU.
There is also an option, "num_devices", which means that
both the numbers of CPUs and GPUs will be analyzed, and
new actors and GPUs for them will be allocated,
in a one-to-one mapping manner, if possible.
In more detail, with `num_actors="num_devices"`, if
`device` is given as a GPU device, then it will be inferred
that the user wishes to put everything (including the
population) on a single GPU, and therefore there won't be
any allocation of actors nor GPUs.
With `num_actors="num_devices"` and with `device` set as
"cpu" (or as left as None), if there are multiple CPUs
and multiple GPUs, then `n` actors will be allocated
where `n` is the minimum among the number of CPUs
and the number of GPUs, so that there can be one-to-one
mapping between CPUs and GPUs (i.e. such that each actor
can be assigned an entire GPU).
If `num_actors` is given as "num_gpus" or "num_devices",
the argument `num_gpus_per_actor` must not be used,
and the `actor_config` dictionary must not contain the
key "num_gpus".
If `num_actors` is given as something other than "num_gpus"
or "num_devices", and if you wish to assign GPUs to each
actor, then please see the argument `num_gpus_per_actor`.
actor_config: A dictionary, representing the keyword arguments
to be passed to the options(...) used when creating the
ray actor objects. To be used for explicitly allocating
resources per each actor.
For example, for declaring that each actor is to use a GPU,
one can pass `actor_config=dict(num_gpus=1)`.
Can also be given as None (which is the default),
if no such options are to be passed.
num_gpus_per_actor: Number of GPUs to be allocated by each
remote actor.
The default behavior is to NOT allocate any GPU at all
(which is the default behavior of the ray library as well).
When given as a number `n`, each actor will be given
`n` GPUs (where `n` can be an integer, or can be a `float`
for fractional allocation).
When given as a string "max", then the available GPUs
across the entire ray cluster (or within the local computer
in the simplest cases) will be equally distributed among
the actors.
When given as a string "all", then each actor will have
access to all the GPUs (this will be achieved by suppressing
the environment variable `CUDA_VISIBLE_DEVICES` for each
actor).
When the problem is not distributed (i.e. when there are
no actors), this argument is expected to be left as None.
num_subbatches: If `num_subbatches` is None (assuming that
`subbatch_size` is also None), then, when evaluating a
population, the population will be split into `n` pieces, `n`
being the number of actors, and each actor will evaluate
its assigned piece. If `num_subbatches` is an integer `m`,
then the population will be split into `m` pieces,
and actors will continually accept the next unevaluated
piece as they finish their current tasks.
The arguments `num_subbatches` and `subbatch_size` cannot
be given values other than None at the same time.
While using a distributed algorithm, this argument determines
how many sub-batches will be generated, and therefore,
how many gradients will be computed by the remote actors.
subbatch_size: If `subbatch_size` is None (assuming that
`num_subbatches` is also None), then, when evaluating a
population, the population will be split into `n` pieces, `n`
being the number of actors, and each actor will evaluate its
assigned piece. If `subbatch_size` is an integer `m`,
then the population will be split into pieces of size `m`,
and actors will continually accept the next unevaluated
piece as they finish their current tasks.
When there can be significant difference across the solutions
in terms of computational requirements, specifying a
`subbatch_size` can be beneficial, because, while one
actor is busy with a subbatch containing computationally
challenging solutions, other actors can accept more
tasks and save time.
The arguments `num_subbatches` and `subbatch_size` cannot
be given values other than None at the same time.
While using a distributed algorithm, this argument determines
the size of a sub-batch (or sub-population) sampled by a
remote actor for computing a gradient.
In distributed mode, it is expected that the population size
is divisible by `subbatch_size`.
store_solution_stats: Whether or not the problem object should
keep track of the best and worst solutions.
Can also be left as None (which is the default behavior),
in which case, it will store the best and worst solutions
only when the first solution batch it encounters is on the
cpu. This default behavior is to ensure that there is no
transfer between the cpu and a foreign computation device
(like the gpu) just for the sake of keeping the best and
the worst solutions.
"""
# Set the dtype for the decision variables of the Problem
if dtype is None:
self._dtype = torch.float32
elif is_dtype_object(dtype):
self._dtype = object
else:
self._dtype = to_torch_dtype(dtype)
# Set the dtype for the solution evaluations (i.e. fitnesses and evaluation data)
if eval_dtype is not None:
# If an `eval_dtype` is explicitly stated, then accept it as the `_eval_dtype` of the Problem
self._eval_dtype = to_torch_dtype(eval_dtype)
else:
# This is the case where an `eval_dtype` is not explicitly stated by the user.
# We need to choose a default.
if self._dtype in (torch.float16, torch.bfloat16, torch.float64):
# If the `dtype` of the problem is a non-32-bit float type (i.e. float16, bfloat16, float64)
# then we use that as our `_eval_dtype` as well.
self._eval_dtype = self._dtype
else:
# For any other `dtype`, we use float32 as our `_eval_dtype`.
self._eval_dtype = torch.float32
# Set the main device of the Problem object
self._device = torch.device("cpu") if device is None else torch.device(device)
# Declare the internal variable that might store the random number generator
self._generator: Optional[torch.Generator] = None
# Set the seed of the Problem object, if a seed is provided
self.manual_seed(seed)
# Declare the internal variables that will store the bounds and the solution length
self._initial_lower_bounds: Optional[torch.Tensor] = None
self._initial_upper_bounds: Optional[torch.Tensor] = None
self._lower_bounds: Optional[torch.Tensor] = None
self._upper_bounds: Optional[torch.Tensor] = None
self._solution_length: Optional[int] = None
if self._dtype is object:
# If dtype is given as `object`, then there are some runtime sanity checks to perform
if bounds is not None or initial_bounds is not None:
# With dtype as object, if bounds are given then we raise an error.
# This is because the `object` dtype implies that the decision values are not necessarily numeric,
# and therefore, we cannot have the guarantee of satisfying numeric bounds.
raise ValueError(
f"With dtype as {repr(dtype)}, expected to receive `initial_bounds` and/or `bounds` as None."
f" However, one or both of them is/are set as value(s) other than None."
)
if solution_length is not None:
# With dtype as object, if `solution_length` is provided, then we raise an error.
# This is because the `object` dtype implies that the solutions can be expressed via various
# containers, each with its own length, and therefore, a fixed solution length cannot be guaranteed.
raise ValueError(
f"With dtype as {repr(dtype)}, expected to receive `solution_length` as None."
f" However, received `solution_length` as {repr(solution_length)}."
)
if str(self._device) != "cpu":
# With dtype as object, if `device` is something other than "cpu", then we raise an error.
# This is because the `object` dtype implies that the decision values are stored by an ObjectArray,
# whose device is always "cpu".
raise ValueError(
f"With dtype as {repr(dtype)}, expected to receive `device` as 'cpu'."
f" However, received `device` as {repr(device)}."
)
else:
# If dtype is something other than `object`, then we need to properly store the given numeric bounds,
# and also perform some sanity checks.
initbnd_tuple_name = "initial_bounds"
bnd_tuple_name = "bounds"
if (bounds is None) and (initial_bounds is None):
# With a numeric dtype, if no boundary is provided at all, then we cannot know how to initialize
# the solutions. With such a lack of information, we raise an error.
raise ValueError(
f"Together with a numeric dtype ({repr(dtype)}),"
f" expected to receive `initial_bounds` and/or `bounds` as something other than None."
f" However, both `initial_bounds` and `bounds` are None."
)
elif (bounds is not None) and (initial_bounds is None):
# With a numeric dtype, if strict bounds are given but initial bounds are not given, then we assume
# that the strict bounds also serve as the initial bounds.
# Therefore, we take clones of the strict bounds and use these clones as the initial bounds.
initial_bounds = clone(bounds)
initbnd_tuple_name = "bounds"
if solution_length is None:
# With a numeric dtype, if solution length is missing, then we raise an error.
raise ValueError(
f"Together with a numeric dtype ({repr(dtype)}),"
f" expected to receive `solution_length` as an integer."
f" However, `solution_length` is None."
)
else:
# With a numeric dtype, if a solution length is provided, we make sure that it is integer.
solution_length = int(solution_length)
# Store the solution length
self._solution_length = solution_length
# Below is an internal helper function for some common operations for the (strict) bounds
# and for the initial bounds.
def process_bounds(bounds_tuple: BoundsPairLike, tuple_name: str) -> BoundsPair:
# This function receives the bounds_tuple (a tuple containing lower and upper bounds),
# and the string name of the bounds argument ("bounds" or "initial_bounds").
# What is returned is the bounds expressed as PyTorch tensors in the correct dtype and device.
nonlocal solution_length
# Extract the lower and upper bounds from the received bounds tuple.
lb, ub = bounds_tuple
# Make sure that the lower and upper bounds are expressed as tensors of correct dtype and device.
lb = self.make_tensor(lb)
ub = self.make_tensor(ub)
for bound_array in (lb, ub): # For each boundary tensor (lb and ub)
if bound_array.ndim not in (0, 1):
# If the boundary tensor is not a scalar and not a 1-dimensional vector, then raise an
# error.
raise ValueError(
f"Lower and upper bounds are expected as scalars or as 1-dimensional vectors."
f" However, these given boundaries have incompatible shape:"
f" {bound_array} (of shape {bound_array.shape})."
)
if bound_array.ndim == 1:
if len(bound_array) != solution_length:
# In the case where the boundary tensor is a 1-dimensional vector, if this vector's length
# is not equal to the solution length, then we raise an error.
raise ValueError(
f"When boundaries are expressed as 1-dimensional vectors, their length are"
f" expected as the solution length of the Problem object."
f" However, while the problem's solution length is {solution_length},"
f" these given boundaries have incompatible length:"
f" {bound_array} (of length {len(bound_array)})."
)
# Return the processed forms of the lower and upper boundary tensors.
return lb, ub
# Process the initial bounds with the help of the internal function `process_bounds(...)`
init_lb, init_ub = process_bounds(initial_bounds, initbnd_tuple_name)
# Store the processed initial bounds
self._initial_lower_bounds = init_lb
self._initial_upper_bounds = init_ub
if bounds is not None:
# If there are strict bounds, then process those bounds with the help of `process_bounds(...)`.
lb, ub = process_bounds(bounds, bnd_tuple_name)
# Store the processed bounds
self._lower_bounds = lb
self._upper_bounds = ub
# Annotate the variable that will store the objective sense(s) of the problem
self._objective_sense: ObjectiveSense
# Below is an internal function which makes sure that a provided objective sense has a valid value
# (where valid values are "min" or "max")
def validate_sense(s: str):
if s not in ("min", "max"):
raise ValueError(
f"Invalid objective sense: {repr(s)}."
f"Instead, please provide the objective sense as 'min' or 'max'."
)
if not is_sequence(objective_sense):
# If the provided objective sense is not a sequence, then convert it to a single-element list
senses = [objective_sense]
num_senses = 1
else:
# If the provided objective sense is a sequence, then take a list copy of it
senses = list(objective_sense)
num_senses = len(objective_sense)
# Ensure that each provided objective sense is valid
for sense in senses:
validate_sense(sense)
if num_senses == 0:
# If the given sequence of objective senses is empty, then we raise an error.
raise ValueError(
"Encountered an empty sequence via `objective_sense`."
" For a single-objective problem, please set `objective_sense` as 'min' or 'max'."
" For a multi-objective problem, please set `objective_sense` as a sequence,"
" each element being 'min' or 'max'."
)
# Store the objective senses
self._senses: Iterable[str] = senses
# Store the provided objective function (which can be None)
self._objective_func: Optional[Callable] = objective_func
# Store the information which indicates whether or not the given objective function is vectorized
self._vectorized: bool = bool(vectorized)
# If the evaluation data length is explicitly stated, then convert it to an integer and store it.
# Otherwise, store the evaluation data length as 0.
self._eval_data_length = 0 if eval_data_length is None else int(eval_data_length)
# Initialize the actor index.
# If the problem is configured to be parallelized and the parallelization is triggered, then each remote
# copy will have a different integer value for `_actor_index`.
self._actor_index: Optional[int] = None
# Initialize the variable that might store the list of actors as None.
# If the problem is configured to be parallelized and the parallelization is triggered, then this variable
# will store references to the remote actors (each remote actor storing its own copy of this Problem
# instance).
self._actors: Optional[list] = None
# Initialize the variable that might store the ray ActorPool.
# If the problem is configured to be parallelized and the parallelization is triggered, then this variable
# will store the ray ActorPool that is generated out of the remote actors.
self._actor_pool: Optional[ActorPool] = None
# Store the ray actor configuration dictionary provided by the user (if any).
# When (or if) the parallelization is triggered, each actor will be created with this given configuration.
self._actor_config: Optional[dict] = None if actor_config is None else deepcopy(dict(actor_config))
# If given, store the sub-batch size or number of sub-batches.
# When the problem is parallelized, a sub-batch size determines the maximum size for a SolutionBatch
# that will be sent to a remote actor for parallel solution evaluation.
# Alternatively, num_subbatches determines into how many pieces will a SolutionBatch be split
# for parallelization.
# If both are None, then the main SolutionBatch will be split among the actors.
if (num_subbatches is not None) and (subbatch_size is not None):
raise ValueError(
f"Encountered both `num_subbatches` and `subbatch_size` as values other than None."
f" num_subbatches={num_subbatches}, subbatch_size={subbatch_size}."
f" Having both of them as values other than None cannot be accepted."
)
self._num_subbatches: Optional[int] = None if num_subbatches is None else int(num_subbatches)
self._subbatch_size: Optional[int] = None if subbatch_size is None else int(subbatch_size)
# Initialize the additional states to be loaded by the remote actor as None.
# If there are such additional states for remote actors, the inheriting class can fill this as a list
# of dictionaries.
self._remote_states: Optional[Iterable[dict]] = None
# Initialize a temporary internal variable which stores the resources available in the ray cluster.
# Most probably, we are interested in the resources "CPU" and "GPU".
ray_resources: Optional[dict] = None
# The following is an internal helper function which returns the amount of availability for a given
# resource in the ray cluster.
# If the requested resource is not available at all, None will be returned.
def get_ray_resource(resource_name: str) -> Any:
# Ensure that the ray cluster is initialized
ensure_ray()
nonlocal ray_resources
if ray_resources is None:
# If the ray resource information was not fetched, then fetch them and store them.
ray_resources = ray.available_resources()
# Return the information regarding the requested resource from the fetched resource information.
# If it turns out that the requested resource is not available at all, the result will be None.
return ray_resources.get(resource_name, None)
# Annotate the variable that will store the number of actors (to be created when the parallelization
# is triggered).
self._num_actors: int
if num_actors is None:
# If the argument `num_actors` is left as None, then we set `_num_actors` as 0, which means that
# there will be no parallelization.
self._num_actors = 0
elif isinstance(num_actors, str):
# This is the case where `num_actors` has a string value
if num_actors in ("max", "num_cpus"):
# If the `num_actors` argument was given as "max" or as "num_cpus", then we first read how many CPUs
# are available in the ray cluster, then convert it to integer (via computing its ceil value), and
# finally set `_num_actors` as this integer.
self._num_actors = math.ceil(get_ray_resource("CPU"))
elif num_actors == "num_gpus":
# If the `num_actors` argument was given as "num_gpus", then we first read how many GPUs are
# available in the ray cluster.
num_gpus = get_ray_resource("GPU")
if num_gpus is None:
# If there are no GPUs at all, then we raise an error
raise ValueError(
"The argument `num_actors` was encountered as 'num_gpus'."
" However, there does not seem to be any GPU available."
)
if num_gpus < 1e-4:
# If the number of available GPUs is 0 or close to 0, then we raise an error
raise ValueError(
f"The argument `num_actors` was encountered as 'num_gpus'."
f" However, the number of available GPUs are either 0 or close to 0 (= {num_gpus})."
)
if (actor_config is not None) and ("num_gpus" in actor_config):
# With `num_actors` argument given as "num_gpus", we will also allocate each GPU to an actor.
# If `actor_config` contains an item with key "num_gpus", then that configuration item would
# conflict with the GPU allocation we are about to do here.
# So, we raise an error.
raise ValueError(
"The argument `num_actors` was encountered as 'num_gpus'."
" With this configuration, the number of GPUs assigned to an actor is automatically determined."
" However, at the same time, the `actor_config` argument was received with the key 'num_gpus',"
" which causes a conflict."
)
if num_gpus_per_actor is not None:
# With `num_actors` argument given as "num_gpus", we will also allocate each GPU to an actor.
# If the argument `num_gpus_per_actor` is also stated, then such a configuration item would
# conflict with the GPU allocation we are about to do here.
# So, we raise an error.
raise ValueError(
f"The argument `num_actors` was encountered as 'num_gpus'."
f" With this configuration, the number of GPUs assigned to an actor is automatically determined."
f" However, at the same time, the `num_gpus_per_actor` argument was received with a value other"
f" than None ({repr(num_gpus_per_actor)}), which causes a conflict."
)
# Set the number of actors as the ceiled integer counterpart of the number of available GPUs
self._num_actors = math.ceil(num_gpus)
# We assign a GPU for each actor (by overriding the value for the argument `num_gpus_per_actor`).
num_gpus_per_actor = num_gpus / self._num_actors
elif num_actors == "num_devices":
# This is the case where `num_actors` has the string value "num_devices".
# With `num_actors` set as "num_devices", if there are any GPUs, the behavior is to assign a GPU
# to each actor. If there are conflicting configurations regarding how many GPUs are to be assigned
# to each actor, then we raise an error.
if (actor_config is not None) and ("num_gpus" in actor_config):
raise ValueError(
"The argument `num_actors` was encountered as 'num_devices'."
" With this configuration, the number of GPUs assigned to an actor is automatically determined."
" However, at the same time, the `actor_config` argument was received with the key 'num_gpus',"
" which causes a conflict."
)
if num_gpus_per_actor is not None:
raise ValueError(
f"The argument `num_actors` was encountered as 'num_devices'."
f" With this configuration, the number of GPUs assigned to an actor is automatically determined."
f" However, at the same time, the `num_gpus_per_actor` argument was received with a value other"
f" than None ({repr(num_gpus_per_actor)}), which causes a conflict."
)
if self._device != torch.device("cpu"):
# If the main device is not CPU, then the user most probably wishes to put all the
# computations (both evaluations and the population) on the GPU, without allocating
# any actor.
# So, we set `_num_actors` as None, and overwrite `num_gpus_per_actor` with None.
self._num_actors = None
num_gpus_per_actor = None
else:
# If the device argument is "cpu" or left as None, then we assume that actor allocations
# might be desired.
# Read how many CPUs and GPUs are available in the ray cluster.
num_cpus = get_ray_resource("CPU")
num_gpus = get_ray_resource("GPU")
# If we have multiple CPUs, then we continue with the actor allocation procedures.
if (num_gpus is None) or (num_gpus < 1e-4):
# If there are no GPUs, then we set the number of actors as the number of CPUs, and we
# set the number of GPUs per actor as None (which means that there will be no GPU
# assignment)
self._num_actors = math.ceil(num_cpus)
num_gpus_per_actor = None
else:
# If there are GPUs available, then we compute the minimum among the number of CPUs and
# GPUs, and this minimum value becomes the number of actors (so that there can be
# one-to-one mapping between actors and GPUs).
self._num_actors = math.ceil(min(num_cpus, num_gpus))
# We assign a GPU for each actor (by overriding the value for the argument
# `num_gpus_per_actor`).
if self._num_actors <= num_gpus:
num_gpus_per_actor = 1
else:
num_gpus_per_actor = num_gpus / self._num_actors
else:
# This is the case where `num_actors` is given as an unexpected string. We raise an error here.
raise ValueError(
f"Invalid string value for `num_actors`: {repr(num_actors)}."
f" The acceptable string values for `num_actors` are 'max', 'num_cpus', 'num_gpus', 'num_devices'."
)
else:
# This is the case where `num_actors` has a value which is not a string.
# In this case, we make sure that the given value is an integer, and then use this integer as our
# number of actors.
self._num_actors = int(num_actors)
if self._num_actors == 1:
# Creating a single actor does not bring any benefit of parallelization.
# Therefore, at the end of all the computations above regarding the number of actors, if it turns out
# that the target number of actors is 1, we reduce it to 0 (meaning that no actor will be initialized).
self._num_actors = 0
# Since we are to allocate no actor, the value of the argument `num_gpus_per_actor` is meaningless.
# We therefore overwrite the value of that argument with None.
num_gpus_per_actor = None
# Annotate the variable which will determine how many GPUs are to be assigned to each actor.
self._num_gpus_per_actor: Optional[Union[str, int, float]]
if (actor_config is not None) and ("num_gpus" in actor_config) and (num_gpus_per_actor is not None):
# If `actor_config` dictionary has the item "num_gpus" and also `num_gpus_per_actor` is not None,
# then there is a conflicting (or redundant) configuration. We raise an error here.
raise ValueError(
'The `actor_config` dictionary contains the key "num_gpus".'
" At the same time, `num_gpus_per_actor` has a value other than None."
" These two configurations are conflicting."
" Please specify the number of GPUs per actor either via the `actor_config` dictionary,"
" or via the `num_gpus_per_actor` argument, but not via both."
)
if num_gpus_per_actor is None:
# If the argument `num_gpus_per_actor` is not specified, then we set the attribute
# `_num_gpus_per_actor` as None, which means that no GPUs will be assigned to the actors.
self._num_gpus_per_actor = None
elif isinstance(num_gpus_per_actor, str):
# This is the case where `num_gpus_per_actor` is given as a string.
if num_gpus_per_actor == "max":
# This is the case where `num_gpus_per_actor` is given as "max".
num_gpus = get_ray_resource("GPU")
if num_gpus is None:
# With `num_gpus_per_actor` as "max", if there is no GPU available, then we set the attribute
# `_num_gpus_per_actor` as None, which means there will be no GPU assignment to the actors.
self._num_gpus_per_actor = None
else:
# With `num_gpus_per_actor` as "max", if there are GPUs available, then the available GPUs will
# be shared among the actors.
self._num_gpus_per_actor = num_gpus / self._num_actors
elif num_gpus_per_actor == "all":
# When `num_gpus_per_actor` is "all", we also set the attribute `_num_gpus_per_actor` as "all".
# When a remote actor is initialized, the remote actor will see that the Problem instance has its
# `_num_gpus_per_actor` set as "all", and it will remove the environment variable named
# "CUDA_VISIBLE_DEVICES" in its own environment.
# With "CUDA_VISIBLE_DEVICES" removed, an actor will see all the GPUs available in its own
# environment.
self._num_gpus_per_actor = "all"
else:
# This is the case where `num_gpus_per_actor` argument has an unexpected string value.
# We raise an error.
raise ValueError(
f"Invalid string value for `num_gpus_per_actor`: {repr(num_gpus_per_actor)}."
f' Acceptable string values for `num_gpus_per_actor` are: "max", "all".'
)
elif isinstance(num_gpus_per_actor, int):
# When the argument `num_gpus_per_actor` is set as an integer we just set the attribute
# `_num_gpus_per_actor` as this integer.
self._num_gpus_per_actor = num_gpus_per_actor
else:
# For anything else, we assume that `num_gpus_per_actor` is an object that is convertible to float.
# Therefore, we convert it to float and store it in the attribute `_num_gpus_per_actor`.
# Also, remember that, when `num_actors` is given as "num_gpus" or as "num_devices",
# the code above overrides the value for the argument `num_gpus_per_actor`, which means,
# this is the case that is activated when `num_actors` is "num_gpus" or "num_devices".
self._num_gpus_per_actor = float(num_gpus_per_actor)
# Initialize the Hook instances (and the related status dictionary for the `_after_eval_hook`)
self._before_eval_hook: Hook = Hook()
self._after_eval_hook: Hook = Hook([self._get_best_and_worst])
self._after_eval_status: dict = {}
self._remote_hook: Hook = Hook()
self._before_grad_hook: Hook = Hook()
self._after_grad_hook: Hook = Hook()
# Initialize various stats regarding the solutions encountered by this Problem instance.
self._store_solution_stats = None if store_solution_stats is None else bool(store_solution_stats)
self._best: Optional[list] = None
self._worst: Optional[list] = None
self._best_evals: Optional[torch.Tensor] = None
self._worst_evals: Optional[torch.Tensor] = None
# Initialize the boolean attribute which indicates whether or not this Problem instance (which can be
# the main instance or a remote instance on an actor) is "prepared" via the `_prepare` method.
self._prepared: bool = False
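As a usage sketch tying the arguments above together (the sphere function below is merely an illustrative objective, and the chosen values are arbitrary):
import torch
from evotorch import Problem

def sphere(x: torch.Tensor) -> torch.Tensor:
    # Receives the decision values of the whole population as a 2D tensor,
    # because `vectorized=True` is declared below.
    return torch.sum(x**2, dim=-1)

problem = Problem(
    "min",                       # objective_sense
    sphere,                      # objective_func
    solution_length=10,
    initial_bounds=(-5.0, 5.0),
    vectorized=True,
    seed=42,                     # attach a problem-specific random generator
)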
all_remote_envs(self)
¶
Get an accessor which is used for running a method on all remote reinforcement learning environments.
This method can only be used on parallelized Problem objects which have their get_env() methods defined. For example, one can use this feature on a parallelized GymProblem.
As an example, let us consider a parallelized GymProblem object named my_problem. Given that my_problem has n remote actors, a method f() can be executed on all remote reinforcement learning environments as follows:
results = my_problem.all_remote_envs().f()
The variable results is a list of length n, the i-th item of the list belonging to the method f's result from the i-th actor.
Returns:
Type | Description |
---|---|
AllRemoteEnvs | A method accessor for all the remote reinforcement learning environments. |
Source code in evotorch/core.py
def all_remote_envs(self) -> AllRemoteEnvs:
"""
Get an accessor which is used for running a method
on all remote reinforcement learning environments.
This method can only be used on parallelized Problem
objects which have their `get_env()` methods defined.
For example, one can use this feature on a parallelized
GymProblem.
As an example, let us consider a parallelized GymProblem
object named `my_problem`. Given that `my_problem` has
`n` remote actors, a method `f()` can be executed
on all remote reinforcement learning environments as
follows:
results = my_problem.all_remote_envs().f()
The variable `results` is a list of length `n`, the i-th
item of the list belonging to the method f's result
from the i-th actor.
Returns:
A method accessor for all the remote reinforcement
learning environments.
"""
self._parallelize()
if self.is_remote:
raise RuntimeError(
"The method `all_remote_envs()` can only be used on the main (i.e. non-remote)"
" Problem instance."
" However, this Problem instance is on a remote actor."
)
return AllRemoteEnvs(self._actors)
all_remote_problems(self)
¶
Get an accessor which is used for running a method on all remote clones of this Problem object.
For example, given a Problem object named my_problem, also assuming that this Problem object is parallelized, and therefore has n remote actors, a method f() can be executed on all the remote instances as follows:
results = my_problem.all_remote_problems().f()
The variable results is a list of length n, the i-th item of the list belonging to the method f's result from the i-th actor.
Returns:
Type | Description |
---|---|
AllRemoteProblems | A method accessor for all the remote Problem objects. |
Source code in evotorch/core.py
def all_remote_problems(self) -> AllRemoteProblems:
"""
Get an accessor which is used for running a method
on all remote clones of this Problem object.
For example, given a Problem object named `my_problem`,
also assuming that this Problem object is parallelized,
and therefore has `n` remote actors, a method `f()`
can be executed on all the remote instances as follows:
results = my_problem.all_remote_problems().f()
The variable `results` is a list of length `n`, the i-th
item of the list belonging to the method f's result
from the i-th actor.
Returns:
A method accessor for all the remote Problem objects.
"""
self._parallelize()
if self.is_remote:
raise RuntimeError(
"The method `all_remote_problems()` can only be used on the main (i.e. non-remote)"
" Problem instance."
" However, this Problem instance is on a remote actor."
)
return AllRemoteProblems(self._actors)
clone(self, memo=None)
¶
compare_solutions(self, a, b, obj_index=None)
¶
Compare two solutions. It is assumed that both solutions are already evaluated.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
a | Solution | The first solution. | required |
b | Solution | The second solution. | required |
obj_index | Optional[int] | The objective index according to which the comparison will be made. Can be left as None if the problem is single-objective. | None |
Returns:
Type | Description |
---|---|
float | A positive number if `a` is better; a negative number if `b` is better; 0 if there is a tie. |
Source code in evotorch/core.py
def compare_solutions(self, a: "Solution", b: "Solution", obj_index: Optional[int] = None) -> float:
"""
Compare two solutions.
It is assumed that both solutions are already evaluated.
Args:
a: The first solution.
b: The second solution.
obj_index: The objective index according to which the comparison
will be made.
Can be left as None if the problem is single-objective.
Returns:
A positive number if `a` is better;
a negative number if `b` is better;
0 if there is a tie.
"""
senses = self.senses
obj_index = self.normalize_obj_index(obj_index)
sense = senses[obj_index]
def score(s: Solution):
return s.evals[obj_index]
if sense == "max":
return score(a) - score(b)
elif sense == "min":
return score(b) - score(a)
else:
raise ValueError("Unrecognized sense: " + repr(sense))
ensure_numeric(self)
¶
Ensure that the problem has a numeric dtype.
Exceptions:
Type | Description |
---|---|
ValueError | if the problem has a non-numeric dtype. |
ensure_single_objective(self)
¶
Ensure that the problem has only one objective.
Exceptions:
Type | Description |
---|---|
ValueError | if the problem is multi-objective. |
ensure_unbounded(self)
¶
Ensure that the problem has no strict lower and upper bounds.
Exceptions:
Type | Description |
---|---|
ValueError | if the problem has strict lower and upper bounds. |
Source code in evotorch/core.py
def ensure_unbounded(self):
"""
Ensure that the problem has no strict lower and upper bounds.
Raises:
ValueError: if the problem has strict lower and upper bounds.
"""
if not (self.lower_bounds is None and self.upper_bounds is None):
raise ValueError("Expected an unbounded problem, but this problem has lower and/or upper bounds.")
evaluate(self, x)
¶
Evaluate the given Solution or SolutionBatch.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Union[SolutionBatch, Solution] | The Solution or SolutionBatch to be evaluated. | required |
Source code in evotorch/core.py
def evaluate(self, x: Union["SolutionBatch", "Solution"]):
"""
Evaluate the given Solution or SolutionBatch.
Args:
x: The Solution or SolutionBatch to be evaluated.
"""
if isinstance(x, Solution):
batch = x.to_batch()
elif isinstance(x, SolutionBatch):
batch = x
else:
raise TypeError(
f"The method `evaluate(...)` expected a Solution or a SolutionBatch as its argument."
f" However, the received object is {repr(x)}, which is of type {repr(type(x))}."
)
self._parallelize()
if self.is_main:
self.before_eval_hook(batch)
must_sync_after = self._sync_before()
self._start_preparations()
self._evaluate_all(batch)
if must_sync_after:
self._sync_after()
if self.is_main:
self._after_eval_status = self.after_eval_hook.accumulate_dict(batch)
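A small sketch (problem being an assumed, already-constructed Problem); both a whole SolutionBatch and a single Solution are accepted:
batch = problem.generate_batch(popsize=20)
problem.evaluate(batch)     # evaluates all 20 solutions
problem.evaluate(batch[3])  # a single Solution is converted to a batch and evaluated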
generate_batch(self, popsize=None, *, empty=False, center=None, stdev=None, symmetric=False)
¶
Generate a new SolutionBatch.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
popsize | Optional[int] | Number of solutions that will be contained in the new batch. | None |
empty | bool | Set this as True if you would like to receive the solutions un-initialized. | False |
center | Union[float, Iterable[float], torch.Tensor] | Center point of the Gaussian distribution from which the decision values will be sampled, as a scalar or as a 1-dimensional vector. Can also be left as None. If `center` is None and `stdev` is None, all the decision values will be sampled from the interval specified by `initial_bounds` (or by `bounds` if `initial_bounds` was not specified). If `center` is None and `stdev` is not None, a center point will be sampled from within the interval specified by `initial_bounds` or `bounds`, and the decision values will be sampled from a Gaussian distribution around this center point. | None |
stdev | Union[float, Iterable[float], torch.Tensor] | Can be None (default) if the SolutionBatch is to contain decision values sampled from the interval specified by `initial_bounds` (or by `bounds` if `initial_bounds` was not provided during the initialization phase). Alternatively, a scalar or a 1-dimensional vector specifying the standard deviation of the Gaussian distribution from which the decision values will be sampled. | None |
symmetric | bool | To be used only when `stdev` is not None. If `symmetric` is True, decision values will be sampled from the Gaussian distribution in a symmetric (i.e. antithetic) manner. Otherwise, the decision values will be sampled in the non-antithetic manner. | False |
Source code in evotorch/core.py
def generate_batch(
self,
popsize: Optional[int] = None,
*,
empty: bool = False,
center: Optional[RealOrVector] = None,
stdev: Optional[RealOrVector] = None,
symmetric: bool = False,
) -> "SolutionBatch":
"""
Generate a new SolutionBatch.
Args:
popsize: Number of solutions that will be contained in the new
batch.
empty: Set this as True if you would like to receive the solutions
un-initialized.
center: Center point of the Gaussian distribution from which
the decision values will be sampled, as a scalar or as a
1-dimensional vector.
Can also be left as None.
If `center` is None and `stdev` is None, all the decision
values will be sampled from the interval specified by
`initial_bounds` (or by `bounds` if `initial_bounds` was not
specified).
If `center` is None and `stdev` is not None, a center point
will be sampled from within the interval specified by
`initial_bounds` or `bounds`, and the decision values will be
sampled from a Gaussian distribution around this center point.
stdev: Can be None (default) if the SolutionBatch is to contain
decision values sampled from the interval specified by
`initial_bounds` (or by `bounds` if `initial_bounds` was not
provided during the initialization phase).
Alternatively, a scalar or a 1-dimensional vector specifying
the standard deviation of the Gaussian distribution from which
the decision values will be sampled.
symmetric: To be used only when `stdev` is not None.
If `symmetric` is True, decision values will be sampled from
the Gaussian distribution in a symmetric (i.e. antithetic)
manner.
Otherwise, the decision values will be sampled in the
non-antithetic manner.
"""
if (center is None) and (stdev is None):
if symmetric:
raise ValueError(
f"The argument `symmetric` can be set as True only when `center` and `stdev` are provided."
f" Although `center` and `stdev` are None, `symmetric` was received as {symmetric}."
)
return SolutionBatch(self, popsize, empty=empty, device=self.device)
elif (center is not None) and (stdev is not None):
if empty:
raise ValueError(
f"When `center` and `stdev` are provided, the argument `empty` must be False."
f" However, the received value for `empty` is {empty}."
)
result = SolutionBatch(self, popsize, device=self.device, empty=True)
self.make_gaussian(out=result.access_values(), center=center, stdev=stdev, symmetric=symmetric)
return result
else:
raise ValueError(
f"The arguments `center` and `stdev` were expected to be None or non-None at the same time."
f" Received `center`: {center}."
f" Received `stdev`: {stdev}."
)
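A sketch of the two sampling modes described above (assuming a numeric problem object):
uniform_batch = problem.generate_batch(50)  # values sampled within `initial_bounds`
gaussian_batch = problem.generate_batch(
    50,
    center=0.0,      # mean of the Gaussian, a scalar broadcast over the solution
    stdev=1.0,       # standard deviation of the Gaussian
    symmetric=True,  # antithetic sampling; requires both `center` and `stdev`
)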
generate_values(self, num_solutions)
¶
Generate decision values.
This function returns a tensor containing the decision values for `n` new solutions, `n` being the integer passed as the `num_solutions` argument.
For numeric problems, this function generates the decision values which respect `initial_bounds` (or `bounds`, if `initial_bounds` was not provided). If this type of initialization is not desired, one can override this function and define a manual initialization scheme in the inheriting subclass.
For non-numeric problems, it is expected that the inheriting subclass will override the method `_fill(...)`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_solutions | int | For how many solutions will new decision values be generated. | required |
Returns:
Type | Description |
---|---|
Union[torch.Tensor, evotorch.tools.objectarray.ObjectArray] | A PyTorch tensor for numeric problems, an ObjectArray for non-numeric problems. |
Source code in evotorch/core.py
def generate_values(self, num_solutions: int) -> Union[torch.Tensor, ObjectArray]:
"""
Generate decision values.
This function returns a tensor containing the decision values
for `n` new solutions, `n` being the integer passed as the `num_solutions`
argument.
For numeric problems, this function generates the decision values
which respect `initial_bounds` (or `bounds`, if `initial_bounds`
was not provided).
If this type of initialization is not desired, one can override
this function and define a manual initialization scheme in the
inheriting subclass.
For non-numeric problems, it is expected that the inheriting subclass
will override the method `_fill(...)`.
Args:
num_solutions: For how many solutions will new decision values be
generated.
Returns:
A PyTorch tensor for numeric problems, an ObjectArray for
non-numeric problems.
"""
result = self.make_empty(num_solutions=num_solutions)
self._fill(result)
return result
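A small sketch (assuming a numeric problem object):
values = problem.generate_values(5)
print(values.shape)  # torch.Size([5, solution_length]) for a numeric problem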
get_obj_order_descending(self)
¶
When sorting the solutions from best to worst according to each objective i, is the ordering descending?
Source code in evotorch/core.py
def get_obj_order_descending(self) -> Iterable[bool]:
"""When sorting the solutions from best to worst according to each objective i, is the ordering descending?"""
result = []
for s in self.senses:
if s == "min":
result.append(False)
elif s == "max":
result.append(True)
else:
raise ValueError(f"Invalid sense: {repr(s)}")
return result
is_better(self, a, b, obj_index=None)
¶
Check whether or not the first solution is better. It is assumed that both solutions are already evaluated.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
a | Solution | The first solution. | required |
b | Solution | The second solution. | required |
obj_index | Optional[int] | The objective index according to which the comparison will be made. Can be left as None if the problem is single-objective. | None |
Returns:
Type | Description |
---|---|
bool | True if `a` is better; False otherwise. |
Source code in evotorch/core.py
def is_better(self, a: "Solution", b: "Solution", obj_index: Optional[int] = None) -> bool:
"""
Check whether or not the first solution is better.
It is assumed that both solutions are already evaluated.
Args:
a: The first solution.
b: The second solution.
obj_index: The objective index according to which the comparison
will be made.
Can be left as None if the problem is single-objective.
Returns:
True if `a` is better; False otherwise.
"""
return self.compare_solutions(a, b, obj_index) > 0
is_on_cpu(self)
¶
Whether or not the Problem object has its device set as "cpu".
is_worse(self, a, b, obj_index=None)
¶
Check whether or not the first solution is worse. It is assumed that both solutions are already evaluated.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
a | Solution | The first solution. | required |
b | Solution | The second solution. | required |
obj_index | Optional[int] | The objective index according to which the comparison will be made. Can be left as None if the problem is single-objective. | None |
Returns:
Type | Description |
---|---|
bool | True if `a` is worse; False otherwise. |
Source code in evotorch/core.py
def is_worse(self, a: "Solution", b: "Solution", obj_index: Optional[int] = None) -> bool:
"""
Check whether or not the first solution is worse.
It is assumed that both solutions are already evaluated.
Args:
a: The first solution.
b: The second solution.
obj_index: The objective index according to which the comparison
will be made.
Can be left as None if the problem is single-objective.
Returns:
True if `a` is worse; False otherwise.
"""
return self.compare_solutions(a, b, obj_index) < 0
kill_actors(self)
¶
Kill all the remote actors used by the Problem instance.
One might use this method to release the resources used by the remote actors.
Source code in evotorch/core.py
def kill_actors(self):
"""
Kill all the remote actors used by the Problem instance.
One might use this method to release the resources used by the
remote actors.
"""
if not self.is_main:
raise RuntimeError(
"The method `kill_actors()` can only be used on the main (i.e. non-remote)"
" Problem instance."
" However, this Problem instance is on a remote actor."
)
for actor in self._actors:
ray.kill(actor)
self._actors = None
self._actor_pool = None
manual_seed(self, seed=None)
¶
Provide a manual seed for the Problem object.
If the given seed is None, then the Problem object will remove its own stored generator, and start using the global generator of PyTorch instead. If the given seed is an integer, then the Problem object will instantiate its own generator with the given seed.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
seed | Optional[int] | None for using the global PyTorch generator; an integer for instantiating a new PyTorch generator with this given integer seed, specific to this Problem object. | None |
Source code in evotorch/core.py
def manual_seed(self, seed: Optional[int] = None):
"""
Provide a manual seed for the Problem object.
If the given seed is None, then the Problem object will remove
its own stored generator, and start using the global generator
of PyTorch instead.
If the given seed is an integer, then the Problem object will
instantiate its own generator with the given seed.
Args:
seed: None for using the global PyTorch generator; an integer
for instantiating a new PyTorch generator with this given
integer seed, specific to this Problem object.
"""
if seed is None:
self._generator = None
else:
if self._generator is None:
self._generator = torch.Generator(device=self.device)
self._generator.manual_seed(seed)
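A small sketch:
problem.manual_seed(123)   # attach a generator specific to this Problem object
problem.manual_seed(None)  # detach it and fall back to PyTorch's global generator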
normalize_obj_index(self, obj_index=None)
¶
Normalize the objective index.
If the provided index is non-negative, it is ensured that the index is valid.
If the provided index is negative, the objectives are counted in the reverse order, and the corresponding non-negative index is returned. For example, -1 is converted to a non-negative integer corresponding to the last objective.
If the provided index is None and if the problem is single-objective, the returned value is 0, which represents the only objective.
If the provided index is None and if the problem is multi-objective, an error is raised.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
obj_index |
Optional[int] |
The non-normalized objective index. |
None |
Returns:
Type | Description |
---|---|
int |
The normalized objective index, as a non-negative integer. |
Source code in evotorch/core.py
def normalize_obj_index(self, obj_index: Optional[int] = None) -> int:
"""
Normalize the objective index.
If the provided index is non-negative, it is ensured that the index
is valid.
If the provided index is negative, the objectives are counted in the
reverse order, and the corresponding non-negative index is returned.
For example, -1 is converted to a non-negative integer corresponding to
the last objective.
If the provided index is None and if the problem is single-objective,
the returned value is 0, which represents the only objective.
If the provided index is None and if the problem is multi-objective,
an error is raised.
Args:
obj_index: The non-normalized objective index.
Returns:
The normalized objective index, as a non-negative integer.
"""
if obj_index is None:
if len(self.senses) == 1:
return 0
else:
raise ValueError(
"This problem is multi-objective, therefore, an explicit objective index was expected."
" However, `obj_index` was found to be None."
)
else:
obj_index = int(obj_index)
if obj_index < 0:
obj_index = len(self.senses) + obj_index
if obj_index < 0 or obj_index >= len(self.senses):
raise IndexError("Objective index out of range.")
return obj_index
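For instance (continuing the single-objective `problem` sketch above):
# For a single-objective problem, None and -1 both normalize to 0:
assert problem.normalize_obj_index(None) == 0
assert problem.normalize_obj_index(-1) == 0
# For a two-objective problem, -1 would normalize to 1, and passing
# None would raise a ValueError instead.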
sample_and_compute_gradients(self, distribution, popsize, *, num_interactions=None, popsize_max=None, obj_index=None, ranking_method=None, with_stats=True, ensure_even_popsize=False)
¶
Sample new solutions from the distribution and compute gradients.
The distribution can then be updated according to the computed gradients.
If the problem is not parallelized, and `with_stats` is False, then the result will be a single dictionary of gradients. For example, in the case of a Gaussian distribution, the returned gradients dictionary would look like this:
{
    "mu": ...,     # the gradient for the mean
    "sigma": ...,  # the gradient for the standard deviation
}
If the problem is not parallelized, and `with_stats` is True, then the result will be a dictionary which contains in itself the gradients dictionary, and additional elements for providing further information. In the case of a Gaussian distribution, the returned dictionary with additional stats would look like this:
{
    "gradients": {
        "mu": ...,     # the gradient for the mean
        "sigma": ...,  # the gradient for the standard deviation
    },
    "num_solutions": ...,  # how many solutions were sampled
    "mean_eval": ...,      # mean of all evaluations
}
If the problem is parallelized, then the gradient computation will be distributed among the remote actors. In more detail, each actor will sample its own solutions (such that the total population size across all remote actors will be near the provided `popsize`), will compute its own gradients, and will produce its own additional stats (if `with_stats` is given as True). These remote results will then be collected by the main process, and the final result of this method will be a list of dictionaries, each dictionary being the result of a remote gradient computation.
The sampled solutions are temporary, and will not be kept (and will not be returned).
To customize how solutions are sampled and how gradients are computed, one is encouraged to override `_sample_and_compute_gradients(...)` (instead of overriding this method directly).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
distribution |
The search distribution from which the solutions will be sampled, and according to which the gradients will be computed. |
required | |
popsize |
int |
The number of solutions which will be sampled. |
required |
num_interactions |
Optional[int] |
Number of simulator interactions that must be completed (more solutions will be sampled until this threshold is reached). This argument is to be used when the problem has characteristics similar to reinforcement learning, and an adaptive population size, depending on the interactions made, is desired. Otherwise, one can leave this argument as None, in which case, there will not be any threshold based on number of interactions. |
None |
popsize_max |
Optional[int] |
To be used when `num_interactions` is provided, as an additional criterion for ending the solution sampling phase. This argument can be used to prevent the population size from growing too much while trying to satisfy the `num_interactions` threshold. If not needed, `popsize_max` can be left as None. |
None |
obj_index |
Optional[int] |
Index of the objective according to which the gradients will be computed. Can be left as None if the problem has only one objective. |
None |
ranking_method |
Optional[str] |
The solution ranking method to be used when computing the gradients. If not specified, the raw fitnesses will be used. |
None |
with_stats |
bool |
If given as False, then the results dictionary will only contain the gradients information. If given as True, then the results dictionary will contain within itself the gradients dictionary, and also additional elements for providing further information. The default is True. |
True |
ensure_even_popsize |
bool |
If `ensure_even_popsize` is True and the problem is not parallelized, then a `popsize` given as an odd number will cause an error. If `ensure_even_popsize` is True and the problem is parallelized, then the remote actors will sample their own sub-populations in such a way that their sizes are even. If `ensure_even_popsize` is False, whether or not the `popsize` is even will not be checked. When the provided `distribution` is symmetric (or "mirrored", or "antithetic"), then this argument must be given as True. |
False |
Returns:
Type | Description |
---|---|
Union[list, dict] |
A results dictionary when the problem is not parallelized, or list of results dictionaries when the problem is parallelized. |
Source code in evotorch/core.py
def sample_and_compute_gradients(
self,
distribution,
popsize: int,
*,
num_interactions: Optional[int] = None,
popsize_max: Optional[int] = None,
obj_index: Optional[int] = None,
ranking_method: Optional[str] = None,
with_stats: bool = True,
ensure_even_popsize: bool = False,
) -> Union[list, dict]:
"""
Sample new solutions from the distribution and compute gradients.
The distribution can then be updated according to the computed
gradients.
If the problem is not parallelized, and `with_stats` is False,
then the result will be a single dictionary of gradients.
For example, in the case of a Gaussian distribution, the returned
gradients dictionary would look like this:
{
"mu": ..., # the gradient for the mean
"sigma": ..., # the gradient for the standard deviation
}
If the problem is not parallelized, and `with_stats` is True,
then the result will be a dictionary which contains in itself
the gradients dictionary, and additional elements for providing
further information. In the case of a Gaussian distribution,
the returned dictionary with additional stats would look like
this:
{
"gradients": {
"mu": ..., # the gradient for the mean
"sigma": ..., # the gradient for the standard deviation
},
"num_solutions": ..., # how many solutions were sampled
"mean_eval": ..., # Mean of all evaluations
}
If the problem is parallelized, then the gradient computation will
be distributed among the remote actors. In more detail, each actor
will sample its own solutions (such that the total population size
across all remote actors will be near the provided `popsize`)
and will compute its own gradients, and will produce its own
additional stats (if `with_stats` is given as True).
These remote results will then be collected by the main process,
and the final result of this method will be a list of dictionaries,
each dictionary being the result of a remote gradient computation.
The sampled solutions are temporary, and will not be kept
(and will not be returned).
To customize how solutions are sampled and how gradients are
computed, one is encouraged to override
`_sample_and_compute_gradients(...)` (instead of overriding this
method directly).
Args:
distribution: The search distribution from which the solutions
will be sampled, and according to which the gradients will
be computed.
popsize: The number of solutions which will be sampled.
num_interactions: Number of simulator interactions that must
be completed (more solutions will be sampled until this
threshold is reached). This argument is to be used when
the problem has characteristics similar to reinforcement
learning, and an adaptive population size, depending on
the interactions made, is desired.
Otherwise, one can leave this argument as None, in which
case, there will not be any threshold based on number
of interactions.
popsize_max: To be used when `num_interactions` is provided,
as an additional criterion for ending the solution sampling
phase. This argument can be used to prevent the population
size from growing too much while trying to satisfy the
`num_interactions`. If not needed, `popsize_max` can be left
as None.
obj_index: Index of the objective according to which the gradients
will be computed. Can be left as None if the problem has only
one objective.
ranking_method: The solution ranking method to be used when
computing the gradients.
If not specified, the raw fitnesses will be used.
with_stats: If given as False, then the results dictionary will
only contain the gradients information. If given as True,
then the results dictionary will contain within itself
the gradients dictionary, and also additional elements for
providing further information.
The default is True.
ensure_even_popsize: If `ensure_even_popsize` is True and the
problem is not parallelized, then a `popsize` given as an odd
number will cause an error. If `ensure_even_popsize` is True
and the problem is parallelized, then the remote actors will
sample their own sub-populations in such a way that their
sizes are even.
If `ensure_even_popsize` is False, whether or not the
`popsize` is even will not be checked.
When the provided `distribution` is symmetric (or
"mirrored", or "antithetic"), then this argument must be
given as True.
Returns:
A results dictionary when the problem is not parallelized,
or list of results dictionaries when the problem is parallelized.
"""
# For problems which are configured for parallelization, make sure that the actors are created.
self._parallelize()
# Below we check if there is an inconsistency in arguments.
if (num_interactions is None) and (popsize_max is not None):
# If `num_interactions` is None, then we assume that the user does not wish an adaptive population size.
# However, at the same time, if `popsize_max` is not None, then there is an inconsistency,
# because, `popsize_max` without `num_interactions` (therefore without adaptive population size)
# does not make sense.
# This is probably a configuration error, so, we inform the user by raising an error.
raise ValueError(
f"`popsize_max` was expected as None, because `num_interactions` is None."
f" However, `popsize_max` was found as {popsize_max}."
)
# The problem instance in the main process should trigger the `before_grad_hook`.
if self.is_main:
self._before_grad_hook()
if self.is_main and (self._actors is not None) and (len(self._actors) > 0):
# If this is the main process and the problem is parallelized, then we need to split the request
# into multiple tasks, and then execute those tasks in parallel using the problem's actor pool.
if self._subbatch_size is not None:
# If `subbatch_size` is provided, then we first make sure that `popsize` is divisible by
# `subbatch_size`
if (popsize % self._subbatch_size) != 0:
raise ValueError(
f"This Problem was created with `subbatch_size` as {self._subbatch_size}."
f" When doing remote gradient computation, the requested population size must be divisible by"
f" the `subbatch_size`."
f" However, the requested population size is {popsize}, and the remainder after dividing it"
f" by `subbatch_size` is not 0 (it is {popsize % self._subbatch_size})."
)
# After making sure that `popsize` and `subbatch_size` configurations are compatible, we declare that
# we are going to have n tasks, each task imposing a sample size of `subbatch_size`.
n = int(popsize // self._subbatch_size)
popsize_per_task = [self._subbatch_size for _ in range(n)]
elif self._num_subbatches is not None:
# If `num_subbatches` is provided, then we are going to have n tasks where n is equal to the given
# `num_subbatches`.
popsize_per_task = split_workload(popsize, self._num_subbatches)
else:
# If neither `subbatch_size` nor `num_subbatches` is given, then we will split the workload in such
# a way that each actor will have its share.
popsize_per_task = split_workload(popsize, len(self._actors))
if ensure_even_popsize:
# If `ensure_even_popsize` argument is True, then we need to make sure that each tasks's popsize is
# an even number.
for i in range(len(popsize_per_task)):
if (popsize_per_task[i] % 2) != 0:
# If the i-th actor's assigned popsize is not even, increase its assigned popsize by 1.
popsize_per_task[i] += 1
# The number of tasks is finally determined by the length of `popsize_per_task` list we created above.
num_tasks = len(popsize_per_task)
if num_interactions is None:
# If the argument `num_interactions` is not given, then, for each task, we declare that
# `num_interactions` is None.
num_inter_per_task = [None for _ in range(num_tasks)]
else:
# If the argument `num_interactions` is given, then we compute each task's target number of
# interactions from its sample size.
num_inter_per_task = [
math.ceil((popsize_per_task[i] / popsize) * num_interactions) for i in range(num_tasks)
]
if popsize_max is None:
# If the argument `popsize_max` is not given, then, for each task, we declare that
# `popsize_max` is None.
popsize_max_per_task = [None for _ in range(num_tasks)]
else:
# If the argument `popsize_max` is given, then we compute each task's target maximum population size
# from its sample size.
popsize_max_per_task = [
math.ceil((popsize_per_task[i] / popsize) * popsize_max) for i in range(num_tasks)
]
# We trigger the synchronization between the main process and the remote actors.
# If this problem instance has nothing to synchronize, then `must_sync_after` will be False.
must_sync_after = self._sync_before()
# Because we want to send the distribution to remote actors, we first copy the distribution to cpu
# (unless it is already on cpu)
dist_on_cpu = distribution.to("cpu")
# Here, we use our actor pool to execute our tasks in parallel.
result = list(
self._actor_pool.map_unordered(
(
lambda a, v: a.call.remote(
"_sample_and_compute_gradients",
[dist_on_cpu, v[0]],
{
"obj_index": obj_index,
"num_interactions": v[1],
"popsize_max": v[2],
"ranking_method": ranking_method,
},
)
),
list(zip(popsize_per_task, num_inter_per_task, popsize_max_per_task)),
)
)
# At this point, all the tensors within our collected results are on the CPU.
if torch.device(self.device) != torch.device("cpu"):
# If the main device of this problem instance is not CPU, then we move the tensors to the main device.
result = cast_tensors_in_container(result, device=self.device)
if must_sync_after:
# If a post-gradient synchronization is required, we trigger the synchronization operations.
self._sync_after()
else:
# If the problem is not parallelized, then we request this instance itself to compute the gradients.
result = self._gradient_computation_helper(
distribution,
popsize,
popsize_max=popsize_max,
obj_index=obj_index,
ranking_method=ranking_method,
num_interactions=num_interactions,
with_stats=with_stats,
)
# The problem instance in the main process should trigger the `after_grad_hook`.
if self.is_main:
self._after_eval_status = self._after_grad_hook.accumulate_dict(result)
# We finally return the results
return result
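A rough usage sketch follows. The exact constructor of `SeparableGaussian` below is an assumption about `evotorch.distributions` (a "mu"/"sigma"-parameterized distribution, matching the gradients dictionaries shown above); adapt it to the actual Distribution class you use:
import torch
from evotorch.distributions import SeparableGaussian  # assumed import path

# A separable Gaussian search distribution over a 10-dimensional
# decision space, parameterized by "mu" and "sigma":
dist = SeparableGaussian(parameters={"mu": torch.zeros(10), "sigma": torch.ones(10)})

# Sample 100 solutions, evaluate them, and compute the gradients.
# "centered" is one of the available fitness ranking methods.
result = problem.sample_and_compute_gradients(dist, 100, ranking_method="centered")

# For a non-parallelized problem, `result` is a single dictionary
# (with_stats=True by default, so extra stats are included):
gradients = result["gradients"]   # e.g. {"mu": ..., "sigma": ...}
mean_eval = result["mean_eval"]
The distribution can then be updated according to these gradients, which is what distribution-based search algorithms do at each generation.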
RemoteMethod
¶
Representation of a method on a remote actor's contained Problem or reinforcement learning environment
Source code in evotorch/core.py
class RemoteMethod:
"""
Representation of a method on a remote actor's contained Problem
or reinforcement learning environment
"""
def __init__(self, method_name: str, actors: list, on_env: bool = False):
self._method_name = str(method_name)
self._actors = actors
self._on_env = bool(on_env)
def __call__(self, *args, **kwargs) -> Any:
def invoke(actor):
if self._on_env:
return actor.call_on_env.remote(self._method_name, args, kwargs)
else:
return actor.call.remote(self._method_name, args, kwargs)
return ray.get([invoke(actor) for actor in self._actors])
def __repr__(self) -> str:
if self._on_env:
further = ", on_env=True"
else:
further = ""
return f"<{type(self).__name__} {repr(self._method_name)}{further}>"
Solution
¶
Representation of a single Solution.
A Solution can be a reference to a row of a SolutionBatch (in which case it shares its storage with the SolutionBatch), or can be an independent solution. When the Solution shares its storage with a SolutionBatch, any modifications to its decision values and/or evaluation results will affect its parent SolutionBatch as well.
When a Solution object is cloned (via its `clone()` method, or via the functions `copy.copy(...)` and `copy.deepcopy(...)`), a new independent Solution object will be created. This new independent copy will NOT share its storage with its original SolutionBatch anymore.
Source code in evotorch/core.py
class Solution:
"""
Representation of a single Solution.
A Solution can be a reference to a row of a SolutionBatch
(in which case it shares its storage with the SolutionBatch),
or can be an independent solution.
When the Solution shares its storage with a SolutionBatch,
any modifications to its decision values and/or evaluation
results will affect its parent SolutionBatch as well.
When a Solution object is cloned (via its `clone()` method,
or via the functions `copy.copy(...)` and `copy.deepcopy(...)`),
a new independent Solution object will be created.
This new independent copy will NOT share its storage with
its original SolutionBatch anymore.
"""
def __init__(self, parent: SolutionBatch, index: int):
"""
`__init__(...)`: Initialize the Solution object.
Args:
parent: The parent SolutionBatch which stores the Solution.
index: Index of the solution in SolutionBatch.
"""
if not isinstance(parent, SolutionBatch):
raise TypeError(
f"Expected a SolutionBatch as a parent, but encountered {repr(parent)},"
f" which is of type {repr(type(parent))}."
)
index = int(index)
if index < 0:
index = len(parent) + index
if not ((index >= 0) and (index < len(parent))):
raise IndexError(f"Invalid index: {index}")
self._batch: SolutionBatch = parent[index : index + 1]
def access_values(self, *, keep_evals: bool = False) -> torch.Tensor:
"""
Access the decision values tensor of the solution.
The received tensor will be mutable.
By default, it will be assumed that the user wishes to
obtain this tensor to change the decision values, and therefore,
the evaluation results associated with this solution will be
cleared (i.e. will be NaN).
Args:
keep_evals: When this is set to True, the evaluation results
associated with this solution will be kept (i.e. will NOT
be cleared).
Returns:
The decision values tensor of the solution.
"""
return self._batch.access_values(keep_evals=keep_evals)[0]
def access_evals(self) -> torch.Tensor:
"""
Access the evaluation results of the solution.
The received tensor will be mutable.
Returns:
The evaluation results tensor of the solution.
"""
return self._batch.access_evals()[0]
@property
def values(self) -> Any:
"""
Decision values of the solution
"""
return self._batch.values[0]
@property
def evals(self) -> torch.Tensor:
"""
Evaluation results of the solution in a 1-dimensional tensor.
"""
return self._batch.evals[0]
@property
def evaluation(self) -> torch.Tensor:
"""
Get the evaluation result.
If the problem is single-objective and the problem does not
allocate any space for extra evaluation data, then a scalar
is returned.
Otherwise, this property becomes equivalent to the `evals`
property, and a 1-dimensional tensor is returned.
"""
result = self.evals
if len(result) == 1:
result = result[0]
return result
def set_values(self, values: Any):
"""
Set the decision values of the Solution.
Note that modifying the decision values will result in the
evaluation results being cleared (in more detail,
the evaluation results tensor will be filled with NaN values).
Args:
values: New decision values for this Solution.
"""
if is_dtype_object(self.dtype):
value_tensor = ObjectArray(1)
value_tensor[0] = values
else:
value_tensor = torch.as_tensor(values, dtype=self.dtype).reshape(1, -1)
self._batch.set_values(value_tensor)
def set_evals(self, evals: torch.Tensor, eval_data: Optional[Iterable] = None):
"""
Set the evaluation results of the Solution.
Args:
evals: New evaluation result(s) for the Solution.
For single-objective problems, this argument can be given
as a scalar.
When this argument is given as a scalar (for single-objective
cases) or as a tensor which is long enough to cover for
all the objectives but not for the extra evaluation data,
then the extra evaluation data will be cleared
(in more details, extra evaluation data will be filled with
NaN values).
eval_data: Optionally, the argument `eval_data` can be used to
specify extra evaluation data separately.
`eval_data` is expected as a 1-dimensional sequence.
"""
evals = torch.as_tensor(evals, dtype=self.eval_dtype, device=self.device)
if evals.ndim in (0, 1):
evals = evals.reshape(1, -1)
else:
raise ValueError(
f"`set_evals(...)` method of a Solution expects a 1-dimensional or a 2-dimensional"
f" evaluation tensor. However, the received evaluation tensor has {evals.ndim} dimensions"
f" (having a shape of {evals.shape})."
)
if eval_data is not None:
eval_data = torch.as_tensor(eval_data, dtype=self.eval_dtype, device=self.device)
if eval_data.ndim != 1:
raise ValueError(
f"The argument `eval_data` was expected as a 1-dimensional sequence."
f" However, the shape of `eval_data` is {eval_data.shape}."
)
self._batch.set_evals(evals, eval_data)
def set_evaluation(self, evaluation: RealOrVector, eval_data: Optional[Iterable] = None):
"""
Set the evaluation results of the Solution.
This method is an alias for `set_evals(...)`, added for having
a setter counterpart for the `evaluation` property of the Solution
class.
Args:
evaluation: New evaluation result(s) for the Solution.
For single-objective problems, this argument can be given
as a scalar.
When this argument is given as a scalar (for single-objective
cases) or as a tensor which is long enough to cover for
all the objectives but not for the extra evaluation data,
then the extra evaluation data will be cleared
(in more details, extra evaluation data will be filled with
NaN values).
eval_data: Optionally, the argument `eval_data` can be used to
specify extra evaluation data separately.
`eval_data` is expected as a 1-dimensional sequence.
"""
self.set_evals(evaluation, eval_data)
def objective_sense(self) -> ObjectiveSense:
"""
Get the objective sense(s) of this Solution's associated Problem.
If the problem is single-objective, then a single string is returned.
If the problem is multi-objective, then the objective senses will be
returned in a list.
The returned string in the single-objective case, or each returned
string in the multi-objective case, is "min" or "max".
"""
return self._batch.objective_sense
@property
def senses(self) -> Iterable[str]:
"""
Objective sense(s) of this Solution's associated Problem.
This is a list of strings, each string being "min" or "max".
"""
return self._batch.senses
@property
def is_evaluated(self) -> bool:
"""
Whether or not the Solution is fully evaluated.
This property returns True only when all of the evaluation results
for all objectives have numeric values other than NaN.
This property assumes that the extra evaluation data is optional,
and therefore does not take into consideration whether or not the
extra evaluation data contains NaN values.
In other words, while determining whether or not a solution is fully
evaluated, only the evaluation results corresponding to the
objectives are taken into account.
"""
num_objs = len(self.senses)
with torch.no_grad():
return not bool(torch.any(torch.isnan(self._batch.evals[0, :num_objs])))
def clone(self, memo: Optional[dict] = None) -> "Solution":
"""
Get a clone of the Solution.
Note that, after this cloning operation, this Solution will not
refer to a Solution to its original parent SolutionBatch
(i.e. it will not share memory with the parent SolutionBatch).
Instead, it will be a new independent Solution object.
"""
new_batch = self._batch.clone(memo)
return Solution(new_batch, 0)
def __copy__(self):
return self.clone()
def __deepcopy__(self, memo: Optional[dict]):
return self.clone(memo)
@property
def dtype(self) -> DType:
"""
dtype of the decision values
"""
return self._batch.dtype
@property
def device(self) -> Device:
"""
The device storing the Solution
"""
return self._batch.device
@property
def eval_dtype(self) -> DType:
"""
dtype of the evaluation results
"""
return self._batch.eval_dtype
@staticmethod
def _rightmost_shape(shape: Iterable) -> torch.Size:
if len(shape) >= 2:
return torch.Size([int(shape[-1])])
else:
return torch.Size([])
@property
def shape(self) -> torch.Size:
"""
Shape of the decision values of the Solution
"""
return self._rightmost_shape(self._batch.values_shape)
def size(self) -> torch.Size:
"""
Shape of the decision values of the Solution
"""
return self.shape
@property
def eval_shape(self) -> torch.Size:
"""
Shape of the evaluation results
"""
return self._rightmost_shape(self._batch.eval_shape)
@property
def ndim(self) -> int:
"""
Number of dimensions of the decision values.
For numeric solutions (e.g. of dtype `torch.float32`), this returns
1, since such numeric solutions are kept as 1-dimensional vectors.
When dtype is `object`, `ndim` is reported as whatever the contained
object reports as its `ndim`, or 0 if the contained object does not
have an `ndim` attribute.
"""
values = self.values
if hasattr(values, "ndim"):
return values.ndim
else:
return 0
def dim(self) -> int:
"""
This method returns the `ndim` attribute of this Solution.
"""
return self.ndim
def __len__(self) -> int:
return len(self.values)
def __iter__(self):
return self.values.__iter__()
def __reversed__(self):
return self.values.__reversed__()
def __getitem__(self, i):
return self.values.__getitem__(i)
def _to_string(self) -> str:
clsname = type(self).__name__
result = []
values = self._batch.access_values(keep_evals=True)[0]
evals = self._batch.access_evals()[0]
def write(*args):
for arg in args:
result.append(str(arg))
write("<", clsname, " values=", values)
if not torch.all(torch.isnan(evals)):
write(", evals=", evals)
write(">")
return "".join(result)
def __repr__(self) -> str:
return self._to_string()
def __str__(self) -> str:
return self._to_string()
def to_batch(self) -> SolutionBatch:
"""
Get the single-row SolutionBatch counterpart of the Solution.
The returned SolutionBatch and the Solution have shared
storage, meaning that modifying one of them affects the other.
Returns:
The SolutionBatch counterpart of the Solution.
"""
return self._batch
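To illustrate the storage-sharing behavior, here is a sketch continuing the earlier examples (assuming the 10-dimensional `problem` defined above, and its `generate_batch(...)` method):
batch = problem.generate_batch(20)  # a SolutionBatch of 20 solutions
sol = batch[0]                      # shares storage with the batch

# Modifying the solution is visible through the parent batch
# (and clears the evaluation results of that row):
sol.set_values(torch.zeros(10))

# Cloning produces an independent Solution with its own storage:
independent = sol.clone()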
device: Union[str, torch.device]
property
readonly
¶
The device storing the Solution
dtype: Union[str, torch.dtype, numpy.dtype, Type]
property
readonly
¶
dtype of the decision values
eval_dtype: Union[str, torch.dtype, numpy.dtype, Type]
property
readonly
¶
dtype of the evaluation results
eval_shape: Size
property
readonly
¶
Shape of the evaluation results
evals: Tensor
property
readonly
¶
Evaluation results of the solution in a 1-dimensional tensor.
evaluation: Tensor
property
readonly
¶
Get the evaluation result.
If the problem is single-objective and the problem does not
allocate any space for extra evaluation data, then a scalar
is returned.
Otherwise, this property becomes equivalent to the evals
property, and a 1-dimensional tensor is returned.
is_evaluated: bool
property
readonly
¶
Whether or not the Solution is fully evaluated.
This property returns True only when all of the evaluation results for all objectives have numeric values other than NaN.
This property assumes that the extra evaluation data is optional, and therefore does not take into consideration whether or not the extra evaluation data contains NaN values. In other words, while determining whether or not a solution is fully evaluated, only the evaluation results corresponding to the objectives are taken into account.
ndim: int
property
readonly
¶
Number of dimensions of the decision values.
For numeric solutions (e.g. of dtype `torch.float32`), this returns 1, since such numeric solutions are kept as 1-dimensional vectors. When dtype is `object`, `ndim` is reported as whatever the contained object reports as its `ndim`, or 0 if the contained object does not have an `ndim` attribute.
senses: Iterable[str]
property
readonly
¶
Objective sense(s) of this Solution's associated Problem.
This is a list of strings, each string being "min" or "max".
shape: Size
property
readonly
¶
Shape of the decision values of the Solution
values: Any
property
readonly
¶
Decision values of the solution
__init__(self, parent, index)
special
¶
__init__(...)
: Initialize the Solution object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
parent |
SolutionBatch |
The parent SolutionBatch which stores the Solution. |
required |
index |
int |
Index of the solution in SolutionBatch. |
required |
Source code in evotorch/core.py
def __init__(self, parent: SolutionBatch, index: int):
"""
`__init__(...)`: Initialize the Solution object.
Args:
parent: The parent SolutionBatch which stores the Solution.
index: Index of the solution in SolutionBatch.
"""
if not isinstance(parent, SolutionBatch):
raise TypeError(
f"Expected a SolutionBatch as a parent, but encountered {repr(parent)},"
f" which is of type {repr(type(parent))}."
)
index = int(index)
if index < 0:
index = len(parent) + index
if not ((index >= 0) and (index < len(parent))):
raise IndexError(f"Invalid index: {index}")
self._batch: SolutionBatch = parent[index : index + 1]
access_evals(self)
¶
Access the evaluation results of the solution. The received tensor will be mutable.
Returns:
Type | Description |
---|---|
Tensor |
The evaluation results tensor of the solution. |
access_values(self, *, keep_evals=False)
¶
Access the decision values tensor of the solution. The received tensor will be mutable.
By default, it will be assumed that the user wishes to obtain this tensor to change the decision values, and therefore, the evaluation results associated with this solution will be cleared (i.e. will be NaN).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
keep_evals |
bool |
When this is set to True, the evaluation results associated with this solution will be kept (i.e. will NOT be cleared). |
False |
Returns:
Type | Description |
---|---|
Tensor |
The decision values tensor of the solution. |
Source code in evotorch/core.py
def access_values(self, *, keep_evals: bool = False) -> torch.Tensor:
"""
Access the decision values tensor of the solution.
The received tensor will be mutable.
By default, it will be assumed that the user wishes to
obtain this tensor to change the decision values, and therefore,
the evaluation results associated with this solution will be
cleared (i.e. will be NaN).
Args:
keep_evals: When this is set to True, the evaluation results
associated with this solution will be kept (i.e. will NOT
be cleared).
Returns:
The decision values tensor of the solution.
"""
return self._batch.access_values(keep_evals=keep_evals)[0]
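For example, a small sketch continuing from the `sol` object above:
# Obtain the mutable decision values while preserving the existing
# evaluation results:
vals = sol.access_values(keep_evals=True)
vals += 0.01  # in-place perturbation; evals remain intact

# Without keep_evals, the evaluation results are cleared (set to NaN):
vals = sol.access_values()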
clone(self, memo=None)
¶
Get a clone of the Solution.
Note that, after this cloning operation, this Solution will not refer to a Solution to its original parent SolutionBatch (i.e. it will not share memory with the parent SolutionBatch). Instead, it will be a new independent Solution object.
Source code in evotorch/core.py
def clone(self, memo: Optional[dict] = None) -> "Solution":
"""
Get a clone of the Solution.
Note that, after this cloning operation, this Solution will not
refer to a Solution to its original parent SolutionBatch
(i.e. it will not share memory with the parent SolutionBatch).
Instead, it will be a new independent Solution object.
"""
new_batch = self._batch.clone(memo)
return Solution(new_batch, 0)
dim(self)
¶
This method returns the `ndim` attribute of this Solution.
objective_sense(self)
¶
Get the objective sense(s) of this Solution's associated Problem.
If the problem is single-objective, then a single string is returned. If the problem is multi-objective, then the objective senses will be returned in a list.
The returned string in the single-objective case, or each returned string in the multi-objective case, is "min" or "max".
Source code in evotorch/core.py
def objective_sense(self) -> ObjectiveSense:
"""
Get the objective sense(s) of this Solution's associated Problem.
If the problem is single-objective, then a single string is returned.
If the problem is multi-objective, then the objective senses will be
returned in a list.
The returned string in the single-objective case, or each returned
string in the multi-objective case, is "min" or "max".
"""
return self._batch.objective_sense
set_evals(self, evals, eval_data=None)
¶
Set the evaluation results of the Solution.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
evals |
Tensor |
New evaluation result(s) for the Solution. For single-objective problems, this argument can be given as a scalar. When this argument is given as a scalar (for single-objective cases) or as a tensor which is long enough to cover for all the objectives but not for the extra evaluation data, then the extra evaluation data will be cleared (in more details, extra evaluation data will be filled with NaN values). |
required |
eval_data |
Optional[Iterable] |
Optionally, the argument `eval_data` can be used to specify extra evaluation data separately. `eval_data` is expected as a 1-dimensional sequence. |
None |
Source code in evotorch/core.py
def set_evals(self, evals: torch.Tensor, eval_data: Optional[Iterable] = None):
"""
Set the evaluation results of the Solution.
Args:
evals: New evaluation result(s) for the Solution.
For single-objective problems, this argument can be given
as a scalar.
When this argument is given as a scalar (for single-objective
cases) or as a tensor which is long enough to cover for
all the objectives but not for the extra evaluation data,
then the extra evaluation data will be cleared
(in more details, extra evaluation data will be filled with
NaN values).
eval_data: Optionally, the argument `eval_data` can be used to
specify extra evaluation data separately.
`eval_data` is expected as a 1-dimensional sequence.
"""
evals = torch.as_tensor(evals, dtype=self.eval_dtype, device=self.device)
if evals.ndim in (0, 1):
evals = evals.reshape(1, -1)
else:
raise ValueError(
f"`set_evals(...)` method of a Solution expects a 1-dimensional or a 2-dimensional"
f" evaluation tensor. However, the received evaluation tensor has {evals.ndim} dimensions"
f" (having a shape of {evals.shape})."
)
if eval_data is not None:
eval_data = torch.as_tensor(eval_data, dtype=self.eval_dtype, device=self.device)
if eval_data.ndim != 1:
raise ValueError(
f"The argument `eval_data` was expected as a 1-dimensional sequence."
f" However, the shape of `eval_data` is {eval_data.shape}."
)
self._batch.set_evals(evals, eval_data)
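For example, in the single-objective setting from the earlier sketches:
# For a single-objective problem, a scalar evaluation is acceptable:
sol.set_evals(torch.tensor(3.14))

# `set_evaluation(...)` is an alias with the same behavior:
sol.set_evaluation(3.14)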
set_evaluation(self, evaluation, eval_data=None)
¶
Set the evaluation results of the Solution.
This method is an alias for `set_evals(...)`, added for having a setter counterpart for the `evaluation` property of the Solution class.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
evaluation |
New evaluation result(s) for the Solution. For single-objective problems, this argument can be given as a scalar. When this argument is given as a scalar (for single-objective cases) or as a tensor which is long enough to cover for all the objectives but not for the extra evaluation data, then the extra evaluation data will be cleared (in more details, extra evaluation data will be filled with NaN values). |
required | |
eval_data |
Optional[Iterable] |
Optionally, the argument `eval_data` can be used to specify extra evaluation data separately. `eval_data` is expected as a 1-dimensional sequence. |
None |
Source code in evotorch/core.py
def set_evaluation(self, evaluation: RealOrVector, eval_data: Optional[Iterable] = None):
"""
Set the evaluation results of the Solution.
This method is an alias for `set_evals(...)`, added for having
a setter counterpart for the `evaluation` property of the Solution
class.
Args:
evaluation: New evaluation result(s) for the Solution.
For single-objective problems, this argument can be given
as a scalar.
When this argument is given as a scalar (for single-objective
cases) or as a tensor which is long enough to cover for
all the objectives but not for the extra evaluation data,
then the extra evaluation data will be cleared
(in more details, extra evaluation data will be filled with
NaN values).
eval_data: Optionally, the argument `eval_data` can be used to
specify extra evaluation data separately.
`eval_data` is expected as a 1-dimensional sequence.
"""
self.set_evals(evaluation, eval_data)
set_values(self, values)
¶
Set the decision values of the Solution.
Note that modifying the decision values will result in the evaluation results being cleared (in more detail, the evaluation results tensor will be filled with NaN values).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
values |
Any |
New decision values for this Solution. |
required |
Source code in evotorch/core.py
def set_values(self, values: Any):
"""
Set the decision values of the Solution.
Note that modifying the decision values will result in the
evaluation results being cleared (in more detail,
the evaluation results tensor will be filled with NaN values).
Args:
values: New decision values for this Solution.
"""
if is_dtype_object(self.dtype):
value_tensor = ObjectArray(1)
value_tensor[0] = values
else:
value_tensor = torch.as_tensor(values, dtype=self.dtype).reshape(1, -1)
self._batch.set_values(value_tensor)
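For example, continuing from the earlier sketches:
# Replacing the decision values clears the evaluation results:
sol.set_values(torch.ones(10))
assert not sol.is_evaluated  # evals are now NaN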
size(self)
¶
Shape of the decision values of the Solution (equivalent to the `shape` property).
to_batch(self)
¶
Get the single-row SolutionBatch counterpart of the Solution. The returned SolutionBatch and the Solution have shared storage, meaning that modifying one of them affects the other.
Returns:
Type | Description |
---|---|
SolutionBatch |
The SolutionBatch counterpart of the Solution. |
Source code in evotorch/core.py
def to_batch(self) -> SolutionBatch:
"""
Get the single-row SolutionBatch counterpart of the Solution.
The returned SolutionBatch and the Solution have shared
storage, meaning that modifying one of them affects the other.
Returns:
The SolutionBatch counterpart of the Solution.
"""
return self._batch
SolutionBatch
¶
Representation of a batch of solutions.
A SolutionBatch stores the decision values of multiple solutions in a single contiguous tensor. For numeric and fixed-length problems, this contiguous tensor is a PyTorch tensor. For not-necessarily-numeric and not-necessarily-fixed-length problems, this contiguous tensor is an ObjectArray.
The evaluation results and extra evaluation data of the solutions are also stored in an additional contiguous tensor.
Interface-wise, a SolutionBatch behaves like a sequence of Solution objects. One can get a single Solution from a SolutionBatch via the indexing operator (`[]`). Additionally, one can iterate over each solution using the `for ... in ...` statement.
One can also get a slice of a SolutionBatch. The slicing of a SolutionBatch results in a new SolutionBatch. With simple slicing, the obtained SolutionBatch shares its memory with the original SolutionBatch. With advanced slicing (i.e. the kind of slicing where the solution indices are specified one by one, like `mybatch[[0, 4, 2, 5]]`), the obtained SolutionBatch is a copy, and does not share any memory with its original.
The decision values of all the stored solutions in the batch can be obtained in a read-only tensor via:
values = batch.values
If one has modified decision values and wishes to put them into the batch, the `set_values(...)` method can be used as follows:
batch.set_values(modified_values)
The evaluation results of the solutions can be obtained in a read-only tensor via:
evals = batch.evals
If one has newly computed evaluation results, and wishes to put them into the batch, the `set_evals(...)` method can be used as follows:
batch.set_evals(newly_computed_evals)
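A brief sketch of these access patterns, continuing the earlier examples (assuming the single-objective, 10-dimensional `problem` defined above):
batch = problem.generate_batch(20)

# Simple slicing shares memory with the original batch:
first_five = batch[:5]

# Advanced indexing (explicit index list) produces an independent copy:
picked = batch[[0, 4, 2, 5]]

# Read-only views of decision values and evaluation results:
values = batch.values
evals = batch.evals

# Writing back modified values / newly computed evaluations:
batch.set_values(values.clone() * 0.5)
batch.set_evals(torch.randn(20))  # shape (n,) is valid for one objective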
Source code in evotorch/core.py
class SolutionBatch:
"""
Representation of a batch of solutions.
A SolutionBatch stores the decision values of multiple solutions
in a single contiguous tensor. For numeric and fixed-length
problems, this contiguous tensor is a PyTorch tensor.
For not-necessarily-numeric and not-necessarily-fixed-length
problems, this contiguous tensor is an ObjectArray.
The evaluation results and extra evaluation data of the solutions
are also stored in an additional contiguous tensor.
Interface-wise, a SolutionBatch behaves like a sequence of
Solution objects. One can get a single Solution from a SolutionBatch
via the indexing operator (`[]`).
Additionally, one can iterate over each solution using
the `for ... in ...` statement.
One can also get a slice of a SolutionBatch.
The slicing of a SolutionBatch results in a new SolutionBatch.
With simple slicing, the obtained SolutionBatch shares its
memory with the original SolutionBatch.
With advanced slicing (i.e. the kind of slicing where the
solution indices are specified one by one, like:
`mybatch[[0, 4, 2, 5]]`), the obtained SolutionBatch is a copy,
and does not share any memory with its original.
The decision values of all the stored solutions in the batch
can be obtained in a read-only tensor via:
values = batch.values
If one has modified decision values and wishes to put them
into the batch, the `set_values(...)` method can be used
as follows:
batch.set_values(modified_values)
The evaluation results of the solutions can be obtained
in a read-only tensor via:
evals = batch.evals
If one has newly computed evaluation results, and wishes
to put them into the batch, the `set_evals(...)` method
can be used as follows:
batch.set_evals(newly_computed_evals)
"""
def __init__(
self,
problem: Optional[Problem] = None,
popsize: Optional[int] = None,
*,
device: Optional[Device] = None,
slice_of: Optional[Union[tuple, SolutionBatchSliceInfo]] = None,
like: Optional["SolutionBatch"] = None,
merging_of: Iterable["SolutionBatch"] = None,
empty: Optional[bool] = None,
):
self._num_objs: int
self._data: Union[torch.Tensor, ObjectArray]
self._descending: Iterable[bool]
self._slice: Optional[IndicesOrSlice] = None
if slice_of is not None:
expect_none(
"While making a new SolutionBatch via slicing",
problem=problem,
popsize=popsize,
device=device,
merging_of=merging_of,
like=like,
empty=empty,
)
source: "SolutionBatch"
slice_info: IndicesOrSlice
source, slice_info = slice_of
def safe_slice(t: torch.Tensor, slice_info):
d0 = t.ndim
t = t[slice_info]
d1 = t.ndim
if d0 != d1:
raise ValueError(
"Encountered an illegal slicing operation which would"
" change the shape of the stored tensor(s) of the"
" SolutionBatch."
)
return t
with torch.no_grad():
self._data = safe_slice(source._data, slice_info)
self._evdata = safe_slice(source._evdata, slice_info)
self._slice = slice_info
self._descending = source._descending
shares_storage = self._data.storage().data_ptr() == source._data.storage().data_ptr()
if not shares_storage:
self._descending = deepcopy(self._descending)
self._num_objs = source._num_objs
elif like is not None:
expect_none(
"While making a new SolutionBatch via the like=... argument",
merging_of=merging_of,
slice_of=slice_of,
)
self._data = empty_tensor_like(like._data, length=popsize, device=device)
self._evdata = empty_tensor_like(like._evdata, length=popsize, device=device)
self._evdata[:] = float("nan")
self._descending = like._descending
self._num_objs = like._num_objs
if not _opt_bool(empty, default=False):
self._fill_via_problem(problem)
elif merging_of is not None:
expect_none(
"While making a new SolutionBatch via merging",
problem=problem,
popsize=popsize,
device=device,
slice_of=slice_of,
like=like,
empty=empty,
)
# Convert `merging_of` into a list.
# While doing that, also count the total number of rows
batches = []
total_rows = 0
for batch in merging_of:
total_rows += len(batch)
batches.append(batch)
# Get essential attributes from the first batch
self._descending = deepcopy(batches[0]._descending)
self._num_objs = batches[0]._num_objs
if isinstance(batches[0]._data, ObjectArray):
def process_data(x):
return deepcopy(x)
self._data = ObjectArray(total_rows)
else:
def process_data(x):
return x
self._data = empty_tensor_like(batches[0]._data, length=total_rows)
self._evdata = empty_tensor_like(batches[0]._evdata, length=total_rows)
row_begin = 0
for batch in batches:
row_end = row_begin + len(batch)
self._data[row_begin:row_end] = process_data(batch._data)
self._evdata[row_begin:row_end] = batch._evdata
row_begin = row_end
elif problem is not None:
expect_none(
"While making a new SolutionBatch with a given problem",
slice_of=slice_of,
like=like,
merging_of=merging_of,
)
if device is None:
device = problem.device
self._num_objs = len(problem.senses)
if problem.dtype is object:
if str(device) != "cpu":
raise ValueError("Cannot create a batch containing arbitrary objects on a device other than cpu")
self._data = ObjectArray(popsize)
else:
self._data = torch.empty((popsize, problem.solution_length), device=device, dtype=problem.dtype)
if not _opt_bool(empty, default=False):
self._data[:] = problem.generate_values(len(self._data))
self._evdata = problem.make_nan(
popsize, self._num_objs + problem.eval_data_length, device=device, use_eval_dtype=True
)
self._descending = problem.get_obj_order_descending()
else:
raise ValueError("Invalid call to the __init__(...) of SolutionBatch")
def _normalize_row_index(self, i: int) -> int:
i = int(i)
org_i = i
if i < 0:
i = int(self._data.shape[0]) + i
if (i < 0) or (i > (self._data.shape[0] - 1)):
raise IndexError(f"Invalid row: {org_i}")
return i
def _normalize_obj_index(self, i: int) -> int:
i = int(i)
org_i = i
if i < 0:
i = self._num_objs + i
if (i < 0) or (i >= self._num_objs):
raise IndexError(f"Invalid objective index: {org_i}")
return i
def _optionally_get_obj_index(self, i: Optional[int]) -> int:
if i is None:
if self._num_objs != 1:
raise ValueError(
f"The objective index was given as None."
f" However, the number of objectives is not 1,"
f" it is {self._num_objs}."
f" Therefore, the objective index is not optional,"
f" and must be provided as an integer, not as None."
)
return 0
else:
return self._normalize_obj_index(i)
@torch.no_grad()
def argsort(self, obj_index: Optional[int] = None) -> torch.Tensor:
"""Return the indices of solutions, sorted from best to worst.
Args:
obj_index: The objective index. Can be passed as None
if the problem is single-objective. Otherwise,
expected as an int.
Returns:
A PyTorch tensor, containing the solution indices,
sorted from the best solution to the worst.
"""
obj_index = self._optionally_get_obj_index(obj_index)
descending = self._descending[obj_index]
ev_col = self._evdata[:, obj_index]
return torch.argsort(ev_col, descending=descending)
@torch.no_grad()
def arg_pareto_sort(self, crowdsort: bool = True, crowdsort_upto: Optional[int] = None) -> ParetoInfo:
"""
Pareto-sort the solutions in the batch.
The result is a namedtuple consisting of two elements:
`fronts` and `ranks`.
Let us assume that we have 5 solutions, and after a
pareto-sorting they ended up in this order:
front 0 (best front) : solution 1, solution 2
front 1 : solution 0, solution 4
front 2 (worst front): solution 3
Considering the example ordering above, the returned
ParetoInfo instance looks like this:
ParetoInfo(
fronts=[[1, 2], [0, 4], [3]],
ranks=tensor([1, 0, 0, 2, 1])
)
where `fronts` stores the solution indices grouped by
pareto fronts; and `ranks` stores, as a tensor of int64,
the pareto rank for each solution (where 0 means best
rank).
Args:
crowdsort: If given as True, each front in itself
will be sorted from the least crowding solution
to the most crowding solution.
If given as False, there will be no crowd-sorting.
crowdsort_upto: To be used with `crowdsort=True`.
If given as an integer n, crowd-sorting will be done
only in the fronts containing the first n solutions
of the population.
If given as None (and if `crowdsort=True`),
crowd-sorting will be done for each front.
Returns:
A ParetoInfo instance
"""
if not NumbaLib.is_found:
NumbaLib.warn("arg_pareto_sort")
utils = self.utils()
if not crowdsort:
if crowdsort_upto is not None:
raise ValueError(
"With the argument `crowdsort` provided as False,"
" the argument `crowdsort_upto` was expected as None."
" However, `crowdsort_upto` was found to be something"
" other than None."
)
fronts, ranks = _pareto_sort(utils, False, 0)
else:
if crowdsort_upto is None:
crowdsort_upto = len(utils)
fronts, ranks = _pareto_sort(utils, crowdsort, crowdsort_upto)
return ParetoInfo(fronts=fronts, ranks=ranks)
@torch.no_grad()
def argbest(self, obj_index: Optional[int] = None) -> torch.Tensor:
"""Return the best solution's index
Args:
obj_index: The objective index. Can be passed as None
if the problem is single-objective. Otherwise,
expected as an int.
Returns:
The index of the best solution.
"""
obj_index = self._optionally_get_obj_index(obj_index)
descending = self._descending[obj_index]
argf = torch.argmax if descending else torch.argmin
return argf(self._evdata[:, obj_index])
@torch.no_grad()
def argworst(self, obj_index: Optional[int] = None) -> torch.Tensor:
"""Return the worst solution's index
Args:
obj_index: The objective index. Can be passed as None
if the problem is single-objective. Otherwise,
expected as an int.
Returns:
The index of the worst solution.
"""
obj_index = self._optionally_get_obj_index(obj_index)
descending = self._descending[obj_index]
argf = torch.argmin if descending else torch.argmax
return argf(self._evdata[:, obj_index])
def _get_objective_sign(self, i_obj: int) -> float:
if self._descending[i_obj]:
return 1.0
else:
return -1.0
@torch.no_grad()
def set_values(self, values: Any, *, solutions: MaybeIndicesOrSlice = None):
"""
Set the decision values of the solutions.
Args:
values: New decision values.
solutions: Optionally a list of integer indices or an instance
of `slice(...)`, to be used if one wishes to set the
decision values of only some of the solutions.
"""
if solutions is None:
solutions = slice(None, None, None)
self._data[solutions] = values
self._evdata[solutions] = float("nan")
@torch.no_grad()
def set_evals(
self,
evals: torch.Tensor,
eval_data: Optional[torch.Tensor] = None,
*,
solutions: MaybeIndicesOrSlice = None,
):
"""
Set the evaluations of the solutions.
Args:
evals: A numeric tensor which contains the evaluation results.
Acceptable shapes are as follows:
`(n,)` only to be used for single-objective problems, sets
the evaluation results of the target `n` solutions, and clears
(where clearing means to fill with NaN values)
extra evaluation data (if the problem has allocations for such
extra evaluation data);
`(n,m)` where `m` is the number of objectives, sets the
evaluation results of the target `n` solutions, and clears
their extra evaluation data;
`(n,m+q)` where `m` is the number of objectives and `q` is the
length of extra evaluation data, sets the evaluation result
and extra data of the target `n` solutions.
eval_data: To be used only when the problem has extra evaluation
data. Optionally, one can pass the extra evaluation data
separately via this argument (instead of jointly through
a single tensor via `evals`).
The expected shape of this tensor is `(n,q)` where `n`
is the number of solutions and `q` is the length of the
extra evaluation data.
solutions: Optionally a list of integer indices or an instance
of `slice(...)`, to be used if one wishes to set the
evaluations of only some of the solutions.
Raises:
ValueError: if the given tensor has an incompatible shape.
"""
if solutions is None:
solutions = slice(None, None, None)
num_solutions = self._evdata.shape[0]
elif isinstance(solutions, slice):
num_solutions = self._evdata[solutions].shape[0]
elif is_sequence(solutions):
num_solutions = len(solutions)
total_eval_width = self._evdata.shape[1]
num_objs = self._num_objs
num_data = total_eval_width - num_objs
if evals.ndim == 1:
if num_objs != 1:
raise ValueError(
f"The method `set_evals(...)` was given a 1-dimensional tensor."
f" However, the number of objectives of the problem at hand is {num_objs}, not 1."
f" 1-dimensional evaluation tensors can only be accepted if the problem"
f" has one objective."
)
evals = evals.reshape(-1, 1)
elif evals.ndim == 2:
pass # nothing to do here
else:
if num_objs == 1:
raise ValueError(
f"The method `set_evals(...)` received a tensor with {evals.ndim} dimensions."
f" Since the problem at hand has only one objective,"
f" 1-dimensional or 2-dimensional tensors are acceptable, but not {evals.ndim}-dimensional ones."
)
else:
raise ValueError(
f"The method `set_evals(...)` received a tensor with {evals.ndim} dimensions."
f" Since the problem at hand has more than one objective (there are {num_objs} objectives),"
f" only 2-dimensional tensors are acceptable, not {evals.ndim}-dimensional ones."
)
[nrows, ncols] = evals.shape
if nrows != num_solutions:
raise ValueError(
f"Trying to set the evaluations of {num_solutions} solutions, but the given tensor has {nrows} rows."
)
if eval_data is not None:
if eval_data.ndim != 2:
raise ValueError(
f"The `eval_data` argument was expected as a 2-dimensional tensor."
f" However, the shape of the given `eval_data` is {eval_data.shape}."
)
if eval_data.shape[1] != num_data:
raise ValueError(
f"The `eval_data` argument was expected to have {num_data} columns."
f" However, the received `eval_data` has the shape: {eval_data.shape}."
)
if ncols != num_objs:
raise ValueError(
f"The method `set_evals(...)` was used with `evals` and `eval_data` arguments."
f" When both of these arguments are provided, `evals` is expected either as a 1-dimensional tensor"
f" (for single-objective cases only), or as a tensor of shape (n, m) where n is the number of"
f" solutions, and m is the number of objectives."
f" However, while the problem at hand has {num_objs} objectives,"
f" the `evals` tensor has {ncols} columns."
)
if evals.shape[0] != eval_data.shape[0]:
raise ValueError(f"The provided `evals` and `eval_data` tensors have incompatible shapes.")
self._evdata[solutions, :] = torch.hstack([evals, eval_data])
else:
if ncols == num_objs:
self._evdata[solutions, :num_objs] = evals
self._evdata[solutions, num_objs:] = float("nan")
elif ncols == total_eval_width:
self._evdata[solutions, :] = evals
else:
raise ValueError(
f"The method `set_evals(...)` received a tensor with {ncols} columns, which is incompatible."
f" Acceptable number of columns are: {num_objs}"
f" (for setting only the objective-associated evaluations and leave extra evaluation data as NaN), or"
f" {total_eval_width} (for setting both objective-associated evaluations and extra evaluation data)."
)
@property
def evals(self) -> torch.Tensor:
"""
Evaluation results of the solutions, in a ReadOnlyTensor
"""
from .tools.readonlytensor import as_read_only_tensor
with torch.no_grad():
return as_read_only_tensor(self._evdata)
@property
def values(self) -> Union[torch.Tensor, Iterable]:
"""
Decision values of the solutions, in a read-only tensor-like object
"""
from .tools.readonlytensor import as_read_only_tensor
with torch.no_grad():
return as_read_only_tensor(self._data)
# @property
# def unsafe_evals(self) -> torch.Tensor:
# """
# It is not recommended to use this property.
#
# Grants mutable access to the evaluations of the solutions.
# """
# return self._evdata
#
# @property
# def unsafe_values(self) -> Union[torch.Tensor, Iterable]:
# """
# It is not recommended to use this property.
#
# Grants mutable access to the decision values of the solutions.
# """
# return self._data
@torch.no_grad()
def access_evals(self, obj_index: Optional[int] = None) -> torch.Tensor:
"""
Get the internal mutable tensor storing the evaluations.
IMPORTANT: This method exposes the evaluation tensor of the batch
as it is, in mutable mode. It is therefore considered unsafe to rely
on this method. Before using this method, please consider using the
`evals` property for reading the evaluation results, and using the
`set_evals(...)` method which allows one to update the evaluations
without exposing any internal tensor.
When this method is used without any argument, the returned tensor
will be of shape `(n, m)`, where `n` is the number of solutions,
and `m` is the number of objectives plus the length of extra
evaluation data.
When this method is used with an integer argument specifying an
objective index, the returned tensor will be 1-dimensional
having a length of `n`, where `n` is the number of solutions.
In this case, the returned 1-dimensional tensor will be a view
upon the evaluation results of the specified objective.
The value `nan` (not-a-number) means not evaluated yet.
Args:
obj_index: None for getting the entire 2-dimensional evaluation
tensor; an objective index (as integer) for getting a
1-dimensional mutable slice of the evaluation tensor,
the slice being a view upon the evaluation results
regarding the specified objective.
Returns:
The mutable tensor storing the evaluation information.
"""
if obj_index is None:
return self._evdata
else:
return self._evdata[:, self._normalize_obj_index(obj_index)]
@torch.no_grad()
def access_values(self, *, keep_evals: bool = False) -> Union[torch.Tensor, ObjectArray]:
"""
Get the internal mutable tensor storing the decision values.
IMPORTANT: This method exposes the internal decision values tensor of
the batch as it is, in mutable mode. It is therefore considered unsafe
to rely on this method. Before using this method, please consider
using the `values` property for reading the decision values, and using
the `set_values(...)` method which allows one to update the decision
values without exposing any internal tensor.
IMPORTANT: The default assumption of this method is that the tensor
is requested for modification purposes. Therefore, by default, as soon
as this method is called, the evaluation results of the solutions will
be cleared (where clearing means that the evaluation results will be
filled with `NaN`s).
The reasoning behind this default behavior is to prevent the modified
solutions from having outdated evaluation results.
Args:
keep_evals: If set as False, the evaluation data of the solutions
will be cleared (i.e. will be filled with `NaN`s).
If set as True, the existing evaluation data will be kept.
Returns:
The mutable tensor storing the decision values.
"""
if not keep_evals:
self.forget_evals()
return self._data
@torch.no_grad()
def forget_evals(self, *, solutions: MaybeIndicesOrSlice = None):
"""
Forget the evaluations of the solutions.
The evaluation results will be cleared, which means that they will
be filled with `NaN`s.
"""
if solutions is None:
solutions = slice(None, None, None)
self._evdata[solutions, :] = float("nan")
@torch.no_grad()
def utility(
self,
obj_index: Optional[int] = None,
*,
ranking_method: Optional[str] = None,
check_nans: bool = True,
using_values_dtype: bool = False,
) -> torch.Tensor:
"""
Return numeric scores for each solution.
Utility scores are different from evaluation results,
in the sense that utilities monotonically increase from
bad solutions to good solutions, regardless of the
objective sense.
**If ranking method is passed as None:**
if the objective sense is 'max', the evaluation results are returned
as the utility scores; otherwise, if the objective sense is 'min',
the evaluation results multiplied by -1 are returned as the
utility scores.
**If the name of a ranking method is given** (e.g. 'centered'):
then the solutions are ranked (best solutions having the
highest rank), and those ranks are returned as the utility
scores.
**If an objective index is not provided:** (i.e. passed as None)
if the problem is multi-objective, the utility scores
for each objective are given, in a tensor shaped (n, m),
n being the number of solutions and m being the number
of objectives; otherwise, if the problem is single-objective,
the utility scores are given in a 1-dimensional
tensor of length n, n being the number of solutions.
**If an objective index is provided as an int:**
the utility scores are returned in a 1-dimensional tensor
of length n, n being the number of solutions.
Args:
obj_index: Expected as None, or as an integer.
In the single-objective case, None is equivalent to 0.
In the multi-objective case, None means "for each
objective".
ranking_method: If the utility scores are to be generated
according to a certain ranking method, pass here the name
of that ranking method as a str (e.g. 'centered').
check_nans: Check for nan (not-a-number) values in the
evaluation results, which is an indication of
unevaluated solutions.
using_values_dtype: If True, the utility values will be returned
using the dtype of the decision values.
If False, the utility values will be returned using the dtype
of the evaluation data.
The default is False.
Returns:
Utility scores, in a PyTorch tensor.
"""
if obj_index is not None:
obj_index = self._normalize_obj_index(obj_index)
evdata = self._evdata[:, obj_index]
if check_nans:
if torch.any(torch.isnan(evdata)):
raise ValueError(
"Cannot compute the utility values, because there are solutions which are not evaluated yet."
)
if ranking_method is None:
result = evdata * self._get_objective_sign(obj_index)
else:
result = rank(evdata, ranking_method=ranking_method, higher_is_better=self._descending[obj_index])
if using_values_dtype:
result = torch.as_tensor(result, dtype=self._data.dtype, device=self._data.device)
return result
else:
if self._num_objs == 1:
return self.utility(
0, ranking_method=ranking_method, check_nans=check_nans, using_values_dtype=using_values_dtype
)
else:
return torch.stack(
[
self.utility(
j,
ranking_method=ranking_method,
check_nans=check_nans,
using_values_dtype=using_values_dtype,
)
for j in range(self._num_objs)
],
).T
@torch.no_grad()
def utils(
self,
*,
ranking_method: Optional[str] = None,
check_nans: bool = True,
using_values_dtype: bool = False,
) -> torch.Tensor:
"""
Return numeric scores for each solution, and for each objective.
Utility scores are different from evaluation results,
in the sense that utilities monotonically increase from
bad solutions to good solutions, regardless of the
objective sense.
Unlike the method called `utility(...)`, this function returns
a 2-dimensional tensor even when the problem is single-objective.
The result of this method is always a 2-dimensional tensor of
shape `(n, m)`, `n` being the number of solutions, `m` being the
number of objectives.
Args:
ranking_method: If the utility scores are to be generated
according to a certain ranking method, pass here the name
of that ranking method as a str (e.g. 'centered').
check_nans: Check for nan (not-a-number) values in the
evaluation results, which is an indication of
unevaluated solutions.
using_values_dtype: If True, the utility values will be returned
using the dtype of the decision values.
If False, the utility values will be returned using the dtype
of the evaluation data.
The default is False.
Returns:
Utility scores, in a 2-dimensional PyTorch tensor.
"""
result = self.utility(
ranking_method=ranking_method, check_nans=check_nans, using_values_dtype=using_values_dtype
)
if result.ndim == 1:
result = result.view(len(result), 1)
return result
def split(self, num_pieces: Optional[int] = None, *, max_size: Optional[int] = None) -> "SolutionBatchPieces":
"""Split this SolutionBatch into a specified number of pieces,
or into an unspecified number of pieces where the maximum
size of each piece is specified.
Args:
num_pieces: Can be provided as an integer n, which means
that this SolutionBatch will be split into n pieces.
Alternatively, can be left as None if the user intends
to set max_size as an integer instead.
max_size: Can be provided as an integer n, which means
that this SolutionBatch will be split into multiple
pieces, each piece containing n solutions at most.
Alternatively, can be left as None if the user intends
to set num_pieces as an integer instead.
Returns:
A SolutionBatchPieces object, which behaves like a list of
SolutionBatch objects, each object in the list being a
slice view of this SolutionBatch object.
"""
return SolutionBatchPieces(self, num_pieces=num_pieces, max_size=max_size)
@torch.no_grad()
def concat(self, other: Union["SolutionBatch", Iterable]) -> "SolutionBatch":
"""Concatenate this SolutionBatch with the other(s).
In this context, concatenation means that the solutions of
this SolutionBatch and of the others are collected in one big
SolutionBatch object.
Args:
other: A SolutionBatch, or a sequence of SolutionBatch objects.
Returns:
A new SolutionBatch object which is the result of the
concatenation.
"""
if isinstance(other, SolutionBatch):
lst = [self, other]
else:
lst = [self]
lst.extend(list(other))
return SolutionBatch(merging_of=lst)
def take(self, indices: Iterable) -> "SolutionBatch":
"""Make a new SolutionBatch containing the specified solutions.
Args:
indices: A sequence of solution indices. These specified
solutions will make it to the newly made SolutionBatch.
Returns:
The new SolutionBatch.
"""
if is_sequence(indices):
return type(self)(slice_of=(self, indices))
else:
raise TypeError("Expected a sequence of solution indices, but got a `{type(indices)}`")
def take_best(self, n: int, *, obj_index: Optional[int] = None) -> "SolutionBatch":
"""Make a new SolutionBatch containing the best `n` solutions.
Args:
n: Number of solutions which will be taken.
obj_index: Objective index according to which the best ones
will be taken.
If `obj_index` is left as None and the problem is multi-
objective, then the solutions will be ranked according to
their fronts, and according to how crowded they are, and then
the topmost `n` solutions will be taken.
If `obj_index` is left as None and the problem is single-
objective, then that single objective will be taken as the
ranking criterion.
Returns:
The new SolutionBatch.
"""
if obj_index is None and self._num_objs >= 2:
fronts, _ = self.arg_pareto_sort(crowdsort=True, crowdsort_upto=n)
indices = torch.cat(fronts)[:n]
else:
indices = self.argsort(obj_index)[:n]
return type(self)(slice_of=(self, indices))
def __getitem__(self, i):
if isinstance(i, slice) or is_sequence(i) or isinstance(i, type(...)):
return type(self)(slice_of=(self, i))
else:
return Solution(parent=self, index=i)
def __copy__(self):
return deepcopy(self)
def clone(self, memo: Optional[dict] = None) -> "SolutionBatch":
"""
Get a deepcopy of the SolutionBatch.
Returns:
An identical deep copy of the original SolutionBatch.
"""
return deepcopy(self, memo=memo)
def __len__(self):
return int(self._data.shape[0])
def __iter__(self):
for i in range(len(self)):
yield self[i]
@property
def device(self) -> Device:
"""
The device in which the solutions are stored.
"""
return self._data.device
@property
def dtype(self) -> DType:
"""
The dtype of the decision values of the solutions.
This property exists as an alias for the property
`.values_dtype`.
"""
return self._data.dtype
@property
def values_dtype(self) -> DType:
"""
The dtype of the decision values of the solutions.
"""
return self._data.dtype
@property
def eval_dtype(self) -> DType:
"""
The dtype of the evaluation results and extra evaluation data
of the solutions.
"""
return self._evdata.dtype
@property
def values_shape(self) -> torch.Size:
"""
The shape of the batch's decision values tensor, as a tuple (n, l),
where `n` is the number of solutions, and `l` is the length
of a single solution.
If `dtype=None`, then there is no fixed length.
Therefore, the shape is returned as (n,).
"""
return self._data.shape
@property
def eval_shape(self) -> torch.Size:
"""
The shape of the batch's evaluation tensor, as a tuple (n, l),
where `n` is the number of solutions, and `l` is an integer
which is equal to the number of objectives plus the length of the
extra evaluation data, if any.
"""
return self._evdata.shape
@property
def solution_length(self) -> Optional[int]:
"""
Get the length of a solution, if this batch is numeric.
For non-numeric batches (i.e. batches with dtype=object),
`solution_length` is given as None.
"""
if self._data.ndim == 2:
return self._data.shape[1]
else:
return None
@property
def objective_sense(self) -> ObjectiveSense:
"""
Get the objective sense(s) of this batch's associated Problem.
If the problem is single-objective, then a single string is returned.
If the problem is multi-objective, then the objective senses will be
returned in a list.
The returned string in the single-objective case, or each returned
string in the multi-objective case, is "min" or "max".
"""
if len(self.senses) == 1:
return self.senses[0]
else:
return self.senses
@property
def senses(self) -> Iterable[str]:
"""
Objective sense(s) of this batch's associated Problem.
This is a list of strings, each string being "min" or "max".
"""
def desc_to_sense(desc: bool) -> str:
return "max" if desc else "min"
return [desc_to_sense(desc) for desc in self._descending]
@staticmethod
def cat(solution_batches: Iterable) -> "SolutionBatch":
"""
Concatenate multiple SolutionBatch instances into one.
Args:
solution_batches: An Iterable of SolutionBatch objects to
concatenate.
Returns:
The result of the concatenation, as a new SolutionBatch.
"""
first = None
rest = []
for i, batch in enumerate(solution_batches):
if not isinstance(batch, SolutionBatch):
raise TypeError(f"Expected a SolutionBatch but got {repr(batch)}")
if i == 0:
first = batch
else:
rest.append(batch)
return first.concat(rest)
device: Union[str, torch.device]
property
readonly
¶
The device in which the solutions are stored.
dtype: Union[str, torch.dtype, numpy.dtype, Type]
property
readonly
¶
The dtype of the decision values of the solutions.
This property exists as an alias for the property `.values_dtype`.
eval_dtype: Union[str, torch.dtype, numpy.dtype, Type]
property
readonly
¶
The dtype of the evaluation results and extra evaluation data of the solutions.
eval_shape: Size
property
readonly
¶
The shape of the batch's evaluation tensor, as a tuple (n, l),
where `n` is the number of solutions, and `l` is an integer
which is equal to the number of objectives plus the length of the
extra evaluation data, if any.
evals: Tensor
property
readonly
¶
Evaluation results of the solutions, in a ReadOnlyTensor
objective_sense: Union[str, Iterable[str]]
property
readonly
¶
Get the objective sense(s) of this batch's associated Problem.
If the problem is single-objective, then a single string is returned. If the problem is multi-objective, then the objective senses will be returned in a list.
The returned string in the single-objective case, or each returned string in the multi-objective case, is "min" or "max".
senses: Iterable[str]
property
readonly
¶
Objective sense(s) of this batch's associated Problem.
This is a list of strings, each string being "min" or "max".
solution_length: Optional[int]
property
readonly
¶
Get the length of a solution, if this batch is numeric.
For non-numeric batches (i.e. batches with dtype=object),
`solution_length` is given as None.
values: Union[torch.Tensor, Iterable]
property
readonly
¶
Decision values of the solutions, in a read-only tensor-like object
values_dtype: Union[str, torch.dtype, numpy.dtype, Type]
property
readonly
¶
The dtype of the decision values of the solutions.
values_shape: Size
property
readonly
¶
The shape of the batch's decision values tensor, as a tuple (n, l),
where `n` is the number of solutions, and `l` is the length
of a single solution.
If `dtype=None`, then there is no fixed length.
Therefore, the shape is returned as (n,).
access_evals(self, obj_index=None)
¶
Get the internal mutable tensor storing the evaluations.
IMPORTANT: This method exposes the evaluation tensor of the batch
as it is, in mutable mode. It is therefore considered unsafe to rely
on this method. Before using this method, please consider using the
`evals` property for reading the evaluation results, and using the
`set_evals(...)` method which allows one to update the evaluations
without exposing any internal tensor.
When this method is used without any argument, the returned tensor
will be of shape `(n, m)`, where `n` is the number of solutions,
and `m` is the number of objectives plus the length of extra
evaluation data.
When this method is used with an integer argument specifying an
objective index, the returned tensor will be 1-dimensional,
having a length of `n`, where `n` is the number of solutions.
In this case, the returned 1-dimensional tensor will be a view
upon the evaluation results of the specified objective.
The value `nan` (not-a-number) means not evaluated yet.
Parameters:
Name | Type | Description | Default
---|---|---|---
obj_index | Optional[int] | None for getting the entire 2-dimensional evaluation tensor; an objective index (as integer) for getting a 1-dimensional mutable slice of the evaluation tensor, the slice being a view upon the evaluation results regarding the specified objective. | None
Returns:
Type | Description
---|---
Tensor | The mutable tensor storing the evaluation information.
Source code in evotorch/core.py
@torch.no_grad()
def access_evals(self, obj_index: Optional[int] = None) -> torch.Tensor:
"""
Get the internal mutable tensor storing the evaluations.
IMPORTANT: This method exposes the evaluation tensor of the batch
as it is, in mutable mode. It is therefore considered unsafe to rely
on this method. Before using this method, please consider using the
`evals` property for reading the evaluation results, and using the
`set_evals(...)` method which allows one to update the evaluations
without exposing any internal tensor.
When this method is used without any argument, the returned tensor
will be of shape `(n, m)`, where `n` is the number of solutions,
and `m` is the number of objectives plus the length of extra
evaluation data.
When this method is used with an integer argument specifying an
objective index, the returned tensor will be 1-dimensional
having a length of `n`, where `n` is the number of solutions.
In this case, the returned 1-dimensional tensor will be a view
upon the evaluation results of the specified objective.
The value `nan` (not-a-number) means not evaluated yet.
Args:
obj_index: None for getting the entire 2-dimensional evaluation
tensor; an objective index (as integer) for getting a
1-dimensional mutable slice of the evaluation tensor,
the slice being a view upon the evaluation results
regarding the specified objective.
Returns:
The mutable tensor storing the evaluation information.
"""
if obj_index is None:
return self._evdata
else:
return self._evdata[:, self._normalize_obj_index(obj_index)]
access_values(self, *, keep_evals=False)
¶
Get the internal mutable tensor storing the decision values.
IMPORTANT: This method exposes the internal decision values tensor of
the batch as it is, in mutable mode. It is therefore considered unsafe
to rely on this method. Before using this method, please consider
using the `values` property for reading the decision values, and using
the `set_values(...)` method which allows one to update the decision
values without exposing any internal tensor.
IMPORTANT: The default assumption of this method is that the tensor
is requested for modification purposes. Therefore, by default, as soon
as this method is called, the evaluation results of the solutions will
be cleared (where clearing means that the evaluation results will be
filled with `NaN`s).
The reasoning behind this default behavior is to prevent the modified
solutions from having outdated evaluation results.
Parameters:
Name | Type | Description | Default
---|---|---|---
keep_evals | bool | If set as False, the evaluation data of the solutions will be cleared (i.e. will be filled with `NaN`s). If set as True, the existing evaluation data will be kept. | False
Returns:
Type | Description
---|---
Union[torch.Tensor, evotorch.tools.objectarray.ObjectArray] | The mutable tensor storing the decision values.
Source code in evotorch/core.py
@torch.no_grad()
def access_values(self, *, keep_evals: bool = False) -> Union[torch.Tensor, ObjectArray]:
"""
Get the internal mutable tensor storing the decision values.
IMPORTANT: This method exposes the internal decision values tensor of
the batch as it is, in mutable mode. It is therefore considered unsafe
to rely on this method. Before using this method, please consider
using the `values` property for reading the decision values, and using
the `set_values(...)` method which allows one to update the decision
values without exposing any internal tensor.
IMPORTANT: The default assumption of this method is that the tensor
is requested for modification purposes. Therefore, by default, as soon
as this method is called, the evaluation results of the solutions will
be cleared (where clearing means that the evaluation results will be
filled with `NaN`s).
The reasoning behind this default behavior is to prevent the modified
solutions from having outdated evaluation results.
Args:
keep_evals: If set as False, the evaluation data of the solutions
will be cleared (i.e. will be filled with `NaN`s).
If set as True, the existing evaluation data will be kept.
Returns:
The mutable tensor storing the decision values.
"""
if not keep_evals:
self.forget_evals()
return self._data
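To make the read-only versus mutable distinction above concrete, here is a minimal sketch. It is not part of the library documentation; the quickstart-style `Problem` constructor, the lambda objective, and all numeric settings are illustrative assumptions.
import torch
from evotorch import Problem

# Hypothetical single-objective problem, used only for illustration.
problem = Problem("min", lambda x: torch.sum(x ** 2), solution_length=4, initial_bounds=(-1.0, 1.0))
batch = problem.generate_batch(5)
problem.evaluate(batch)

read_only_values = batch.values  # read-only view of the decision values
read_only_evals = batch.evals    # read-only view of the evaluation results

mutable = batch.access_values()  # mutable tensor; evaluations are now cleared (NaN)
mutable += 0.01 * torch.randn_like(mutable)  # in-place change reflects on the batch
problem.evaluate(batch)          # re-evaluate, since the old results were forgotten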
arg_pareto_sort(self, crowdsort=True, crowdsort_upto=None)
¶
Pareto-sort the solutions in the batch.
The result is a namedtuple consisting of two elements:
`fronts` and `ranks`.
Let us assume that we have 5 solutions, and after a
pareto-sorting they ended up in this order:
front 0 (best front) : solution 1, solution 2
front 1 : solution 0, solution 4
front 2 (worst front): solution 3
Considering the example ordering above, the returned ParetoInfo instance looks like this:
ParetoInfo(
fronts=[[1, 2], [0, 4], [3]],
ranks=tensor([1, 0, 0, 2, 1])
)
where `fronts` stores the solution indices grouped by
pareto fronts; and `ranks` stores, as a tensor of int64,
the pareto rank for each solution (where 0 means best
rank).
Parameters:
Name | Type | Description | Default
---|---|---|---
crowdsort | bool | If given as True, each front in itself will be sorted from the least crowding solution to the most crowding solution. If given as False, there will be no crowd-sorting. | True
crowdsort_upto | Optional[int] | To be used with `crowdsort=True`. If given as an integer n, crowd-sorting will be done only in the fronts containing the first n solutions of the population. If given as None (and if `crowdsort=True`), crowd-sorting will be done for each front. | None
Returns:
Type | Description
---|---
ParetoInfo | A ParetoInfo instance
Source code in evotorch/core.py
@torch.no_grad()
def arg_pareto_sort(self, crowdsort: bool = True, crowdsort_upto: Optional[int] = None) -> ParetoInfo:
"""
Pareto-sort the solutions in the batch.
The result is a namedtuple consisting of two elements:
`fronts` and `ranks`.
Let us assume that we have 5 solutions, and after a
pareto-sorting they ended up in this order:
front 0 (best front) : solution 1, solution 2
front 1 : solution 0, solution 4
front 2 (worst front): solution 3
Considering the example ordering above, the returned
ParetoInfo instance looks like this:
ParetoInfo(
fronts=[[1, 2], [0, 4], [3]],
ranks=tensor([1, 0, 0, 2, 1])
)
where `fronts` stores the solution indices grouped by
pareto fronts; and `ranks` stores, as a tensor of int64,
the pareto rank for each solution (where 0 means best
rank).
Args:
crowdsort: If given as True, each front in itself
will be sorted from the least crowding solution
to the most crowding solution.
If given as False, there will be no crowd-sorting.
crowdsort_upto: To be used with `crowdsort=True`.
If given as an integer n, crowd-sorting will be done
only in the fronts containing the first n solutions
of the population.
If given as None (and if `crowdsort=True`),
crowd-sorting will be done for each front.
Returns:
A ParetoInfo instance
"""
if not NumbaLib.is_found:
NumbaLib.warn("arg_pareto_sort")
utils = self.utils()
if not crowdsort:
if crowdsort_upto is not None:
raise ValueError(
"With the argument `crowdsort` provided as False,"
" the argument `crowdsort_upto` was expected as None."
" However, `crowdsort_upto` was found to be something"
" other than None."
)
fronts, ranks = _pareto_sort(utils, False, 0)
else:
if crowdsort_upto is None:
crowdsort_upto = len(utils)
fronts, ranks = _pareto_sort(utils, crowdsort, crowdsort_upto)
return ParetoInfo(fronts=fronts, ranks=ranks)
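As a rough illustration of the above, here is a sketch, not taken from the library docs. It assumes the quickstart-style `Problem` constructor and that a multi-objective fitness function returns one value per objective; the `two_costs` function and all settings are hypothetical.
import torch
from evotorch import Problem

def two_costs(x: torch.Tensor) -> torch.Tensor:
    # Two conflicting objectives, both to be minimized.
    return torch.stack([torch.sum(x ** 2), torch.sum((x - 1.0) ** 2)])

problem = Problem(["min", "min"], two_costs, solution_length=3, initial_bounds=(-1.0, 1.0))
batch = problem.generate_batch(6)
problem.evaluate(batch)

pareto = batch.arg_pareto_sort()  # crowdsort=True by default
print(pareto.fronts)  # solution indices, grouped by front (best front first)
print(pareto.ranks)   # per-solution pareto rank; 0 means the best front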
argbest(self, obj_index=None)
¶
Return the best solution's index
Parameters:
Name | Type | Description | Default
---|---|---|---
obj_index | Optional[int] | The objective index. Can be passed as None if the problem is single-objective. Otherwise, expected as an int. | None
Returns:
Type | Description
---|---
Tensor | The index of the best solution.
Source code in evotorch/core.py
@torch.no_grad()
def argbest(self, obj_index: Optional[int] = None) -> torch.Tensor:
"""Return the best solution's index
Args:
obj_index: The objective index. Can be passed as None
if the problem is single-objective. Otherwise,
expected as an int.
Returns:
The index of the best solution.
"""
obj_index = self._optionally_get_obj_index(obj_index)
descending = self._descending[obj_index]
argf = torch.argmax if descending else torch.argmin
return argf(self._evdata[:, obj_index])
argsort(self, obj_index=None)
¶
Return the indices of solutions, sorted from best to worst.
Parameters:
Name | Type | Description | Default
---|---|---|---
obj_index | Optional[int] | The objective index. Can be passed as None if the problem is single-objective. Otherwise, expected as an int. | None
Returns:
Type | Description
---|---
Tensor | A PyTorch tensor, containing the solution indices, sorted from the best solution to the worst.
Source code in evotorch/core.py
@torch.no_grad()
def argsort(self, obj_index: Optional[int] = None) -> torch.Tensor:
"""Return the indices of solutions, sorted from best to worst.
Args:
obj_index: The objective index. Can be passed as None
if the problem is single-objective. Otherwise,
expected as an int.
Returns:
A PyTorch tensor, containing the solution indices,
sorted from the best solution to the worst.
"""
obj_index = self._optionally_get_obj_index(obj_index)
descending = self._descending[obj_index]
ev_col = self._evdata[:, obj_index]
return torch.argsort(ev_col, descending=descending)
argworst(self, obj_index=None)
¶
Return the worst solution's index
Parameters:
Name | Type | Description | Default
---|---|---|---
obj_index | Optional[int] | The objective index. Can be passed as None if the problem is single-objective. Otherwise, expected as an int. | None
Returns:
Type | Description
---|---
Tensor | The index of the worst solution.
Source code in evotorch/core.py
@torch.no_grad()
def argworst(self, obj_index: Optional[int] = None) -> torch.Tensor:
"""Return the worst solution's index
Args:
obj_index: The objective index. Can be passed as None
if the problem is single-objective. Otherwise,
expected as an int.
Returns:
The index of the worst solution.
"""
obj_index = self._optionally_get_obj_index(obj_index)
descending = self._descending[obj_index]
argf = torch.argmin if descending else torch.argmax
return argf(self._evdata[:, obj_index])
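For completeness, a small usage sketch of the three arg* methods above. The setup is hypothetical (quickstart-style `Problem` constructor, lambda objective), and the asserts are expected to hold barring exact ties in the evaluation results.
import torch
from evotorch import Problem

problem = Problem("min", lambda x: torch.sum(x ** 2), solution_length=4, initial_bounds=(-1.0, 1.0))
batch = problem.generate_batch(8)
problem.evaluate(batch)

i_best = batch.argbest()    # index of the solution with the lowest cost (sense is 'min')
i_worst = batch.argworst()  # index of the solution with the highest cost
order = batch.argsort()     # all indices, from best to worst
assert order[0] == i_best
assert order[-1] == i_worst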
cat(solution_batches)
staticmethod
¶
Concatenate multiple SolutionBatch instances into one.
Parameters:
Name | Type | Description | Default
---|---|---|---
solution_batches | Iterable | An Iterable of SolutionBatch objects to concatenate. | required
Returns:
Type | Description
---|---
SolutionBatch | The result of the concatenation, as a new SolutionBatch.
Source code in evotorch/core.py
@staticmethod
def cat(solution_batches: Iterable) -> "SolutionBatch":
"""
Concatenate multiple SolutionBatch instances into one.
Args:
solution_batches: An Iterable of SolutionBatch objects to
concatenate.
Returns:
The result of the concatenation, as a new SolutionBatch.
"""
first = None
rest = []
for i, batch in enumerate(solution_batches):
if not isinstance(batch, SolutionBatch):
raise TypeError(f"Expected a SolutionBatch but got {repr(batch)}")
if i == 0:
first = batch
else:
rest.append(batch)
return first.concat(rest)
clone(self, memo=None)
¶
Get a deepcopy of the SolutionBatch.
Returns:
Type | Description
---|---
SolutionBatch | An identical deep copy of the original SolutionBatch.
concat(self, other)
¶
Concatenate this SolutionBatch with the other(s).
In this context, concatenation means that the solutions of this SolutionBatch and of the others are collected in one big SolutionBatch object.
Parameters:
Name | Type | Description | Default
---|---|---|---
other | Union[SolutionBatch, Iterable] | A SolutionBatch, or a sequence of SolutionBatch objects. | required
Returns:
Type | Description
---|---
SolutionBatch | A new SolutionBatch object which is the result of the concatenation.
Source code in evotorch/core.py
@torch.no_grad()
def concat(self, other: Union["SolutionBatch", Iterable]) -> "SolutionBatch":
"""Concatenate this SolutionBatch with the other(s).
In this context, concatenation means that the solutions of
this SolutionBatch and of the others are collected in one big
SolutionBatch object.
Args:
other: A SolutionBatch, or a sequence of SolutionBatch objects.
Returns:
A new SolutionBatch object which is the result of the
concatenation.
"""
if isinstance(other, SolutionBatch):
lst = [self, other]
else:
lst = [self]
lst.extend(list(other))
return SolutionBatch(merging_of=lst)
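A short sketch of both concatenation entry points (the instance method above and the static `cat(...)`). The setup is hypothetical, assuming the quickstart-style `Problem` constructor.
import torch
from evotorch import Problem
from evotorch.core import SolutionBatch

problem = Problem("min", lambda x: torch.sum(x ** 2), solution_length=4, initial_bounds=(-1.0, 1.0))
batch_a = problem.generate_batch(4)
batch_b = problem.generate_batch(6)

merged = batch_a.concat(batch_b)                     # instance method: 4 + 6 = 10 solutions
also_merged = SolutionBatch.cat([batch_a, batch_b])  # static method, same result
assert len(merged) == len(also_merged) == 10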
forget_evals(self, *, solutions=None)
¶
Forget the evaluations of the solutions.
The evaluation results will be cleared, which means that they will
be filled with `NaN`s.
Source code in evotorch/core.py
@torch.no_grad()
def forget_evals(self, *, solutions: MaybeIndicesOrSlice = None):
"""
Forget the evaluations of the solutions.
The evaluation results will be cleared, which means that they will
be filled with `NaN`s.
"""
if solutions is None:
solutions = slice(None, None, None)
self._evdata[solutions, :] = float("nan")
set_evals(self, evals, eval_data=None, *, solutions=None)
¶
Set the evaluations of the solutions.
Parameters:
Name | Type | Description | Default
---|---|---|---
evals | Tensor | A numeric tensor which contains the evaluation results. Acceptable shapes are as follows: `(n,)` only to be used for single-objective problems, sets the evaluation results of the target `n` solutions, and clears (where clearing means to fill with NaN values) extra evaluation data (if the problem has allocations for such extra evaluation data); `(n,m)` where `m` is the number of objectives, sets the evaluation results of the target `n` solutions, and clears their extra evaluation data; `(n,m+q)` where `m` is the number of objectives and `q` is the length of extra evaluation data, sets the evaluation result and extra data of the target `n` solutions. | required
eval_data | Optional[torch.Tensor] | To be used only when the problem has extra evaluation data. Optionally, one can pass the extra evaluation data separately via this argument (instead of jointly through a single tensor via `evals`). The expected shape of this tensor is `(n,q)` where `n` is the number of solutions and `q` is the length of the extra evaluation data. | None
solutions | Union[int, Iterable[int], slice] | Optionally a list of integer indices or an instance of `slice(...)`, to be used if one wishes to set the evaluations of only some of the solutions. | None
Exceptions:
Type | Description
---|---
ValueError | if the given tensor has an incompatible shape.
Source code in evotorch/core.py
@torch.no_grad()
def set_evals(
self,
evals: torch.Tensor,
eval_data: Optional[torch.Tensor] = None,
*,
solutions: MaybeIndicesOrSlice = None,
):
"""
Set the evaluations of the solutions.
Args:
evals: A numeric tensor which contains the evaluation results.
Acceptable shapes are as follows:
`(n,)` only to be used for single-objective problems, sets
the evaluation results of the target `n` solutions, and clears
(where clearing means to fill with NaN values)
extra evaluation data (if the problem has allocations for such
extra evaluation data);
`(n,m)` where `m` is the number of objectives, sets the
evaluation results of the target `n` solutions, and clears
their extra evaluation data;
`(n,m+q)` where `m` is the number of objectives and `q` is the
length of extra evaluation data, sets the evaluation result
and extra data of the target `n` solutions.
eval_data: To be used only when the problem has extra evaluation
data. Optionally, one can pass the extra evaluation data
separately via this argument (instead of jointly through
a single tensor via `evals`).
The expected shape of this tensor is `(n,q)` where `n`
is the number of solutions and `q` is the length of the
extra evaluation data.
solutions: Optionally a list of integer indices or an instance
of `slice(...)`, to be used if one wishes to set the
evaluations of only some of the solutions.
Raises:
ValueError: if the given tensor has an incompatible shape.
"""
if solutions is None:
solutions = slice(None, None, None)
num_solutions = self._evdata.shape[0]
elif isinstance(solutions, slice):
num_solutions = self._evdata[solutions].shape[0]
elif is_sequence(solutions):
num_solutions = len(solutions)
total_eval_width = self._evdata.shape[1]
num_objs = self._num_objs
num_data = total_eval_width - num_objs
if evals.ndim == 1:
if num_objs != 1:
raise ValueError(
f"The method `set_evals(...)` was given a 1-dimensional tensor."
f" However, the number of objectives of the problem at hand is {num_objs}, not 1."
f" 1-dimensional evaluation tensors can only be accepted if the problem"
f" has one objective."
)
evals = evals.reshape(-1, 1)
elif evals.ndim == 2:
pass # nothing to do here
else:
if num_objs == 1:
raise ValueError(
f"The method `set_evals(...)` received a tensor with {evals.ndim} dimensions."
f" Since the problem at hand has only one objective,"
f" 1-dimensional or 2-dimensional tensors are acceptable, but not {evals.ndim}-dimensional ones."
)
else:
raise ValueError(
f"The method `set_evals(...)` received a tensor with {evals.ndim} dimensions."
f" Since the problem at hand has more than one objective (there are {num_objs} objectives),"
f" only 2-dimensional tensors are acceptable, not {evals.ndim}-dimensional ones."
)
[nrows, ncols] = evals.shape
if nrows != num_solutions:
raise ValueError(
f"Trying to set the evaluations of {num_solutions} solutions, but the given tensor has {nrows} rows."
)
if eval_data is not None:
if eval_data.ndim != 2:
raise ValueError(
f"The `eval_data` argument was expected as a 2-dimensional tensor."
f" However, the shape of the given `eval_data` is {eval_data.shape}."
)
if eval_data.shape[1] != num_data:
raise ValueError(
f"The `eval_data` argument was expected to have {num_data} columns."
f" However, the received `eval_data` has the shape: {eval_data.shape}."
)
if ncols != num_objs:
raise ValueError(
f"The method `set_evals(...)` was used with `evals` and `eval_data` arguments."
f" When both of these arguments are provided, `evals` is expected either as a 1-dimensional tensor"
f" (for single-objective cases only), or as a tensor of shape (n, m) where n is the number of"
f" solutions, and m is the number of objectives."
f" However, while the problem at hand has {num_objs} objectives,"
f" the `evals` tensor has {ncols} columns."
)
if evals.shape[0] != eval_data.shape[0]:
raise ValueError(f"The provided `evals` and `eval_data` tensors have incompatible shapes.")
self._evdata[solutions, :] = torch.hstack([evals, eval_data])
else:
if ncols == num_objs:
self._evdata[solutions, :num_objs] = evals
self._evdata[solutions, num_objs:] = float("nan")
elif ncols == total_eval_width:
self._evdata[solutions, :] = evals
else:
raise ValueError(
f"The method `set_evals(...)` received a tensor with {ncols} columns, which is incompatible."
f" Acceptable number of columns are: {num_objs}"
f" (for setting only the objective-associated evaluations and leave extra evaluation data as NaN), or"
f" {total_eval_width} (for setting both objective-associated evaluations and extra evaluation data)."
)
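To make the accepted shapes concrete, here is a minimal sketch. The values are hypothetical, and a single-objective problem without extra evaluation data is assumed, created with the quickstart-style `Problem` constructor.
import torch
from evotorch import Problem

problem = Problem("min", lambda x: torch.sum(x ** 2), solution_length=4, initial_bounds=(-1.0, 1.0))
batch = problem.generate_batch(3)

# Shape (n,): one evaluation per solution (single-objective problems only).
batch.set_evals(torch.tensor([0.5, 1.2, 0.3]))

# Shape (n, m): the same evaluations with an explicit objective axis (m = 1 here).
batch.set_evals(torch.tensor([[0.5], [1.2], [0.3]]))

# The `solutions` keyword restricts the update to a subset of the batch.
batch.set_evals(torch.tensor([9.9]), solutions=[1])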
set_values(self, values, *, solutions=None)
¶
Set the decision values of the solutions.
Parameters:
Name | Type | Description | Default
---|---|---|---
values | Any | New decision values. | required
solutions | Union[int, Iterable[int], slice] | Optionally a list of integer indices or an instance of `slice(...)`, to be used if one wishes to set the decision values of only some of the solutions. | None
Source code in evotorch/core.py
@torch.no_grad()
def set_values(self, values: Any, *, solutions: MaybeIndicesOrSlice = None):
"""
Set the decision values of the solutions.
Args:
values: New decision values.
solutions: Optionally a list of integer indices or an instance
of `slice(...)`, to be used if one wishes to set the
decision values of only some of the solutions.
"""
if solutions is None:
solutions = slice(None, None, None)
self._data[solutions] = values
self._evdata[solutions] = float("nan")
split(self, num_pieces=None, *, max_size=None)
¶
Split this SolutionBatch into a specified number of pieces, or into an unspecified number of pieces where the maximum size of each piece is specified.
Parameters:
Name | Type | Description | Default
---|---|---|---
num_pieces | Optional[int] | Can be provided as an integer n, which means that this SolutionBatch will be split into n pieces. Alternatively, can be left as None if the user intends to set max_size as an integer instead. | None
max_size | Optional[int] | Can be provided as an integer n, which means that this SolutionBatch will be split into multiple pieces, each piece containing n solutions at most. Alternatively, can be left as None if the user intends to set num_pieces as an integer instead. | None
Returns:
Type | Description
---|---
SolutionBatchPieces | A SolutionBatchPieces object, which behaves like a list of SolutionBatch objects, each object in the list being a slice view of this SolutionBatch object.
Source code in evotorch/core.py
def split(self, num_pieces: Optional[int] = None, *, max_size: Optional[int] = None) -> "SolutionBatchPieces":
"""Split this SolutionBatch into a specified number of pieces,
or into an unspecified number of pieces where the maximum
size of each piece is specified.
Args:
num_pieces: Can be provided as an integer n, which means
that this SolutionBatch will be split into n pieces.
Alternatively, can be left as None if the user intends
to set max_size as an integer instead.
max_size: Can be provided as an integer n, which means
that this SolutionBatch will be split into multiple
pieces, each piece containing n solutions at most.
Alternatively, can be left as None if the user intends
to set num_pieces as an integer instead.
Returns:
A SolutionBatchPieces object, which behaves like a list of
SolutionBatch objects, each object in the list being a
slice view of this SolutionBatch object.
"""
return SolutionBatchPieces(self, num_pieces=num_pieces, max_size=max_size)
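A usage sketch for both ways of splitting. The setup is hypothetical (quickstart-style `Problem` constructor, lambda objective); the piece sizes follow from the splitting logic shown above.
import torch
from evotorch import Problem

problem = Problem("min", lambda x: torch.sum(x ** 2), solution_length=4, initial_bounds=(-1.0, 1.0))
batch = problem.generate_batch(10)

halves = batch.split(2)           # 2 pieces of 5 solutions each
chunks = batch.split(max_size=4)  # as many pieces as needed, each with at most 4 solutions
print([len(piece) for piece in chunks])  # expected: [4, 4, 2]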
take(self, indices)
¶
Make a new SolutionBatch containing the specified solutions.
Parameters:
Name | Type | Description | Default
---|---|---|---
indices | Iterable | A sequence of solution indices. These specified solutions will make it to the newly made SolutionBatch. | required
Returns:
Type | Description
---|---
SolutionBatch | The new SolutionBatch.
Source code in evotorch/core.py
def take(self, indices: Iterable) -> "SolutionBatch":
"""Make a new SolutionBatch containing the specified solutions.
Args:
indices: A sequence of solution indices. These specified
solutions will make it to the newly made SolutionBatch.
Returns:
The new SolutionBatch.
"""
if is_sequence(indices):
return type(self)(slice_of=(self, indices))
else:
raise TypeError("Expected a sequence of solution indices, but got a `{type(indices)}`")
take_best(self, n, *, obj_index=None)
¶
Make a new SolutionBatch containing the best `n` solutions.
Parameters:
Name | Type | Description | Default
---|---|---|---
n | int | Number of solutions which will be taken. | required
obj_index | Optional[int] | Objective index according to which the best ones will be taken. If `obj_index` is left as None and the problem is multi-objective, then the solutions will be ranked according to their fronts, and according to how crowded they are, and then the topmost `n` solutions will be taken. If `obj_index` is left as None and the problem is single-objective, then that single objective will be taken as the ranking criterion. | None
Returns:
Type | Description
---|---
SolutionBatch | The new SolutionBatch.
Source code in evotorch/core.py
def take_best(self, n: int, *, obj_index: Optional[int] = None) -> "SolutionBatch":
"""Make a new SolutionBatch containing the best `n` solutions.
Args:
n: Number of solutions which will be taken.
obj_index: Objective index according to which the best ones
will be taken.
If `obj_index` is left as None and the problem is multi-
objective, then the solutions will be ranked according to
their fronts, and according to how crowded they are, and then
the topmost `n` solutions will be taken.
If `obj_index` is left as None and the problem is single-
objective, then that single objective will be taken as the
ranking criterion.
Returns:
The new SolutionBatch.
"""
if obj_index is None and self._num_objs >= 2:
fronts, _ = self.arg_pareto_sort(crowdsort=True, crowdsort_upto=n)
indices = torch.cat(fronts)[:n]
else:
indices = self.argsort(obj_index)[:n]
return type(self)(slice_of=(self, indices))
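A short sketch of `take(...)` and `take_best(...)` together, under a hypothetical single-objective setup (quickstart-style `Problem` constructor):
import torch
from evotorch import Problem

problem = Problem("min", lambda x: torch.sum(x ** 2), solution_length=4, initial_bounds=(-1.0, 1.0))
batch = problem.generate_batch(8)
problem.evaluate(batch)

top3 = batch.take_best(3)       # the 3 lowest-cost solutions, as a new batch
chosen = batch.take([0, 2, 4])  # explicit selection by solution index
assert len(top3) == 3 and len(chosen) == 3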
utility(self, obj_index=None, *, ranking_method=None, check_nans=True, using_values_dtype=False)
¶
Return numeric scores for each solution.
Utility scores are different from evaluation results, in the sense that utilities monotonically increase from bad solutions to good solutions, regardless of the objective sense.
If ranking method is passed as None: if the objective sense is 'max', the evaluation results are returned as the utility scores; otherwise, if the objective sense is 'min', the evaluation results multiplied by -1 are returned as the utility scores.
If the name of a ranking method is given (e.g. 'centered'): then the solutions are ranked (best solutions having the highest rank), and those ranks are returned as the utility scores.
If an objective index is not provided: (i.e. passed as None) if the problem is multi-objective, the utility scores for each objective are given, in a tensor shaped (n, m), n being the number of solutions and m being the number of objectives; otherwise, if the problem is single-objective, the utility scores are given in a 1-dimensional tensor of length n, n being the number of solutions.
If an objective index is provided as an int: the utility scores are returned in a 1-dimensional tensor of length n, n being the number of solutions.
Parameters:
Name | Type | Description | Default
---|---|---|---
obj_index | Optional[int] | Expected as None, or as an integer. In the single-objective case, None is equivalent to 0. In the multi-objective case, None means "for each objective". | None
ranking_method | Optional[str] | If the utility scores are to be generated according to a certain ranking method, pass here the name of that ranking method as a str (e.g. 'centered'). | None
check_nans | bool | Check for nan (not-a-number) values in the evaluation results, which is an indication of unevaluated solutions. | True
using_values_dtype | bool | If True, the utility values will be returned using the dtype of the decision values. If False, the utility values will be returned using the dtype of the evaluation data. The default is False. | False
Returns:
Type | Description
---|---
Tensor | Utility scores, in a PyTorch tensor.
Source code in evotorch/core.py
@torch.no_grad()
def utility(
self,
obj_index: Optional[int] = None,
*,
ranking_method: Optional[str] = None,
check_nans: bool = True,
using_values_dtype: bool = False,
) -> torch.Tensor:
"""
Return numeric scores for each solution.
Utility scores are different from evaluation results,
in the sense that utilities monotonically increase from
bad solutions to good solutions, regardless of the
objective sense.
**If ranking method is passed as None:**
if the objective sense is 'max', the evaluation results are returned
as the utility scores; otherwise, if the objective sense is 'min',
the evaluation results multiplied by -1 are returned as the
utility scores.
**If the name of a ranking method is given** (e.g. 'centered'):
then the solutions are ranked (best solutions having the
highest rank), and those ranks are returned as the utility
scores.
**If an objective index is not provided:** (i.e. passed as None)
if the problem is multi-objective, the utility scores
for each objective are given, in a tensor shaped (n, m),
n being the number of solutions and m being the number
of objectives; otherwise, if the problem is single-objective,
the utility scores are given in a 1-dimensional
tensor of length n, n being the number of solutions.
**If an objective index is provided as an int:**
the utility scores are returned in a 1-dimensional tensor
of length n, n being the number of solutions.
Args:
obj_index: Expected as None, or as an integer.
In the single-objective case, None is equivalent to 0.
In the multi-objective case, None means "for each
objective".
ranking_method: If the utility scores are to be generated
according to a certain ranking method, pass here the name
of that ranking method as a str (e.g. 'centered').
check_nans: Check for nan (not-a-number) values in the
evaluation results, which is an indication of
unevaluated solutions.
using_values_dtype: If True, the utility values will be returned
using the dtype of the decision values.
If False, the utility values will be returned using the dtype
of the evaluation data.
The default is False.
Returns:
Utility scores, in a PyTorch tensor.
"""
if obj_index is not None:
obj_index = self._normalize_obj_index(obj_index)
evdata = self._evdata[:, obj_index]
if check_nans:
if torch.any(torch.isnan(evdata)):
raise ValueError(
"Cannot compute the utility values, because there are solutions which are not evaluated yet."
)
if ranking_method is None:
result = evdata * self._get_objective_sign(obj_index)
else:
result = rank(evdata, ranking_method=ranking_method, higher_is_better=self._descending[obj_index])
if using_values_dtype:
result = torch.as_tensor(result, dtype=self._data.dtype, device=self._data.device)
return result
else:
if self._num_objs == 1:
return self.utility(
0, ranking_method=ranking_method, check_nans=check_nans, using_values_dtype=using_values_dtype
)
else:
return torch.stack(
[
self.utility(
j,
ranking_method=ranking_method,
check_nans=check_nans,
using_values_dtype=using_values_dtype,
)
for j in range(self._num_objs)
],
).T
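To illustrate the sign-flipping and ranking behaviors described above, here is a sketch under hypothetical settings (quickstart-style `Problem` constructor; the asserts are expected to hold barring exact ties):
import torch
from evotorch import Problem

problem = Problem("min", lambda x: torch.sum(x ** 2), solution_length=4, initial_bounds=(-1.0, 1.0))
batch = problem.generate_batch(5)
problem.evaluate(batch)

raw_utils = batch.utility()
# The sense is 'min', so raw utilities are the negated evaluation results:
assert torch.allclose(raw_utils, -batch.evals[:, 0])

ranked_utils = batch.utility(ranking_method="centered")
# Rank-based utilities: the best (lowest-cost) solution gets the highest score.
assert ranked_utils.argmax() == batch.argbest()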
utils(self, *, ranking_method=None, check_nans=True, using_values_dtype=False)
¶
Return numeric scores for each solution, and for each objective. Utility scores are different from evaluation results, in the sense that utilities monotonically increase from bad solutions to good solutions, regardless of the objective sense.
Unlike the method called `utility(...)`, this function returns
a 2-dimensional tensor even when the problem is single-objective.
The result of this method is always a 2-dimensional tensor of
shape `(n, m)`, `n` being the number of solutions, `m` being the
number of objectives.
Parameters:
Name | Type | Description | Default
---|---|---|---
ranking_method | Optional[str] | If the utility scores are to be generated according to a certain ranking method, pass here the name of that ranking method as a str (e.g. 'centered'). | None
check_nans | bool | Check for nan (not-a-number) values in the evaluation results, which is an indication of unevaluated solutions. | True
using_values_dtype | bool | If True, the utility values will be returned using the dtype of the decision values. If False, the utility values will be returned using the dtype of the evaluation data. The default is False. | False
Returns:
Type | Description
---|---
Tensor | Utility scores, in a 2-dimensional PyTorch tensor.
Source code in evotorch/core.py
@torch.no_grad()
def utils(
self,
*,
ranking_method: Optional[str] = None,
check_nans: bool = True,
using_values_dtype: bool = False,
) -> torch.Tensor:
"""
Return numeric scores for each solution, and for each objective.
Utility scores are different from evaluation results,
in the sense that utilities monotonically increase from
bad solutions to good solutions, regardless of the
objective sense.
Unlike the method called `utility(...)`, this function returns
a 2-dimensional tensor even when the problem is single-objective.
The result of this method is always a 2-dimensional tensor of
shape `(n, m)`, `n` being the number of solutions, `m` being the
number of objectives.
Args:
ranking_method: If the utility scores are to be generated
according to a certain ranking method, pass here the name
of that ranking method as a str (e.g. 'centered').
check_nans: Check for nan (not-a-number) values in the
evaluation results, which is an indication of
unevaluated solutions.
using_values_dtype: If True, the utility values will be returned
using the dtype of the decision values.
If False, the utility values will be returned using the dtype
of the evaluation data.
The default is False.
Returns:
Utility scores, in a 2-dimensional PyTorch tensor.
"""
result = self.utility(
ranking_method=ranking_method, check_nans=check_nans, using_values_dtype=using_values_dtype
)
if result.ndim == 1:
result = result.view(len(result), 1)
return result
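A brief contrast with `utility(...)`, again under a hypothetical single-objective setup:
import torch
from evotorch import Problem

problem = Problem("min", lambda x: torch.sum(x ** 2), solution_length=4, initial_bounds=(-1.0, 1.0))
batch = problem.generate_batch(5)
problem.evaluate(batch)

per_objective = batch.utils()  # always 2-dimensional, even for a single objective
print(per_objective.shape)     # expected: torch.Size([5, 1])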
SolutionBatchPieces (Sequence)
¶
A collection of SolutionBatch slice views.
An instance of this class behaves like a read-only collection of SolutionBatch objects (each being a sliced view of a bigger SolutionBatch).
Source code in evotorch/core.py
class SolutionBatchPieces(Sequence):
"""A collection of SolutionBatch slice views.
An instance of this class behaves like a read-only collection of
SolutionBatch objects (each being a sliced view of a bigger
SolutionBatch).
"""
@torch.no_grad()
def __init__(self, batch: SolutionBatch, *, num_pieces: Optional[int] = None, max_size: Optional[int] = None):
"""
`__init__(...)`: Initialize the SolutionBatchPieces.
Args:
batch: The SolutionBatch which will be split into
multiple SolutionBatch views.
Each view itself is a SolutionBatch object,
but not independent, meaning that any modification
done to a SolutionBatch view will reflect on this
main batch.
num_pieces: Can be provided as an integer n, which means
that the main SolutionBatch will be split into n pieces.
Alternatively, can be left as None if the user intends
to set max_size as an integer instead.
max_size: Can be provided as an integer n, which means
that the main SolutionBatch will be split into multiple
pieces, each piece containing n solutions at most.
Alternatively, can be left as None if the user intends
to set num_pieces as an integer instead.
"""
self._batch = batch
self._pieces: List[SolutionBatch] = []
self._piece_sizes: List[int] = []
self._piece_slices: List[Tuple[int, int]] = []
total_size = len(self._batch)
if max_size is None and num_pieces is not None:
num_pieces = int(num_pieces)
# divide to pieces
base_size = total_size // num_pieces
rest = total_size - (base_size * num_pieces)
self._piece_sizes = [base_size] * num_pieces
for i in range(rest):
self._piece_sizes[i] += 1
elif max_size is not None and num_pieces is None:
max_size = int(max_size)
# divide to pieces
num_pieces = math.ceil(total_size / max_size)
current_total = 0
for i in range(num_pieces):
if current_total + max_size > total_size:
self._piece_sizes.append(total_size - current_total)
else:
self._piece_sizes.append(max_size)
current_total += max_size
elif max_size is not None and num_pieces is not None:
raise ValueError("Expected either max_size or num_pieces, received both.")
elif max_size is None and num_pieces is None:
raise ValueError("Expected either max_size or num_pieces, received none.")
current_begin = 0
for size in self._piece_sizes:
current_end = current_begin + size
self._piece_slices.append((current_begin, current_end))
current_begin = current_end
for slice_begin, slice_end in self._piece_slices:
self._pieces.append(self._batch[slice_begin:slice_end])
def __len__(self) -> int:
return len(self._pieces)
def __getitem__(self, i: Union[int, slice]) -> SolutionBatch:
return self._pieces[i]
def iter_with_indices(self):
"""Iterate over each `(piece, (i_begin, i_end))`
where `piece` is a SolutionBatch view, `i_begin` is the beginning
index of the SolutionBatch view in the main batch, and `i_end` is the
ending index (exclusive) of the SolutionBatch view in the main batch.
"""
for i in range(len(self._pieces)):
yield self._pieces[i], self._piece_slices[i]
def indices_of(self, n) -> tuple:
"""Get `(i_begin, i_end)` for the n-th piece
(i.e. the n-th sliced view of the main SolutionBatch)
where `i_begin` is the beginning index of the n-th piece,
`i_end` is the (exclusive) ending index of the n-th piece.
Args:
n: Specifies the index of the queried SolutionBatch view.
Returns:
Beginning and ending indices of the SolutionBatch view,
in a tuple.
"""
return self._piece_slices[n]
@property
def batch(self) -> SolutionBatch:
"""Get the main SolutionBatch object, in its non-split form"""
return self._batch
def _to_string(self) -> str:
f = io.StringIO()
print(f"<{type(self).__name__}", file=f)
n = len(self._pieces)
for i, piece in enumerate(self._pieces):
print(f" {piece}", end="", file=f)
if (i + 1) == n:
print(file=f)
else:
print(",", file=f)
print(">", file=f)
f.seek(0)
return f.read()
def __str__(self) -> str:
return self._to_string()
def __repr__(self) -> str:
return self._to_string()
batch: SolutionBatch
property
readonly
¶
Get the main SolutionBatch object, in its non-split form
__init__(self, batch, *, num_pieces=None, max_size=None)
special
¶
`__init__(...)`: Initialize the SolutionBatchPieces.
Parameters:
Name | Type | Description | Default
---|---|---|---
batch | SolutionBatch | The SolutionBatch which will be split into multiple SolutionBatch views. Each view itself is a SolutionBatch object, but not independent, meaning that any modification done to a SolutionBatch view will reflect on this main batch. | required
num_pieces | Optional[int] | Can be provided as an integer n, which means that the main SolutionBatch will be split into n pieces. Alternatively, can be left as None if the user intends to set max_size as an integer instead. | None
max_size | Optional[int] | Can be provided as an integer n, which means that the main SolutionBatch will be split into multiple pieces, each piece containing n solutions at most. Alternatively, can be left as None if the user intends to set num_pieces as an integer instead. | None
Source code in evotorch/core.py
@torch.no_grad()
def __init__(self, batch: SolutionBatch, *, num_pieces: Optional[int] = None, max_size: Optional[int] = None):
"""
`__init__(...)`: Initialize the SolutionBatchPieces.
Args:
batch: The SolutionBatch which will be split into
multiple SolutionBatch views.
Each view itself is a SolutionBatch object,
but not independent, meaning that any modification
done to a SolutionBatch view will reflect on this
main batch.
num_pieces: Can be provided as an integer n, which means
that the main SolutionBatch will be split into n pieces.
Alternatively, can be left as None if the user intends
to set max_size as an integer instead.
max_size: Can be provided as an integer n, which means
that the main SolutionBatch will be split into multiple
pieces, each piece containing n solutions at most.
Alternatively, can be left as None if the user intends
to set num_pieces as an integer instead.
"""
self._batch = batch
self._pieces: List[SolutionBatch] = []
self._piece_sizes: List[int] = []
self._piece_slices: List[Tuple[int, int]] = []
total_size = len(self._batch)
if max_size is None and num_pieces is not None:
num_pieces = int(num_pieces)
# divide to pieces
base_size = total_size // num_pieces
rest = total_size - (base_size * num_pieces)
self._piece_sizes = [base_size] * num_pieces
for i in range(rest):
self._piece_sizes[i] += 1
elif max_size is not None and num_pieces is None:
max_size = int(max_size)
# divide to pieces
num_pieces = math.ceil(total_size / max_size)
current_total = 0
for i in range(num_pieces):
if current_total + max_size > total_size:
self._piece_sizes.append(total_size - current_total)
else:
self._piece_sizes.append(max_size)
current_total += max_size
elif max_size is not None and num_pieces is not None:
raise ValueError("Expected either max_size or num_pieces, received both.")
elif max_size is None and num_pieces is None:
raise ValueError("Expected either max_size or num_pieces, received none.")
current_begin = 0
for size in self._piece_sizes:
current_end = current_begin + size
self._piece_slices.append((current_begin, current_end))
current_begin = current_end
for slice_begin, slice_end in self._piece_slices:
self._pieces.append(self._batch[slice_begin:slice_end])
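Below is a minimal usage sketch; the toy problem, its dimensionality, and the population size are illustrative assumptions rather than part of the API above.

import torch
from evotorch import Problem
from evotorch.core import SolutionBatchPieces

# A hypothetical 5-dimensional minimization problem, used only for illustration.
problem = Problem("min", lambda x: torch.sum(x**2), solution_length=5, initial_bounds=(-1.0, 1.0))
batch = problem.generate_batch(100)

# Split the batch of 100 solutions into 4 views of 25 rows each.
pieces = SolutionBatchPieces(batch, num_pieces=4)  # equivalently: max_size=25
print(pieces.indices_of(1))  # (25, 50)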
indices_of(self, n)
¶
Get (i_begin, i_end) for the n-th piece (i.e. the n-th sliced view of the main SolutionBatch), where i_begin is the beginning index of the n-th piece and i_end is the (exclusive) ending index of the n-th piece.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n | | Specifies the index of the queried SolutionBatch view. | required |
Returns:
Type | Description |
---|---|
tuple | Beginning and ending indices of the SolutionBatch view, in a tuple. |
Source code in evotorch/core.py
def indices_of(self, n) -> tuple:
"""Get `(i_begin, i_end)` for the n-th piece
(i.e. the n-th sliced view of the main SolutionBatch)
where `i_begin` is the beginning index of the n-th piece,
`i_end` is the (exclusive) ending index of the n-th piece.
Args:
n: Specifies the index of the queried SolutionBatch view.
Returns:
Beginning and ending indices of the SolutionBatch view,
in a tuple.
"""
return self._piece_slices[n]
iter_with_indices(self)
¶
Iterate over each (piece, (i_begin, i_end)), where piece is a SolutionBatch view, i_begin is the beginning index of the SolutionBatch view in the main batch, and i_end is the (exclusive) ending index of the SolutionBatch view in the main batch.
Source code in evotorch/core.py
def iter_with_indices(self):
"""Iterate over each `(piece, (i_begin, i_end))`
where `piece` is a SolutionBatch view, `i_begin` is the beginning
index of the SolutionBatch view in the main batch, and `i_end` is the
ending index (exclusive) of the SolutionBatch view in the main batch.
"""
for i in range(len(self._pieces)):
yield self._pieces[i], self._piece_slices[i]
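Continuing the sketch above, each view can be visited together with its index range in the main batch:

for piece, (i_begin, i_end) in pieces.iter_with_indices():
    print(f"rows {i_begin}:{i_end} -> a view of {len(piece)} solutions")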
distributions
¶
Distribution (TensorMakerMixin)
¶
Base class for any search distribution.
Source code in evotorch/distributions.py
class Distribution(TensorMakerMixin):
"""
Base class for any search distribution.
"""
MANDATORY_PARAMETERS = set()
OPTIONAL_PARAMETERS = set()
def __init__(
self, *, solution_length: int, parameters: dict, dtype: Optional[DType] = None, device: Optional[Device] = None
):
"""
`__init__(...)`: Initialize the Distribution.
It is expected that one of these two conditions is met:
(i) the inheriting search distribution class does not implement its
own `__init__(...)` method; or
(ii) the inheriting search distribution class has its own
`__init__(...)` method, and calls `Distribution.__init__(...)`
from there, during its initialization phase.
Args:
solution_length: Expected as an integer, this argument represents
the solution length.
parameters: Expected as a dictionary, this argument stores
the parameters of the search distribution.
For example, for a Gaussian distribution where `mu`
represents the mean, and `sigma` represents the coverage
area, this dictionary would have the keys "mu" and "sigma",
and each of these keys would map to a PyTorch tensor.
dtype: The dtype of the search distribution (e.g. torch.float32).
device: The device of the search distribution (e.g. "cpu").
"""
self.__solution_length: int = int(solution_length)
self.__parameters: dict
self.__dtype: torch.dtype
self.__device: torch.device
self.__check_correctness(parameters)
cast_kwargs = {}
if dtype is not None:
cast_kwargs["dtype"] = to_torch_dtype(dtype)
if device is not None:
cast_kwargs["device"] = torch.device(device)
if len(cast_kwargs) == 0:
self.__parameters = copy(parameters)
else:
self.__parameters = cast_tensors_in_container(parameters, **cast_kwargs)
self.__dtype = cast_kwargs.get("dtype", dtype_of_container(parameters))
self.__device = cast_kwargs.get("device", device_of_container(parameters))
def __check_correctness(self, parameters: dict):
found_mandatory = 0
for param_name in parameters.keys():
if param_name in self.MANDATORY_PARAMETERS:
found_mandatory += 1
elif param_name in self.OPTIONAL_PARAMETERS:
pass # nothing to do
else:
raise ValueError(f"Unrecognized parameter: {repr(param_name)}")
if found_mandatory < len(self.MANDATORY_PARAMETERS):
raise ValueError(
f"Not all mandatory parameters of this Distribution were specified."
f" Mandatory parameters of this distribution: {self.MANDATORY_PARAMETERS};"
f" optional parameters of this distribution: {self.OPTIONAL_PARAMETERS};"
f" encountered parameters: {set(parameters.keys())}."
)
def to(self, device: Device) -> "Distribution":
"""
Bring the Distribution onto a computational device.
If the given device is already the device of this Distribution,
then the Distribution itself will be returned.
If the given device is different than the device of this
Distribution, a copy of this Distribution on the given device
will be created and returned.
Args:
device: The computation device onto which the Distribution
will be brought.
Returns:
The Distribution on the target device.
"""
if torch.device(self.device) == torch.device(device):
return self
else:
cls = self.__class__
return cls(solution_length=self.solution_length, parameters=self.parameters, device=device)
def _fill(self, out: torch.Tensor, *, generator: Optional[torch.Generator] = None):
"""
Fill the given tensor with samples from this search distribution.
It is expected that the inheriting search distribution class
has its own implementation for this method.
Args:
out: The PyTorch tensor that will be filled with the samples.
This tensor is expected as 2-dimensional with its number
of columns equal to the solution length declared by this
distribution.
generator: Optionally a PyTorch generator, to be used for
sampling. None means that the global generator of PyTorch
is to be used.
"""
raise NotImplementedError
def sample(
self,
num_solutions: Optional[int] = None,
*,
out: Optional[torch.Tensor] = None,
generator: Any = None,
) -> torch.Tensor:
"""
Sample solutions from this search distribution.
Args:
num_solutions: How many solutions will be sampled.
If this argument is given as an integer and the argument
`out` is left as None, then a new PyTorch tensor, filled
with the samples from this distribution, will be generated
and returned. The number of rows of this new tensor will
be equal to the given `num_solutions`.
If the argument `num_solutions` is provided as an integer,
then the argument `out` is expected as None.
out: The PyTorch tensor that will be filled with the samples
of this distribution. This tensor is expected as a
2-dimensional tensor with its number of columns equal to
the solution length declared by this distribution.
If the argument `out` is provided as a tensor, then the
argument `num_solutions` is expected as None.
generator: Optionally a PyTorch generator or any object which
has a `generator` attribute (e.g. a Problem instance).
If left as None, the global generator of PyTorch will be
used.
Returns:
A 2-dimensional PyTorch tensor which stores the sampled solutions.
"""
if (num_solutions is not None) and (out is not None):
raise ValueError(
"Received both `num_solutions` and `out` with values other than None."
"Please provide only one of these arguments with a value other than None, not both of them."
)
elif (num_solutions is not None) and (out is None):
num_solutions = int(num_solutions)
out = self.make_empty(num_solutions=num_solutions)
elif (num_solutions is None) and (out is not None):
if out.ndim != 2:
raise ValueError(
f"The `sample(...)` method can fill only 2-dimensional tensors."
f" However, the provided `out` tensor has {out.ndim} dimensions, its shape being {out.shape}."
)
_, num_cols = out.shape
if num_cols != self.solution_length:
raise ValueError(
f"The solution length declared by this distribution is {self.solution_length}."
f" However, the provided `out` tensor has {num_cols} columns."
f" The `sample(...)` method can only work with tensors whose number of columns are equal"
f" to the declared solution length."
)
else:
raise ValueError(
"Received both `num_solutions` and `out` as None."
"Please provide one of these arguments with a value other than None."
)
self._fill(out, generator=generator)
return out
def _compute_gradients(self, samples: torch.Tensor, weights: torch.Tensor, ranking_used: Optional[str]) -> dict:
"""
Compute the gradients out of the samples (sampled solutions)
and weights (i.e. weights or ranks of the solutions, better
solutions having numerically higher weights).
It is expected that the inheriting class implements this method.
Args:
samples: The sampled solutions, as a 2-dimensional tensor.
weights: Solution weights, as a 1-dimensional tensor of length
`n`, `n` being the number of sampled solutions.
ranking_used: Ranking that was used to obtain the weights.
Returns:
The gradient(s) in a dictionary.
"""
raise NotImplementedError
def compute_gradients(
self,
samples: torch.Tensor,
fitnesses: torch.Tensor,
*,
objective_sense: str,
ranking_method: Optional[str] = None,
) -> dict:
"""
Compute and return gradients.
Args:
samples: The solutions that were sampled from this Distribution.
The tensor passed via this argument is expected to have
the same dtype and device with this Distribution.
fitnesses: The evaluation results of the sampled solutions.
If fitnesses are given with a different dtype (maybe because
the eval_dtype of the Problem object is different than its
decision variable dtype), then this method will first
create an internal copy of the fitnesses with the correct
dtype, and then will use those copied fitnesses for
computing the gradients.
objective_sense: The objective sense, expected as "min" or "max".
In the case of "min", lower fitness values will be regarded
as better (therefore, in this case, one can alternatively
refer to fitnesses as 'unfitnesses' or 'solution costs').
In the case of "max", higher fitness values will be regarded
as better.
ranking_method: The ranking method to be used.
Can be: "linear" (where ranks linearly go from 0 to 1);
"centered" (where ranks linearly go from -0.5 to +0.5);
"normalized" (where the standard-normalized fitnesses
serve as ranks); or "raw" (where the fitnesses themselves
serve as ranks).
The default is "raw".
Returns:
A dictionary which contains the gradient for each parameter of the
distribution.
"""
if objective_sense == "max":
higher_is_better = True
elif objective_sense == "min":
higher_is_better = False
else:
raise ValueError(
f'`objective_sense` was expected as "min" or as "max".'
f" However, it was encountered as {repr(objective_sense)}."
)
if ranking_method is None:
ranking_method = "raw"
# Make sure that the fitnesses are in the correct dtype
fitnesses = torch.as_tensor(fitnesses, dtype=self.dtype)
[num_samples, _] = samples.shape
[num_fitnesses] = fitnesses.shape
if num_samples != num_fitnesses:
raise ValueError(
f"The number of samples and the number of fitnesses do not match:" f" {num_samples} != {num_fitnesses}."
)
weights = rank(fitnesses, ranking_method=ranking_method, higher_is_better=higher_is_better)
return self._compute_gradients(samples, weights, ranking_method)
def update_parameters(
self,
gradients: dict,
*,
learning_rates: Optional[dict] = None,
optimizers: Optional[dict] = None,
) -> "Distribution":
"""
Do an update on the distribution by following the given gradients.
It is expected that the inheriting class has its own implementation
for this method.
Args:
gradients: Gradients, as a dictionary, which will be used for
computing the necessary updates.
learning_rates: A dictionary which contains learning rates
for parameters that will be updated using a learning rate
coefficient.
optimizers: A dictionary which contains optimizer objects
for parameters that will be updated using an adaptive
optimizer.
Returns:
The updated copy of the distribution.
"""
raise NotImplementedError
def modified_copy(
self, *, dtype: Optional[DType] = None, device: Optional[Device] = None, **parameters
) -> "Distribution":
"""
Return a modified copy of this distribution.
Args:
dtype: The new dtype of the distribution.
device: The new device of the distribution.
parameters: Expected in the form of extra keyword arguments.
Each of these keyword arguments will cause the new distribution
to have a modified value for the specified parameter.
Returns:
The modified copy of the distribution.
"""
cls = self.__class__
if device is None:
device = self.device
if dtype is None:
dtype = self.dtype
new_parameters = copy(self.parameters)
new_parameters.update(parameters)
return cls(parameters=new_parameters, dtype=dtype, device=device)
def relative_entropy(dist_0: "Distribution", dist_1: "Distribution") -> float:
raise NotImplementedError
@property
def solution_length(self) -> int:
return self.__solution_length
@property
def device(self) -> torch.device:
return self.__device
@property
def dtype(self) -> torch.dtype:
return self.__dtype
@property
def parameters(self) -> dict:
return self.__parameters
def _follow_gradient(
self,
param_name: str,
x: torch.Tensor,
*,
learning_rates: Optional[dict] = None,
optimizers: Optional[dict] = None,
) -> torch.Tensor:
x = torch.as_tensor(x, dtype=self.dtype, device=self.device)
learning_rate, optimizer = self._get_learning_rate_and_optimizer(param_name, learning_rates, optimizers)
if (learning_rate is None) and (optimizer is None):
return x
elif (learning_rate is not None) and (optimizer is None):
return learning_rate * x
elif (learning_rate is None) and (optimizer is not None):
return optimizer.ascent(x)
else:
raise ValueError(
"Encountered both `learning_rate` and `optimizer` as values other than None."
" This method can only work if both of them are None or only one of them is not None."
)
@staticmethod
def _get_learning_rate_and_optimizer(
param_name: str, learning_rates: Optional[dict], optimizers: Optional[dict]
) -> tuple:
if learning_rates is None:
learning_rates = {}
if optimizers is None:
optimizers = {}
return learning_rates.get(param_name, None), optimizers.get(param_name, None)
__init__(self, *, solution_length, parameters, dtype=None, device=None)
special
¶
__init__(...)
: Initialize the Distribution.
It is expected that one of these two conditions is met:
(i) the inheriting search distribution class does not implement its
own __init__(...)
method; or
(ii) the inheriting search distribution class has its own
__init__(...)
method, and calls Distribution.__init__(...)
from there, during its initialization phase.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
solution_length | int | Expected as an integer, this argument represents the solution length. | required |
parameters | dict | Expected as a dictionary, this argument stores the parameters of the search distribution. For example, for a Gaussian distribution where `mu` represents the mean, and `sigma` represents the coverage area, this dictionary would have the keys "mu" and "sigma", and each of these keys would map to a PyTorch tensor. | required |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | The dtype of the search distribution (e.g. torch.float32). | None |
device | Union[str, torch.device] | The device of the search distribution (e.g. "cpu"). | None |
Source code in evotorch/distributions.py
def __init__(
self, *, solution_length: int, parameters: dict, dtype: Optional[DType] = None, device: Optional[Device] = None
):
"""
`__init__(...)`: Initialize the Distribution.
It is expected that one of these two conditions is met:
(i) the inheriting search distribution class does not implement its
own `__init__(...)` method; or
(ii) the inheriting search distribution class has its own
`__init__(...)` method, and calls `Distribution.__init__(...)`
from there, during its initialization phase.
Args:
solution_length: Expected as an integer, this argument represents
the solution length.
parameters: Expected as a dictionary, this argument stores
the parameters of the search distribution.
For example, for a Gaussian distribution where `mu`
represents the mean, and `sigma` represents the coverage
area, this dictionary would have the keys "mu" and "sigma",
and each of these keys would map to a PyTorch tensor.
dtype: The dtype of the search distribution (e.g. torch.float32).
device: The device of the search distribution (e.g. "cpu").
"""
self.__solution_length: int = int(solution_length)
self.__parameters: dict
self.__dtype: torch.dtype
self.__device: torch.device
self.__check_correctness(parameters)
cast_kwargs = {}
if dtype is not None:
cast_kwargs["dtype"] = to_torch_dtype(dtype)
if device is not None:
cast_kwargs["device"] = torch.device(device)
if len(cast_kwargs) == 0:
self.__parameters = copy(parameters)
else:
self.__parameters = cast_tensors_in_container(parameters, **cast_kwargs)
self.__dtype = cast_kwargs.get("dtype", dtype_of_container(parameters))
self.__device = cast_kwargs.get("device", device_of_container(parameters))
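The subclassing contract described above can be shown with a short sketch; `PointMass` below is a hypothetical toy distribution, not something provided by evotorch:

import torch
from typing import Optional
from evotorch.distributions import Distribution

class PointMass(Distribution):
    """A toy distribution whose every 'sample' is its single point `mu`."""

    MANDATORY_PARAMETERS = {"mu"}
    OPTIONAL_PARAMETERS = set()

    def _fill(self, out: torch.Tensor, *, generator: Optional[torch.Generator] = None):
        # Broadcast the stored point into every row of the output tensor.
        out[:] = self.parameters["mu"]

# No custom __init__ is defined, so condition (i) above applies.
pm = PointMass(solution_length=5, parameters={"mu": torch.zeros(5)})
print(pm.sample(3))  # three identical rows of zeros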
compute_gradients(self, samples, fitnesses, *, objective_sense, ranking_method=None)
¶
Compute and return gradients.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
samples | Tensor | The solutions that were sampled from this Distribution. The tensor passed via this argument is expected to have the same dtype and device with this Distribution. | required |
fitnesses | Tensor | The evaluation results of the sampled solutions. If fitnesses are given with a different dtype (maybe because the eval_dtype of the Problem object is different than its decision variable dtype), then this method will first create an internal copy of the fitnesses with the correct dtype, and then will use those copied fitnesses for computing the gradients. | required |
objective_sense | str | The objective sense, expected as "min" or "max". In the case of "min", lower fitness values will be regarded as better (therefore, in this case, one can alternatively refer to fitnesses as 'unfitnesses' or 'solution costs'). In the case of "max", higher fitness values will be regarded as better. | required |
ranking_method | Optional[str] | The ranking method to be used. Can be: "linear" (where ranks linearly go from 0 to 1); "centered" (where ranks linearly go from -0.5 to +0.5); "normalized" (where the standard-normalized fitnesses serve as ranks); or "raw" (where the fitnesses themselves serve as ranks). The default is "raw". | None |
Returns:
Type | Description |
---|---|
dict | A dictionary which contains the gradient for each parameter of the distribution. |
Source code in evotorch/distributions.py
def compute_gradients(
self,
samples: torch.Tensor,
fitnesses: torch.Tensor,
*,
objective_sense: str,
ranking_method: Optional[str] = None,
) -> dict:
"""
Compute and return gradients.
Args:
samples: The solutions that were sampled from this Distribution.
The tensor passed via this argument is expected to have
the same dtype and device with this Distribution.
fitnesses: The evaluation results of the sampled solutions.
If fitnesses are given with a different dtype (maybe because
the eval_dtype of the Problem object is different than its
decision variable dtype), then this method will first
create an internal copy of the fitnesses with the correct
dtype, and then will use those copied fitnesses for
computing the gradients.
objective_sense: The objective sense, expected as "min" or "max".
In the case of "min", lower fitness values will be regarded
as better (therefore, in this case, one can alternatively
refer to fitnesses as 'unfitnesses' or 'solution costs').
In the case of "max", higher fitness values will be regarded
as better.
ranking_method: The ranking method to be used.
Can be: "linear" (where ranks linearly go from 0 to 1);
"centered" (where ranks linearly go from -0.5 to +0.5);
"normalized" (where the standard-normalized fitnesses
serve as ranks); or "raw" (where the fitnesses themselves
serve as ranks).
The default is "raw".
Returns:
A dictionary which contains the gradient for each parameter of the
distribution.
"""
if objective_sense == "max":
higher_is_better = True
elif objective_sense == "min":
higher_is_better = False
else:
raise ValueError(
f'`objective_sense` was expected as "min" or as "max".'
f" However, it was encountered as {repr(objective_sense)}."
)
if ranking_method is None:
ranking_method = "raw"
# Make sure that the fitnesses are in the correct dtype
fitnesses = torch.as_tensor(fitnesses, dtype=self.dtype)
[num_samples, _] = samples.shape
[num_fitnesses] = fitnesses.shape
if num_samples != num_fitnesses:
raise ValueError(
f"The number of samples and the number of fitnesses do not match:" f" {num_samples} != {num_fitnesses}."
)
weights = rank(fitnesses, ranking_method=ranking_method, higher_is_better=higher_is_better)
return self._compute_gradients(samples, weights, ranking_method)
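Together with `sample(...)` and `update_parameters(...)`, this method enables the usual ask-and-tell loop. A minimal sketch, assuming the `SeparableGaussian` subclass documented later in this module, a toy sphere cost, and illustrative learning rates:

import torch
from evotorch.distributions import SeparableGaussian

dist = SeparableGaussian(parameters={"mu": torch.zeros(10), "sigma": torch.ones(10)})
for _ in range(100):
    samples = dist.sample(50)              # a 50 x 10 tensor of candidate solutions
    costs = torch.sum(samples**2, dim=-1)  # sphere cost: lower is better
    grads = dist.compute_gradients(
        samples, costs, objective_sense="min", ranking_method="centered"
    )
    dist = dist.update_parameters(grads, learning_rates={"mu": 0.5, "sigma": 0.1})
print(dist.parameters["mu"])  # should drift toward the optimum at the origin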
modified_copy(self, *, dtype=None, device=None, **parameters)
¶
Return a modified copy of this distribution.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dtype | Union[str, torch.dtype, numpy.dtype, Type] | The new dtype of the distribution. | None |
device | Union[str, torch.device] | The new device of the distribution. | None |
parameters | | Expected in the form of extra keyword arguments. Each of these keyword arguments will cause the new distribution to have a modified value for the specified parameter. | {} |
Returns:
Type | Description |
---|---|
Distribution | The modified copy of the distribution. |
Source code in evotorch/distributions.py
def modified_copy(
self, *, dtype: Optional[DType] = None, device: Optional[Device] = None, **parameters
) -> "Distribution":
"""
Return a modified copy of this distribution.
Args:
dtype: The new dtype of the distribution.
device: The new device of the distribution.
parameters: Expected in the form of extra keyword arguments.
Each of these keyword arguments will cause the new distribution
to have a modified value for the specified parameter.
Returns:
The modified copy of the distribution.
"""
cls = self.__class__
if device is None:
device = self.device
if dtype is None:
dtype = self.dtype
new_parameters = copy(self.parameters)
new_parameters.update(parameters)
return cls(parameters=new_parameters, dtype=dtype, device=device)
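Continuing the sketch above, a derived copy can be created without mutating the original; the shift and the dtype are illustrative:

shifted = dist.modified_copy(mu=dist.parameters["mu"] + 1.0, dtype="float64")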
sample(self, num_solutions=None, *, out=None, generator=None)
¶
Sample solutions from this search distribution.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_solutions | Optional[int] | How many solutions will be sampled. If this argument is given as an integer and the argument `out` is left as None, then a new PyTorch tensor, filled with the samples from this distribution, will be generated and returned. The number of rows of this new tensor will be equal to the given num_solutions. If the argument num_solutions is provided as an integer, then the argument `out` is expected as None. | None |
out | Optional[torch.Tensor] | The PyTorch tensor that will be filled with the samples of this distribution. This tensor is expected as a 2-dimensional tensor with its number of columns equal to the solution length declared by this distribution. If the argument `out` is provided as a tensor, then the argument num_solutions is expected as None. | None |
generator | Any | Optionally a PyTorch generator or any object which has a `generator` attribute (e.g. a Problem instance). If left as None, the global generator of PyTorch will be used. | None |
Returns:
Type | Description |
---|---|
Tensor | A 2-dimensional PyTorch tensor which stores the sampled solutions. |
Source code in evotorch/distributions.py
def sample(
self,
num_solutions: Optional[int] = None,
*,
out: Optional[torch.Tensor] = None,
generator: Any = None,
) -> torch.Tensor:
"""
Sample solutions from this search distribution.
Args:
num_solutions: How many solutions will be sampled.
If this argument is given as an integer and the argument
`out` is left as None, then a new PyTorch tensor, filled
with the samples from this distribution, will be generated
and returned. The number of rows of this new tensor will
be equal to the given `num_solutions`.
If the argument `num_solutions` is provided as an integer,
then the argument `out` is expected as None.
out: The PyTorch tensor that will be filled with the samples
of this distribution. This tensor is expected as a
2-dimensional tensor with its number of columns equal to
the solution length declared by this distribution.
If the argument `out` is provided as a tensor, then the
argument `num_solutions` is expected as None.
generator: Optionally a PyTorch generator or any object which
has a `generator` attribute (e.g. a Problem instance).
If left as None, the global generator of PyTorch will be
used.
Returns:
A 2-dimensional PyTorch tensor which stores the sampled solutions.
"""
if (num_solutions is not None) and (out is not None):
raise ValueError(
"Received both `num_solutions` and `out` with values other than None."
"Please provide only one of these arguments with a value other than None, not both of them."
)
elif (num_solutions is not None) and (out is None):
num_solutions = int(num_solutions)
out = self.make_empty(num_solutions=num_solutions)
elif (num_solutions is None) and (out is not None):
if out.ndim != 2:
raise ValueError(
f"The `sample(...)` method can fill only 2-dimensional tensors."
f" However, the provided `out` tensor has {out.ndim} dimensions, its shape being {out.shape}."
)
_, num_cols = out.shape
if num_cols != self.solution_length:
raise ValueError(
f"The solution length declared by this distribution is {self.solution_length}."
f" However, the provided `out` tensor has {num_cols} columns."
f" The `sample(...)` method can only work with tensors whose number of columns are equal"
f" to the declared solution length."
)
else:
raise ValueError(
"Received both `num_solutions` and `out` as None."
"Please provide one of these arguments with a value other than None."
)
self._fill(out, generator=generator)
return out
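Continuing the sketch above, a pre-allocated buffer can be filled in place instead of allocating a fresh tensor on each call:

buffer = torch.empty(50, 10)
dist.sample(out=buffer)  # fills `buffer`; also passing num_solutions would raise an error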
to(self, device)
¶
Bring the Distribution onto a computational device.
If the given device is already the device of this Distribution, then the Distribution itself will be returned. If the given device is different than the device of this Distribution, a copy of this Distribution on the given device will be created and returned.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
device | Union[str, torch.device] | The computation device onto which the Distribution will be brought. | required |
Returns:
Type | Description |
---|---|
Distribution | The Distribution on the target device. |
Source code in evotorch/distributions.py
def to(self, device: Device) -> "Distribution":
"""
Bring the Distribution onto a computational device.
If the given device is already the device of this Distribution,
then the Distribution itself will be returned.
If the given device is different than the device of this
Distribution, a copy of this Distribution on the given device
will be created and returned.
Args:
device: The computation device onto which the Distribution
will be brought.
Returns:
The Distribution on the target device.
"""
if torch.device(self.device) == torch.device(device):
return self
else:
cls = self.__class__
return cls(solution_length=self.solution_length, parameters=self.parameters, device=device)
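Continuing the sketch above; the CUDA move is only meaningful when a GPU is available:

dist_cuda = dist.to("cuda") if torch.cuda.is_available() else dist
assert dist.to("cpu") is dist  # already on the CPU, so the same object is returned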
update_parameters(self, gradients, *, learning_rates=None, optimizers=None)
¶
Do an update on the distribution by following the given gradients.
It is expected that the inheriting class has its own implementation for this method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
gradients | dict | Gradients, as a dictionary, which will be used for computing the necessary updates. | required |
learning_rates | Optional[dict] | A dictionary which contains learning rates for parameters that will be updated using a learning rate coefficient. | None |
optimizers | Optional[dict] | A dictionary which contains optimizer objects for parameters that will be updated using an adaptive optimizer. | None |
Returns:
Type | Description |
---|---|
Distribution | The updated copy of the distribution. |
Source code in evotorch/distributions.py
def update_parameters(
self,
gradients: dict,
*,
learning_rates: Optional[dict] = None,
optimizers: Optional[dict] = None,
) -> "Distribution":
"""
Do an update on the distribution by following the given gradients.
It is expected that the inheriting class has its own implementation
for this method.
Args:
gradients: Gradients, as a dictionary, which will be used for
computing the necessary updates.
learning_rates: A dictionary which contains learning rates
for parameters that will be updated using a learning rate
coefficient.
optimizers: A dictionary which contains optimizer objects
for parameters that will be updated using an adaptive
optimizer.
Returns:
The updated copy of the distribution.
"""
raise NotImplementedError
ExpGaussian (Distribution)
¶
Exponential Multivariate Gaussian, as used by XNES
Source code in evotorch/distributions.py
class ExpGaussian(Distribution):
"""exponential Multivariate Gaussian, as used by XNES"""
# Corresponding to mu and A in symbols used in xNES paper
MANDATORY_PARAMETERS = {"mu", "sigma"}
# Inverse of sigma; numerically more stable to track this independently of sigma
OPTIONAL_PARAMETERS = {"sigma_inv"}
def __init__(
self,
parameters: dict,
*,
solution_length: Optional[int] = None,
device: Optional[Device] = None,
dtype: Optional[DType] = None,
):
[mu_length] = parameters["mu"].shape
# Make sigma 2D
if len(parameters["sigma"].shape) == 1:
parameters["sigma"] = torch.diag(parameters["sigma"])
# Automatically generate sigma_inv if not provided
if "sigma_inv" not in parameters:
parameters["sigma_inv"] = torch.inverse(parameters["sigma"])
[sigma_length, _] = parameters["sigma"].shape
if solution_length is None:
solution_length = mu_length
else:
if solution_length != mu_length:
raise ValueError(
f"The argument `solution_length` does not match the length of `mu` provided in `parameters`."
f" solution_length={solution_length},"
f' parameters["mu"]={mu_length}.'
)
if mu_length != sigma_length:
raise ValueError(
f"The tensors `mu` and `sigma` provided within `parameters` have mismatching lengths."
f' parameters["mu"]={mu_length},'
f' parameters["sigma"]={sigma_length}.'
)
super().__init__(
solution_length=solution_length,
parameters=parameters,
device=device,
dtype=dtype,
)
# Make identity matrix as this is used throughout in gradient computation
self.eye = self.make_zeros((solution_length, solution_length))
self.eye[range(self.solution_length), range(self.solution_length)] = 1.0
@property
def mu(self) -> torch.Tensor:
"""Getter for mu
Returns:
mu (torch.Tensor): The center of the search distribution
"""
return self.parameters["mu"]
@mu.setter
def mu(self, new_mu: Iterable):
"""Setter for mu
Args:
new_mu (torch.Tensor): The new value of mu
"""
self.parameters["mu"] = torch.as_tensor(new_mu, dtype=self.dtype, device=self.device)
@property
def cov(self) -> torch.Tensor:
"""The covariance matrix A^T A"""
return self.sigma.transpose(0, 1) @ self.sigma
@property
def sigma(self) -> torch.Tensor:
"""Getter for sigma
Returns:
sigma (torch.Tensor): The square root of the covariance matrix
"""
return self.parameters["sigma"]
@property
def sigma_inv(self) -> torch.Tensor:
"""Getter for sigma_inv
Returns:
sigma_inv (torch.Tensor): The inverse square root of the covariance matrix
"""
if "sigma_inv" in self.parameters:
return self.parameters["sigma_inv"]
else:
return torch.inverse(self.parameters["sigma"])
@property
def A(self) -> torch.Tensor:
"""Alias for self.sigma, for notational consistency with paper"""
return self.sigma
@property
def A_inv(self) -> torch.Tensor:
"""Alias for self.sigma_inv, for notational consistency with paper"""
return self.sigma_inv
@sigma.setter
def sigma(self, new_sigma: Iterable):
"""Setter for sigma
Args:
new_sigma (torch.Tensor): The new value of sigma, the square root of the covariance matrix
"""
self.parameters["sigma"] = torch.as_tensor(new_sigma, dtype=self.dtype, device=self.device)
def to_global_coordinates(self, local_coordinates: torch.Tensor) -> torch.Tensor:
"""Map samples from local coordinate space N(0, I_d) to global coordinate space N(mu, A^T A)
This function is the inverse of to_local_coordinates
Args:
local_coordinates (torch.Tensor): The local coordinates sampled from N(0, I_d)
Returns:
global_coordinates (torch.Tensor): The global coordinates sampled from N(mu, A^T A)
"""
# Global samples are constructed as x = mu + A z where z is local coordinate
# We use transpose here to simplify the batched application of A
return self.mu.unsqueeze(0) + (self.A @ local_coordinates.T).T
def to_local_coordinates(self, global_coordinates: torch.Tensor) -> torch.Tensor:
"""Map samples from global coordinate space N(mu, A^T A) to local coordinate space N(0, I_d)
This function is the inverse of to_global_coordinates
Args:
global_coordinates (torch.Tensor): The global coordinates sampled from N(mu, A^T A)
Returns:
local_coordinates (torch.Tensor): The local coordinates sampled from N(0, I_d)
"""
# Global samples are constructed as x = mu + A z where z is local coordinate
# Therefore, we can recover z according to z = A_inv (x - mu)
return (self.A_inv @ (global_coordinates - self.mu.unsqueeze(0)).T).T
def _fill(self, out: torch.Tensor, *, generator: Optional[torch.Generator] = None):
"""Fill a tensor with samples from N(mu, A^T A)
Args:
out (torch.Tensor): The tensor to fill
generator (Optional[torch.Generator]): A generator to use to generate random values
"""
# Fill with local coordinates from N(0, I_d)
self.make_gaussian(out=out, generator=generator)
# Map local coordinates to global coordinate system
out[:] = self.to_global_coordinates(out)
def _compute_gradients(self, samples: torch.Tensor, weights: torch.Tensor, ranking_used: Optional[str]) -> dict:
"""Compute the gradients with respect to a given set of samples and weights
Args:
samples (torch.Tensor): Samples drawn from N(mu, A^T A), ideally using self._fill
weights (torch.Tensor): Weights e.g. fitnesses or utilities assigned to samples
ranking_used (optional[str]): The ranking method used to compute weights
Returns:
grads (dict): A dictionary containing the approximated natural gradient on d and M
"""
# Compute the local coordinates
local_coordinates = self.to_local_coordinates(samples)
# Make sure that the weights (utilities) are 0-centered
# (Otherwise the formulations would have to consider a bias term)
if ranking_used not in ("centered", "normalized"):
weights = weights - torch.mean(weights)
d_grad = total(dot(weights, local_coordinates))
local_coordinates_outer = local_coordinates.unsqueeze(1) * local_coordinates.unsqueeze(2)
M_grad = torch.sum(
weights.unsqueeze(-1).unsqueeze(-1) * (local_coordinates_outer - self.eye.unsqueeze(0)), dim=0
)
return {
"d": d_grad,
"M": M_grad,
}
def update_parameters(
self,
gradients: dict,
*,
learning_rates: Optional[dict] = None,
optimizers: Optional[dict] = None,
) -> "ExpGaussian":
d_grad = gradients["d"]
M_grad = gradients["M"]
if "d" not in learning_rates:
learning_rates["d"] = learning_rates["mu"]
if "M" not in learning_rates:
learning_rates["M"] = learning_rates["sigma"]
# Follow gradients for d, and M
update_d = self._follow_gradient("d", d_grad, learning_rates=learning_rates, optimizers=optimizers)
update_M = self._follow_gradient("M", M_grad, learning_rates=learning_rates, optimizers=optimizers)
# Fold into parameters mu, A and A inv
new_mu = self.mu + torch.mv(self.A, update_d)
new_A = self.A @ torch.matrix_exp(0.5 * update_M)
new_A_inv = torch.matrix_exp(-0.5 * update_M) @ self.A_inv
# Return modified distribution
return self.modified_copy(mu=new_mu, sigma=new_A, sigma_inv=new_A_inv)
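A minimal sketch of the local/global coordinate mapping defined by this class; the dimensionality and parameter values are illustrative:

import torch
from evotorch.distributions import ExpGaussian

xnes_dist = ExpGaussian(parameters={"mu": torch.zeros(3), "sigma": torch.ones(3)})
z = torch.randn(5, 3)                   # local coordinates drawn from N(0, I)
x = xnes_dist.to_global_coordinates(z)  # corresponding global samples from N(mu, A^T A)
assert torch.allclose(xnes_dist.to_local_coordinates(x), z, atol=1e-5)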
A: Tensor
property
readonly
¶
Alias for self.sigma, for notational consistency with paper
A_inv: Tensor
property
readonly
¶
Alias for self.sigma_inv, for notational consistency with paper
cov: Tensor
property
readonly
¶
The covariance matrix A^T A
mu: Tensor
property
writable
¶
Getter for mu
Returns:
Type | Description |
---|---|
mu (torch.Tensor) | The center of the search distribution |
sigma: Tensor
property
writable
¶
Getter for sigma
Returns:
Type | Description |
---|---|
sigma (torch.Tensor) | The square root of the covariance matrix |
sigma_inv: Tensor
property
readonly
¶
Getter for sigma_inv
Returns:
Type | Description |
---|---|
sigma_inv (torch.Tensor) | The inverse square root of the covariance matrix |
to_global_coordinates(self, local_coordinates)
¶
Map samples from local coordinate space N(0, I_d) to global coordinate space N(mu, A^T A). This function is the inverse of to_local_coordinates.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
local_coordinates | torch.Tensor | The local coordinates sampled from N(0, I_d) | required |
Returns:
Type | Description |
---|---|
global_coordinates (torch.Tensor) | The global coordinates sampled from N(mu, A^T A) |
Source code in evotorch/distributions.py
def to_global_coordinates(self, local_coordinates: torch.Tensor) -> torch.Tensor:
"""Map samples from local coordinate space N(0, I_d) to global coordinate space N(mu, A^T A)
This function is the inverse of to_local_coordinates
Args:
local_coordinates (torch.Tensor): The local coordinates sampled from N(0, I_d)
Returns:
global_coordinates (torch.Tensor): The global coordinates sampled from N(mu, A^T A)
"""
# Global samples are constructed as x = mu + A z where z is local coordinate
# We use transpose here to simplify the batched application of A
return self.mu.unsqueeze(0) + (self.A @ local_coordinates.T).T
to_local_coordinates(self, global_coordinates)
¶
Map samples from global coordinate space N(mu, A^T A) to local coordinate space N(0, I_d). This function is the inverse of to_global_coordinates.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
global_coordinates | torch.Tensor | The global coordinates sampled from N(mu, A^T A) | required |
Returns:
Type | Description |
---|---|
local_coordinates (torch.Tensor) | The local coordinates sampled from N(0, I_d) |
Source code in evotorch/distributions.py
def to_local_coordinates(self, global_coordinates: torch.Tensor) -> torch.Tensor:
"""Map samples from global coordinate space N(mu, A^T A) to local coordinate space N(0, I_d)
This function is the inverse of to_global_coordinates
Args:
global_coordinates (torch.Tensor): The global coordinates sampled from N(mu, A^T A)
Returns:
local_coordinates (torch.Tensor): The local coordinates sampled from N(0, I_d)
"""
# Global samples are constructed as x = mu + A z where z is local coordinate
# Therefore, we can recover z according to z = A_inv (x - mu)
return (self.A_inv @ (global_coordinates - self.mu.unsqueeze(0)).T).T
update_parameters(self, gradients, *, learning_rates=None, optimizers=None)
¶
Do an update on the distribution by following the given gradients.
It is expected that the inheriting class has its own implementation for this method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
gradients | dict | Gradients, as a dictionary, which will be used for computing the necessary updates. | required |
learning_rates | Optional[dict] | A dictionary which contains learning rates for parameters that will be updated using a learning rate coefficient. | None |
optimizers | Optional[dict] | A dictionary which contains optimizer objects for parameters that will be updated using an adaptive optimizer. | None |
Returns:
Type | Description |
---|---|
ExpGaussian | The updated copy of the distribution. |
Source code in evotorch/distributions.py
def update_parameters(
self,
gradients: dict,
*,
learning_rates: Optional[dict] = None,
optimizers: Optional[dict] = None,
) -> "ExpGaussian":
d_grad = gradients["d"]
M_grad = gradients["M"]
if "d" not in learning_rates:
learning_rates["d"] = learning_rates["mu"]
if "M" not in learning_rates:
learning_rates["M"] = learning_rates["sigma"]
# Follow gradients for d, and M
update_d = self._follow_gradient("d", d_grad, learning_rates=learning_rates, optimizers=optimizers)
update_M = self._follow_gradient("M", M_grad, learning_rates=learning_rates, optimizers=optimizers)
# Fold into parameters mu, A and A inv
new_mu = self.mu + torch.mv(self.A, update_d)
new_A = self.A @ torch.matrix_exp(0.5 * update_M)
new_A_inv = torch.matrix_exp(-0.5 * update_M) @ self.A_inv
# Return modified distribution
return self.modified_copy(mu=new_mu, sigma=new_A, sigma_inv=new_A_inv)
ExpSeparableGaussian (SeparableGaussian)
¶
Exponential separable Multivariate Gaussian, as used by SNES
Source code in evotorch/distributions.py
class ExpSeparableGaussian(SeparableGaussian):
"""exponentialseparable Multivariate Gaussian, as used by SNES"""
MANDATORY_PARAMETERS = {"mu", "sigma"}
OPTIONAL_PARAMETERS = set()
def _compute_gradients(self, samples: torch.Tensor, weights: torch.Tensor, ranking_used: Optional[str]) -> dict:
if ranking_used != "nes":
weights = weights / torch.sum(torch.abs(weights))
scaled_noises = samples - self.mu
raw_noises = scaled_noises / self.sigma
mu_grad = total(dot(weights, scaled_noises))
sigma_grad = total(dot(weights, (raw_noises**2) - 1))
return {"mu": mu_grad, "sigma": sigma_grad}
def update_parameters(
self,
gradients: dict,
*,
learning_rates: Optional[dict] = None,
optimizers: Optional[dict] = None,
) -> "ExpSeparableGaussian":
mu_grad = gradients["mu"]
sigma_grad = gradients["sigma"]
new_mu = self.mu + self._follow_gradient("mu", mu_grad, learning_rates=learning_rates, optimizers=optimizers)
new_sigma = self.sigma * torch.exp(
0.5 * self._follow_gradient("sigma", sigma_grad, learning_rates=learning_rates, optimizers=optimizers)
)
return self.modified_copy(mu=new_mu, sigma=new_sigma)
update_parameters(self, gradients, *, learning_rates=None, optimizers=None)
¶
Do an update on the distribution by following the given gradients.
It is expected that the inheriting class has its own implementation for this method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
gradients | dict | Gradients, as a dictionary, which will be used for computing the necessary updates. | required |
learning_rates | Optional[dict] | A dictionary which contains learning rates for parameters that will be updated using a learning rate coefficient. | None |
optimizers | Optional[dict] | A dictionary which contains optimizer objects for parameters that will be updated using an adaptive optimizer. | None |
Returns:
Type | Description |
---|---|
ExpSeparableGaussian | The updated copy of the distribution. |
Source code in evotorch/distributions.py
def update_parameters(
self,
gradients: dict,
*,
learning_rates: Optional[dict] = None,
optimizers: Optional[dict] = None,
) -> "ExpSeparableGaussian":
mu_grad = gradients["mu"]
sigma_grad = gradients["sigma"]
new_mu = self.mu + self._follow_gradient("mu", mu_grad, learning_rates=learning_rates, optimizers=optimizers)
new_sigma = self.sigma * torch.exp(
0.5 * self._follow_gradient("sigma", sigma_grad, learning_rates=learning_rates, optimizers=optimizers)
)
return self.modified_copy(mu=new_mu, sigma=new_sigma)
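A small numeric sketch of the multiplicative sigma update above; all tensor values and learning rates are illustrative:

import torch
from evotorch.distributions import ExpSeparableGaussian

snes_dist = ExpSeparableGaussian(parameters={"mu": torch.zeros(4), "sigma": torch.ones(4)})
new_dist = snes_dist.update_parameters(
    {"mu": torch.zeros(4), "sigma": torch.full((4,), 0.2)},
    learning_rates={"mu": 1.0, "sigma": 0.5},
)
# sigma is scaled by exp(0.5 * 0.5 * 0.2) = exp(0.05), so it stays positive by construction.
print(new_dist.parameters["sigma"])  # roughly 1.0513 everywhere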
SeparableGaussian (Distribution)
¶
Separable Multivariate Gaussian, as used by PGPE
Source code in evotorch/distributions.py
class SeparableGaussian(Distribution):
"""Separable Multivariate Gaussian, as used by PGPE"""
MANDATORY_PARAMETERS = {"mu", "sigma"}
OPTIONAL_PARAMETERS = {"divide_mu_grad_by", "divide_sigma_grad_by", "parenthood_ratio"}
def __init__(
self,
parameters: dict,
*,
solution_length: Optional[int] = None,
device: Optional[Device] = None,
dtype: Optional[DType] = None,
):
[mu_length] = parameters["mu"].shape
[sigma_length] = parameters["sigma"].shape
if solution_length is None:
solution_length = mu_length
else:
if solution_length != mu_length:
raise ValueError(
f"The argument `solution_length` does not match the length of `mu` provided in `parameters`."
f" solution_length={solution_length},"
f' parameters["mu"]={mu_length}.'
)
if mu_length != sigma_length:
raise ValueError(
f"The tensors `mu` and `sigma` provided within `parameters` have mismatching lengths."
f' parameters["mu"]={mu_length},'
f' parameters["sigma"]={sigma_length}.'
)
super().__init__(
solution_length=solution_length,
parameters=parameters,
device=device,
dtype=dtype,
)
@property
def mu(self) -> torch.Tensor:
return self.parameters["mu"]
@mu.setter
def mu(self, new_mu: Iterable):
self.parameters["mu"] = torch.as_tensor(new_mu, dtype=self.dtype, device=self.device)
@property
def sigma(self) -> torch.Tensor:
return self.parameters["sigma"]
@sigma.setter
def sigma(self, new_sigma: Iterable):
self.parameters["sigma"] = torch.as_tensor(new_sigma, dtype=self.dtype, device=self.device)
def _fill(self, out: torch.Tensor, *, generator: Optional[torch.Generator] = None):
self.make_gaussian(out=out, center=self.mu, stdev=self.sigma, generator=generator)
def _divide_grad(self, param_name: str, grad: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
option = f"divide_{param_name}_grad_by"
if option in self.parameters:
div_by_what = self.parameters[option]
if div_by_what == "num_solutions":
[num_solutions] = weights.shape
grad = grad / num_solutions
elif div_by_what == "num_directions":
[num_solutions] = weights.shape
num_directions = num_solutions // 2
grad = grad / num_directions
elif div_by_what == "total_weight":
total_weight = torch.sum(torch.abs(weights))
grad = grad / total_weight
elif div_by_what == "weight_stdev":
weight_stdev = torch.std(weights)
grad = grad / weight_stdev
else:
raise ValueError(f"The parameter {option} has an unrecognized value: {div_by_what}")
return grad
def _compute_gradients_via_parenthood_ratio(self, samples: torch.Tensor, weights: torch.Tensor) -> dict:
[num_samples, _] = samples.shape
num_elites = math.floor(num_samples * self.parameters["parenthood_ratio"])
elite_indices = weights.argsort(descending=True)[:num_elites]
elites = samples[elite_indices, :]
return {
"mu": torch.mean(elites, dim=0) - self.parameters["mu"],
"sigma": torch.std(elites, dim=0) - self.parameters["sigma"],
}
def _compute_gradients(self, samples: torch.Tensor, weights: torch.Tensor, ranking_used: Optional[str]) -> dict:
if "parenthood_ratio" in self.parameters:
return self._compute_gradients_via_parenthood_ratio(samples, weights)
else:
mu = self.mu
sigma = self.sigma
# Compute the scaled noises, that is, the noise vectors which
# were used for generating the solutions
# (solution = scaled_noise + center)
scaled_noises = samples - mu
# Make sure that the weights (utilities) are 0-centered
# (Otherwise the formulations would have to consider a bias term)
if ranking_used not in ("centered", "normalized"):
weights = weights - torch.mean(weights)
mu_grad = self._divide_grad(
"mu",
total(dot(weights, scaled_noises)),
weights,
)
sigma_grad = self._divide_grad(
"sigma",
total(dot(weights, ((scaled_noises**2) - (sigma**2)) / sigma)),
weights,
)
return {
"mu": mu_grad,
"sigma": sigma_grad,
}
def update_parameters(
self,
gradients: dict,
*,
learning_rates: Optional[dict] = None,
optimizers: Optional[dict] = None,
) -> "SeparableGaussian":
mu_grad = gradients["mu"]
sigma_grad = gradients["sigma"]
new_mu = self.mu + self._follow_gradient("mu", mu_grad, learning_rates=learning_rates, optimizers=optimizers)
new_sigma = self.sigma + self._follow_gradient(
"sigma", sigma_grad, learning_rates=learning_rates, optimizers=optimizers
)
return self.modified_copy(mu=new_mu, sigma=new_sigma)
def relative_entropy(dist_0: "SeparableGaussian", dist_1: "SeparableGaussian") -> float:
mu_0 = dist_0.parameters["mu"]
mu_1 = dist_1.parameters["mu"]
sigma_0 = dist_0.parameters["sigma"]
sigma_1 = dist_1.parameters["sigma"]
cov_0 = sigma_0.pow(2.0)
cov_1 = sigma_1.pow(2.0)
mu_delta = mu_1 - mu_0
trace_cov = torch.sum(cov_0 / cov_1)
k = dist_0.solution_length
scaled_mu = torch.sum(mu_delta.pow(2.0) / cov_1)
log_det = torch.sum(torch.log(cov_1)) - torch.sum(torch.log(cov_0))
return 0.5 * (trace_cov - k + scaled_mu + log_det)
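The relative_entropy(...) above is the closed-form KL divergence between two diagonal Gaussians. A small numeric check with illustrative parameters:

import torch
from evotorch.distributions import SeparableGaussian

d0 = SeparableGaussian(parameters={"mu": torch.zeros(2), "sigma": torch.ones(2)})
d1 = SeparableGaussian(parameters={"mu": torch.ones(2), "sigma": torch.full((2,), 2.0)})
print(float(d0.relative_entropy(d1)))  # KL(d0 || d1), roughly 0.886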
update_parameters(self, gradients, *, learning_rates=None, optimizers=None)
¶
Do an update on the distribution by following the given gradients.
It is expected that the inheriting class has its own implementation for this method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
gradients | dict | Gradients, as a dictionary, which will be used for computing the necessary updates. | required |
learning_rates | Optional[dict] | A dictionary which contains learning rates for parameters that will be updated using a learning rate coefficient. | None |
optimizers | Optional[dict] | A dictionary which contains optimizer objects for parameters that will be updated using an adaptive optimizer. | None |
Returns:
Type | Description |
---|---|
SeparableGaussian | The updated copy of the distribution. |
Source code in evotorch/distributions.py
def update_parameters(
self,
gradients: dict,
*,
learning_rates: Optional[dict] = None,
optimizers: Optional[dict] = None,
) -> "SeparableGaussian":
mu_grad = gradients["mu"]
sigma_grad = gradients["sigma"]
new_mu = self.mu + self._follow_gradient("mu", mu_grad, learning_rates=learning_rates, optimizers=optimizers)
new_sigma = self.sigma + self._follow_gradient(
"sigma", sigma_grad, learning_rates=learning_rates, optimizers=optimizers
)
return self.modified_copy(mu=new_mu, sigma=new_sigma)
SymmetricSeparableGaussian (SeparableGaussian)
¶
Symmetric (antithetic) separable Gaussian distribution as used by PGPE.
Source code in evotorch/distributions.py
class SymmetricSeparableGaussian(SeparableGaussian):
"""
Symmetric (antithetic) separable Gaussian distribution
as used by PGPE.
"""
MANDATORY_PARAMETERS = {"mu", "sigma"}
OPTIONAL_PARAMETERS = {"divide_mu_grad_by", "divide_sigma_grad_by", "parenthood_ratio"}
def _fill(self, out: torch.Tensor, *, generator: Optional[torch.Generator] = None):
self.make_gaussian(out=out, center=self.mu, stdev=self.sigma, symmetric=True, generator=generator)
def _compute_gradients(
self,
samples: torch.Tensor,
weights: torch.Tensor,
ranking_used: Optional[str],
) -> dict:
if "parenthood_ratio" in self.parameters:
return self._compute_gradients_via_parenthood_ratio(samples, weights)
else:
mu = self.mu
sigma = self.sigma
# Make sure that the weights (utilities) are 0-centered
# (Otherwise the formulations would have to consider a bias term)
if ranking_used not in ("centered", "normalized"):
weights = weights - torch.mean(weights)
[nslns] = weights.shape
# ndirs = nslns // 2
# Compute the scaled noises, that is, the noise vectors which
# were used for generating the solutions
# (solution = scaled_noise + center)
scaled_noises = samples[0::2] - mu
# Separate the plus and the minus ends of the directions
fdplus = weights[0::2]
fdminus = weights[1::2]
# Considering that the population is stored like this:
# _
# solution0: center + scaled_noise0 \
# > direction0
# solution1: center - scaled_noise0 _/
# _
# solution2: center + scaled_noise1 \
# > direction1
# solution3: center - scaled_noise1 _/
#
# ...
# fdplus[0] becomes the utility of the plus end of direction0
# (i.e. utility of solution0)
# fdminus[0] becomes the utility of the minus end of direction0
# (i.e. utility of solution1)
# fdplus[1] becomes the utility of the plus end of direction1
# (i.e. utility of solution2)
# fdminus[1] becomes the utility of the minus end of direction1
# (i.e. utility of solution3)
# ... and so on...
grad_mu = self._divide_grad("mu", total(dot((fdplus - fdminus) / 2, scaled_noises)), weights)
grad_sigma = self._divide_grad(
"sigma",
total(dot(((fdplus + fdminus) / 2), ((scaled_noises**2) - (sigma**2)) / sigma)),
weights,
)
return {
"mu": grad_mu,
"sigma": grad_sigma,
}
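A minimal sketch of the antithetic pairing that the gradient formulas above rely on; the population size and dimensionality are illustrative:

import torch
from evotorch.distributions import SymmetricSeparableGaussian

sym = SymmetricSeparableGaussian(parameters={"mu": torch.zeros(3), "sigma": torch.ones(3)})
pop = sym.sample(6)
# Consecutive rows are the plus and minus ends of the same direction around mu.
assert torch.allclose(pop[0::2] - sym.mu, -(pop[1::2] - sym.mu))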
logging
¶
This module contains logging utilities.
Logger
¶
Base class for all logging classes.
Source code in evotorch/logging.py
class Logger:
"""Base class for all logging classes."""
def __init__(self, searcher: SearchAlgorithm, *, interval: int = 1, after_first_step: bool = False):
"""`__init__(...)`: Initialize the Logger.
Args:
searcher: The evolutionary algorithm instance whose progress
is to be logged.
interval: Expected as an integer n.
Logging is to be done at every n iterations.
after_first_step: Expected as a boolean.
Meaningful only if interval is set as an integer greater
than 1. Let us suppose that interval is set as 10.
If after_first_step is False (which is the default),
then the logging will be done at steps 10, 20, 30, and so on.
On the other hand, if after_first_step is True,
then the logging will be done at steps 1, 11, 21, 31, and so
on.
"""
searcher.log_hook.append(self)
self._interval = int(interval)
self._after_first_step = bool(after_first_step)
self._steps_count = 0
def __call__(self, status: dict):
if self._after_first_step:
n = self._steps_count
self._steps_count += 1
else:
self._steps_count += 1
n = self._steps_count
if (n % self._interval) == 0:
self._log(self._filter(status))
def _filter(self, status: dict) -> dict:
return status
def _log(self, status: dict):
raise NotImplementedError
__init__(self, searcher, *, interval=1, after_first_step=False)
special
¶
__init__(...)
: Initialize the Logger.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
searcher | SearchAlgorithm | The evolutionary algorithm instance whose progress is to be logged. | required |
interval | int | Expected as an integer n. Logging is to be done at every n iterations. | 1 |
after_first_step | bool | Expected as a boolean. Meaningful only if interval is set as an integer greater than 1. Let us suppose that interval is set as 10. If after_first_step is False (which is the default), then the logging will be done at steps 10, 20, 30, and so on. On the other hand, if after_first_step is True, then the logging will be done at steps 1, 11, 21, 31, and so on. | False |
Source code in evotorch/logging.py
def __init__(self, searcher: SearchAlgorithm, *, interval: int = 1, after_first_step: bool = False):
"""`__init__(...)`: Initialize the Logger.
Args:
searcher: The evolutionary algorithm instance whose progress
is to be logged.
interval: Expected as an integer n.
Logging is to be done at every n iterations.
after_first_step: Expected as a boolean.
Meaningful only if interval is set as an integer greater
than 1. Let us suppose that interval is set as 10.
If after_first_step is False (which is the default),
then the logging will be done at steps 10, 20, 30, and so on.
On the other hand, if after_first_step is True,
then the logging will be done at steps 1, 11, 21, 31, and so
on.
"""
searcher.log_hook.append(self)
self._interval = int(interval)
self._after_first_step = bool(after_first_step)
self._steps_count = 0
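A custom logger only needs to implement `_log(...)`; the interval bookkeeping above is inherited. A minimal sketch, where `searcher` stands for any existing SearchAlgorithm instance:

from evotorch.logging import Logger

class PrintLogger(Logger):
    def _log(self, status: dict):
        # `status` is the searcher's status dictionary for the current step.
        print(f"step {self._steps_count}: {status}")

# PrintLogger(searcher, interval=10)  # would log at steps 10, 20, 30, ...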
MlflowLogger (ScalarLogger)
¶
A logger which stores the status via Mlflow.
Source code in evotorch/logging.py
class MlflowLogger(ScalarLogger):
"""A logger which stores the status via Mlflow."""
def __init__(
self,
searcher: SearchAlgorithm,
client: Optional[mlflow.tracking.MlflowClient] = None,
run: Union[mlflow.entities.Run, Optional[MlflowID]] = None,
*,
interval: int = 1,
after_first_step: bool = False,
):
"""`__init__(...)`: Initialize the MlflowLogger.
Args:
searcher: The evolutionary algorithm instance whose progress
is to be logged.
client: The MlflowClient object whose log_metric() method
will be used for logging. This can be passed as None,
in which case mlflow.log_metrics() will be used instead.
Please note that, if a client is provided, the `run`
argument is required as well.
run: Expected only if a client is provided.
This is the mlflow Run object (an instance of
mlflow.entities.Run), or the ID of the mlflow run.
interval: Expected as an integer n.
Logging is to be done at every n iterations.
after_first_step: Expected as a boolean.
Meaningful only if interval is set as an integer greater
than 1. Let us suppose that interval is set as 10.
If after_first_step is False (which is the default),
then the logging will be done at steps 10, 20, 30, and
so on. On the other hand, if after_first_step is True,
then the logging will be done at steps 1, 11, 21, 31,
and so on.
"""
super().__init__(searcher, interval=interval, after_first_step=after_first_step)
self._client = client
self._run_id: Optional[MlflowID] = None
if self._client is None:
if run is not None:
raise ValueError("Received `run`, but `client` is missing")
else:
if run is None:
raise ValueError("Received `client`, but `run` is missing")
if isinstance(run, mlflow.entities.Run):
self._run_id = run.info.run_id
else:
self._run_id = run
def _log(self, status: dict):
if self._client is None:
mlflow.log_metrics(status, step=self._steps_count)
else:
for k, v in status.items():
self._client.log_metric(self._run_id, k, v, step=self._steps_count)
__init__(self, searcher, client=None, run=None, *, interval=1, after_first_step=False)
special
¶
__init__(...)
: Initialize the MlflowLogger.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
searcher | SearchAlgorithm | The evolutionary algorithm instance whose progress is to be logged. | required |
client | Optional[mlflow.tracking.client.MlflowClient] | The MlflowClient object whose log_metric() method will be used for logging. This can be passed as None, in which case mlflow.log_metrics() will be used instead. Please note that, if a client is provided, the `run` argument is required as well. | None |
run | Union[mlflow.entities.run.Run, str, bytes, int] | Expected only if a client is provided. This is the mlflow Run object (an instance of mlflow.entities.Run), or the ID of the mlflow run. | None |
interval | int | Expected as an integer n. Logging is to be done at every n iterations. | 1 |
after_first_step | bool | Expected as a boolean. Meaningful only if interval is set as an integer greater than 1. Let us suppose that interval is set as 10. If after_first_step is False (which is the default), then the logging will be done at steps 10, 20, 30, and so on. On the other hand, if after_first_step is True, then the logging will be done at steps 1, 11, 21, 31, and so on. | False |
Source code in evotorch/logging.py
def __init__(
self,
searcher: SearchAlgorithm,
client: Optional[mlflow.tracking.MlflowClient] = None,
run: Union[mlflow.entities.Run, Optional[MlflowID]] = None,
*,
interval: int = 1,
after_first_step: bool = False,
):
"""`__init__(...)`: Initialize the MlflowLogger.
Args:
searcher: The evolutionary algorithm instance whose progress
is to be logged.
client: The MlflowClient object whose log_metric() method
will be used for logging. This can be passed as None,
in which case mlflow.log_metrics() will be used instead.
Please note that, if a client is provided, the `run`
argument is required as well.
run: Expected only if a client is provided.
This is the mlflow Run object (an instance of
mlflow.entities.Run), or the ID of the mlflow run.
interval: Expected as an integer n.
Logging is to be done at every n iterations.
after_first_step: Expected as a boolean.
Meaningful only if interval is set as an integer greater
than 1. Let us suppose that interval is set as 10.
If after_first_step is False (which is the default),
then the logging will be done at steps 10, 20, 30, and
so on. On the other hand, if after_first_step is True,
then the logging will be done at steps 1, 11, 21, 31,
and so on.
"""
super().__init__(searcher, interval=interval, after_first_step=after_first_step)
self._client = client
self._run_id: Optional[MlflowID] = None
if self._client is None:
if run is not None:
raise ValueError("Received `run`, but `client` is missing")
else:
if run is None:
raise ValueError("Received `client`, but `run` is missing")
if isinstance(run, mlflow.entities.Run):
self._run_id = run.info.run_id
else:
self._run_id = run
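Below is a minimal usage sketch. The `sphere` problem and the hyperparameters are illustrative; since no `client` is given, an active mlflow run is required, hence the `mlflow.start_run()` context:

```python
import mlflow
import torch

from evotorch import Problem
from evotorch.algorithms import SNES
from evotorch.logging import MlflowLogger


def sphere(x: torch.Tensor) -> torch.Tensor:
    return torch.sum(x**2.0)


problem = Problem("min", sphere, solution_length=10, initial_bounds=(-1, 1))
searcher = SNES(problem, stdev_init=5.0)

with mlflow.start_run():
    _ = MlflowLogger(searcher)  # attaches itself to the searcher's log hook
    searcher.run(50)  # metrics are sent to mlflow at every iteration
```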
NeptuneLogger (ScalarLogger)
¶
A logger which stores the status via neptune.
Source code in evotorch/logging.py
class NeptuneLogger(ScalarLogger):
"""A logger which stores the status via neptune."""
def __init__(
self,
searcher: SearchAlgorithm,
run,
*,
interval: int = 1,
after_first_step: bool = False,
group: Optional[str] = None,
):
"""`__init__(...)`: Initialize the NeptuneLogger.
Args:
searcher: The evolutionary algorithm instance whose progress
is to be logged.
run: A `neptune.new.run.Run` instance using which the status
will be logged.
interval: Expected as an integer n.
Logging is to be done at every n iterations.
after_first_step: Expected as a boolean.
Meaningful only if interval is set as an integer greater
than 1. Let us suppose that interval is set as 10.
If after_first_step is False (which is the default),
then the logging will be done at steps 10, 20, 30, and so on.
On the other hand, if after_first_step is True,
then the logging will be done at steps 1, 11, 21, 31, and so
on.
group: The group into which the metrics will be stored.
For example, if the status keys to be logged are "score" and
"elapsed", and `group` is set as "training", then the metrics
will be sent to neptune with the keys "training/score" and
"training/elapsed". `group` can also be left as None,
in which case the status will be sent to neptune with the
key names unchanged.
"""
super().__init__(searcher, interval=interval, after_first_step=after_first_step)
self._run = run
self._group = group
def _log(self, status: dict):
for k, v in status.items():
target_key = k if self._group is None else self._group + "/" + k
self._run[target_key].log(v)
__init__(self, searcher, run, *, interval=1, after_first_step=False, group=None)
special
¶
__init__(...)
: Initialize the NeptuneLogger.
Parameters:

Name | Type | Description | Default
---|---|---|---
searcher | SearchAlgorithm | The evolutionary algorithm instance whose progress is to be logged. | required
run | | A `neptune.new.run.Run` instance using which the status will be logged. | required
interval | int | Expected as an integer n. Logging is to be done at every n iterations. | 1
after_first_step | bool | Expected as a boolean. Meaningful only if interval is set as an integer greater than 1. Let us suppose that interval is set as 10. If after_first_step is False (which is the default), then the logging will be done at steps 10, 20, 30, and so on. On the other hand, if after_first_step is True, then the logging will be done at steps 1, 11, 21, 31, and so on. | False
group | Optional[str] | The group into which the metrics will be stored. For example, if the status keys to be logged are "score" and "elapsed", and `group` is set as "training", then the metrics will be sent to neptune with the keys "training/score" and "training/elapsed". `group` can also be left as None, in which case the status will be sent to neptune with the key names unchanged. | None
Source code in evotorch/logging.py
def __init__(
self,
searcher: SearchAlgorithm,
run,
*,
interval: int = 1,
after_first_step: bool = False,
group: Optional[str] = None,
):
"""`__init__(...)`: Initialize the NeptuneLogger.
Args:
searcher: The evolutionary algorithm instance whose progress
is to be logged.
run: A `neptune.new.run.Run` instance using which the status
will be logged.
interval: Expected as an integer n.
Logging is to be done at every n iterations.
after_first_step: Expected as a boolean.
Meaningful only if interval is set as an integer greater
than 1. Let us suppose that interval is set as 10.
If after_first_step is False (which is the default),
then the logging will be done at steps 10, 20, 30, and so on.
On the other hand, if after_first_step is True,
then the logging will be done at steps 1, 11, 21, 31, and so
on.
group: The group into which the metrics will be stored.
For example, if the status keys to be logged are "score" and
"elapsed", and `group` is set as "training", then the metrics
will be sent to neptune with the keys "training/score" and
"training/elapsed". `group` can also be left as None,
in which case the status will be sent to neptune with the
key names unchanged.
"""
super().__init__(searcher, interval=interval, after_first_step=after_first_step)
self._run = run
self._group = group
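Below is a minimal usage sketch, assuming the `neptune.new` client API and an already-constructed `searcher` (e.g. as in the MlflowLogger example above); the project name is a placeholder:

```python
import neptune.new as neptune

from evotorch.logging import NeptuneLogger

run = neptune.init(project="my-workspace/my-project")  # placeholder project
_ = NeptuneLogger(searcher, run, group="training")
searcher.run(50)
# With group="training", a status key such as "mean_eval" is logged
# under "training/mean_eval".
run.stop()
```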
PandasLogger (ScalarLogger)
¶
A logger which collects status information and generates a pandas.DataFrame at the end.
Source code in evotorch/logging.py
class PandasLogger(ScalarLogger):
"""A logger which collects status information and
generates a pandas.DataFrame at the end.
"""
def __init__(self, searcher: SearchAlgorithm, *, interval: int = 1, after_first_step: bool = False):
"""`__init__(...)`: Initialize the PandasLogger.
Args:
searcher: The evolutionary algorithm instance whose progress
is to be logged.
interval: Expected as an integer n.
Logging is to be done at every n iterations.
after_first_step: Expected as a boolean.
Meaningful only if interval is set as an integer greater
than 1. Let us suppose that interval is set as 10.
If after_first_step is False (which is the default),
then the logging will be done at steps 10, 20, 30, and
so on. On the other hand, if after_first_step is True,
then the logging will be done at steps 1, 11, 21, 31, and
so on.
"""
super().__init__(searcher, interval=interval, after_first_step=after_first_step)
self._data = []
def _log(self, status: dict):
self._data.append(deepcopy(status))
def to_dataframe(self, *, index: Optional[str] = "iter") -> pandas.DataFrame:
"""Generate a pandas.DataFrame from the collected
status information.
Args:
index: The column to be set as the index.
If passed as None, then no index will be set.
The default is "iter".
"""
result = pandas.DataFrame(self._data)
if index is not None:
result.set_index(index, inplace=True)
return result
__init__(self, searcher, *, interval=1, after_first_step=False)
special
¶
__init__(...)
: Initialize the PandasLogger.
Parameters:

Name | Type | Description | Default
---|---|---|---
searcher | SearchAlgorithm | The evolutionary algorithm instance whose progress is to be logged. | required
interval | int | Expected as an integer n. Logging is to be done at every n iterations. | 1
after_first_step | bool | Expected as a boolean. Meaningful only if interval is set as an integer greater than 1. Let us suppose that interval is set as 10. If after_first_step is False (which is the default), then the logging will be done at steps 10, 20, 30, and so on. On the other hand, if after_first_step is True, then the logging will be done at steps 1, 11, 21, 31, and so on. | False
Source code in evotorch/logging.py
def __init__(self, searcher: SearchAlgorithm, *, interval: int = 1, after_first_step: bool = False):
"""`__init__(...)`: Initialize the PandasLogger.
Args:
searcher: The evolutionary algorithm instance whose progress
is to be logged.
interval: Expected as an integer n.
Logging is to be done at every n iterations.
after_first_step: Expected as a boolean.
Meaningful only if interval is set as an integer greater
than 1. Let us suppose that interval is set as 10.
If after_first_step is False (which is the default),
then the logging will be done at steps 10, 20, 30, and
so on. On the other hand, if after_first_step is True,
then the logging will be done at steps 1, 11, 21, 31, and
so on.
"""
super().__init__(searcher, interval=interval, after_first_step=after_first_step)
self._data = []
to_dataframe(self, *, index='iter')
¶
Generate a pandas.DataFrame from the collected status information.
Parameters:

Name | Type | Description | Default
---|---|---|---
index | Optional[str] | The column to be set as the index. If passed as None, then no index will be set. The default is "iter". | 'iter'
Source code in evotorch/logging.py
def to_dataframe(self, *, index: Optional[str] = "iter") -> pandas.DataFrame:
"""Generate a pandas.DataFrame from the collected
status information.
Args:
index: The column to be set as the index.
If passed as None, then no index will be set.
The default is "iter".
"""
result = pandas.DataFrame(self._data)
if index is not None:
result.set_index(index, inplace=True)
return result
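Below is a minimal usage sketch, assuming an already-constructed `searcher`; the "mean_eval" column mentioned in the comment is one of the usual status keys, but the available columns depend on the search algorithm:

```python
from evotorch.logging import PandasLogger

pandas_logger = PandasLogger(searcher)
searcher.run(100)

df = pandas_logger.to_dataframe()  # indexed by the "iter" column by default
print(df.head())
# If a "mean_eval" column is present, progress can be plotted via:
# df["mean_eval"].plot()
```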
SacredLogger (ScalarLogger)
¶
A logger which stores the status via the Run object of sacred.
Source code in evotorch/logging.py
class SacredLogger(ScalarLogger):
"""A logger which stores the status via the Run object of sacred."""
def __init__(
self,
searcher: SearchAlgorithm,
run: ExpOrRun,
result: Optional[str] = None,
*,
interval: int = 1,
after_first_step: bool = False,
):
"""`__init__(...)`: Initialize the SacredLogger.
Args:
searcher: The evolutionary algorithm instance whose progress
is to be logged.
run: An instance of `sacred.run.Run` or `sacred.Experiment`,
using which the progress will be logged.
result: The key in the status dictionary whose associated
value will be registered as the current result
of the experiment.
If left as None, no result will be registered.
interval: Expected as an integer n.
Logging is to be done at every n iterations.
after_first_step: Expected as a boolean.
Meaningful only if interval is set as an integer greater
than 1. Let us suppose that interval is set as 10.
If after_first_step is False (which is the default),
then the logging will be done at steps 10, 20, 30, and
so on. On the other hand, if after_first_step is True,
then the logging will be done at steps 1, 11, 21, 31,
and so on.
"""
super().__init__(searcher, interval=interval, after_first_step=after_first_step)
self._result = result
self._run = run
def _log(self, status: dict):
for k, v in status.items():
self._run.log_scalar(k, v)
if self._result is not None:
self._run.result = status[self._result]
__init__(self, searcher, run, result=None, *, interval=1, after_first_step=False)
special
¶
__init__(...)
: Initialize the SacredLogger.
Parameters:

Name | Type | Description | Default
---|---|---|---
searcher | SearchAlgorithm | The evolutionary algorithm instance whose progress is to be logged. | required
run | Union[sacred.experiment.Experiment, sacred.run.Run] | An instance of `sacred.run.Run` or `sacred.Experiment`, using which the progress will be logged. | required
result | Optional[str] | The key in the status dictionary whose associated value will be registered as the current result of the experiment. If left as None, no result will be registered. | None
interval | int | Expected as an integer n. Logging is to be done at every n iterations. | 1
after_first_step | bool | Expected as a boolean. Meaningful only if interval is set as an integer greater than 1. Let us suppose that interval is set as 10. If after_first_step is False (which is the default), then the logging will be done at steps 10, 20, 30, and so on. On the other hand, if after_first_step is True, then the logging will be done at steps 1, 11, 21, 31, and so on. | False
Source code in evotorch/logging.py
def __init__(
self,
searcher: SearchAlgorithm,
run: ExpOrRun,
result: Optional[str] = None,
*,
interval: int = 1,
after_first_step: bool = False,
):
"""`__init__(...)`: Initialize the SacredLogger.
Args:
searcher: The evolutionary algorithm instance whose progress
is to be logged.
run: An instance of `sacred.run.Run` or `sacred.Experiment`,
using which the progress will be logged.
result: The key in the status dictionary whose associated
value will be registered as the current result
of the experiment.
If left as None, no result will be registered.
interval: Expected as an integer n.
Logging is to be done at every n iterations.
after_first_step: Expected as a boolean.
Meaningful only if interval is set as an integer greater
than 1. Let us suppose that interval is set as 10.
If after_first_step is False (which is the default),
then the logging will be done at steps 10, 20, 30, and
so on. On the other hand, if after_first_step is True,
then the logging will be done at steps 1, 11, 21, 31,
and so on.
"""
super().__init__(searcher, interval=interval, after_first_step=after_first_step)
self._result = result
self._run = run
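Below is a minimal usage sketch, assuming sacred is installed; the experiment name, the problem, and the "mean_eval" status key are illustrative:

```python
import torch
from sacred import Experiment

from evotorch import Problem
from evotorch.algorithms import SNES
from evotorch.logging import SacredLogger

ex = Experiment("evotorch_example")


@ex.main
def main():
    problem = Problem(
        "min", lambda x: torch.sum(x**2.0), solution_length=10, initial_bounds=(-1, 1)
    )
    searcher = SNES(problem, stdev_init=5.0)
    _ = SacredLogger(searcher, ex, result="mean_eval")
    searcher.run(50)


if __name__ == "__main__":
    ex.run()
```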
StdOutLogger (ScalarLogger)
¶
A logger which prints the status to the screen.
Source code in evotorch/logging.py
class StdOutLogger(ScalarLogger):
"""A logger which prints the status into the screen."""
def __init__(
self,
searcher: SearchAlgorithm,
*,
interval: int = 1,
after_first_step: bool = False,
leading_keys: Iterable[str] = ("iter",),
):
"""`__init__(...)`: Initialize the StdOutLogger.
Args:
searcher: The evolutionary algorithm instance whose progress
is to be logged.
interval: Expected as an integer n.
Logging is to be done at every n iterations.
after_first_step: Expected as a boolean.
Meaningful only if interval is set as an integer greater
than 1. Let us suppose that interval is set as 10.
If after_first_step is False (which is the default),
then the logging will be done at steps 10, 20, 30, and so on.
On the other hand, if after_first_step is True,
then the logging will be done at steps 1, 11, 21, 31, and so
on.
leading_keys: A sequence of strings where each string is a status
key. When printing the status, these keys will be shown first.
"""
super().__init__(searcher, interval=interval, after_first_step=after_first_step)
self._leading_keys = list(leading_keys)
self._leading_keys_set = set(self._leading_keys)
def _log(self, status: dict):
max_key_length = max([len(str(k)) for k in status.keys()])
def report(k, v):
nonlocal max_key_length
print(str(k).rjust(max_key_length), ":", v)
for k in self._leading_keys:
if k in status:
v = status[k]
report(k, v)
for k, v in status.items():
if k not in self._leading_keys_set:
report(k, v)
print()
__init__(self, searcher, *, interval=1, after_first_step=False, leading_keys=('iter',))
special
¶
__init__(...)
: Initialize the StdOutLogger.
Parameters:

Name | Type | Description | Default
---|---|---|---
searcher | SearchAlgorithm | The evolutionary algorithm instance whose progress is to be logged. | required
interval | int | Expected as an integer n. Logging is to be done at every n iterations. | 1
after_first_step | bool | Expected as a boolean. Meaningful only if interval is set as an integer greater than 1. Let us suppose that interval is set as 10. If after_first_step is False (which is the default), then the logging will be done at steps 10, 20, 30, and so on. On the other hand, if after_first_step is True, then the logging will be done at steps 1, 11, 21, 31, and so on. | False
leading_keys | Iterable[str] | A sequence of strings where each string is a status key. When printing the status, these keys will be shown first. | ('iter',)
Source code in evotorch/logging.py
def __init__(
self,
searcher: SearchAlgorithm,
*,
interval: int = 1,
after_first_step: bool = False,
leading_keys: Iterable[str] = ("iter",),
):
"""`__init__(...)`: Initialize the StdOutLogger.
Args:
searcher: The evolutionary algorithm instance whose progress
is to be logged.
interval: Expected as an integer n.
Logging is to be done at every n iterations.
after_first_step: Expected as a boolean.
Meaningful only if interval is set as an integer greater
than 1. Let us suppose that interval is set as 10.
If after_first_step is False (which is the default),
then the logging will be done at steps 10, 20, 30, and so on.
On the other hand, if after_first_step is True,
then the logging will be done at steps 1, 11, 21, 31, and so
on.
leading_keys: A sequence of strings where each string is a status
key. When printing the status, these keys will be shown first.
"""
super().__init__(searcher, interval=interval, after_first_step=after_first_step)
self._leading_keys = list(leading_keys)
self._leading_keys_set = set(self._leading_keys)
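Below is a minimal usage sketch, assuming an already-constructed `searcher`:

```python
from evotorch.logging import StdOutLogger

# Print the status at iterations 1, 11, 21, ... with the "iter" key first
_ = StdOutLogger(searcher, interval=10, after_first_step=True)
searcher.run(100)
```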
neuroevolution
special
¶
Various utilities for neuroevolution
gymne
¶
This namespace contains the GymNE
class.
GymNE (NEProblem)
¶
Representation of a NeuroevolutionProblem where the goal is to maximize
the total reward obtained in a gym
environment.
Source code in evotorch/neuroevolution/gymne.py
class GymNE(NEProblem):
"""
Representation of a NeuroevolutionProblem where the goal is to maximize
the total reward obtained in a `gym` environment.
"""
def __init__(
self,
env_name: str,
network: Union[str, nn.Module, Callable[[], nn.Module]],
*,
network_args: Optional[dict] = None,
env_config: Optional[Mapping] = None,
observation_normalization: bool = False,
num_episodes: int = 1,
episode_length: Optional[int] = None,
decrease_rewards_by: Optional[float] = None,
num_actors: Optional[Union[int, str]] = "max",
actor_config: Optional[dict] = None,
num_subbatches: Optional[int] = None,
subbatch_size: Optional[int] = None,
initial_bounds: Optional[BoundsPairLike] = (-0.00001, 0.00001),
):
"""
`__init__(...)`: Initialize the GymNE.
Args:
env_name: Name of the `gym` environment.
network: A network structure string, or a Callable (which can be
a class inheriting from `torch.nn.Module`, or a function
which returns a `torch.nn.Module` instance), or an instance
of `torch.nn.Module`.
The object provided here determines the structure of the
neural network policy whose parameters will be evolved.
A network structure string is a string which can be processed
by `evotorch.neuroevolution.net.str_to_net(...)`.
Please see the documentation of the function
`evotorch.neuroevolution.net.str_to_net(...)` to see what such
a neural network structure string looks like.
network_args: Optionally a dict-like object, storing keyword
arguments to be passed to the network while instantiating it.
env_config: Keyword arguments to pass to `gym.make(...)` while
creating the `gym` environment.
observation_normalization: Whether or not to do online observation
normalization.
num_episodes: Number of episodes over which a single solution will
be evaluated.
episode_length: Maximum amount of simulator interactions allowed
in a single episode. If left as None, whether or not an episode
is terminated is determined only by the `gym` environment
itself.
decrease_rewards_by: Some gym environments are defined in such a way that
the agent gets a constant reward for each timestep
it survives. This constant reward can also be called
"survival bonus". Such a rewarding scheme can lead the
evolution to local optima where the agent does nothing
but does not die either, just to collect the survival
bonuses. To prevent this, it can be desirable to
remove the survival bonuses from each reward obtained.
If this is the case with the problem at hand,
the user can set the argument `decrease_rewards_by`
to a positive float number, and that number will
be subtracted from each reward.
num_actors: Number of actors to create for parallelized
evaluation of the solutions.
One can also set this as "max", which means that
an actor will be created on each available CPU.
When the parallelization is enabled each actor will have its
own instance of the `gym` environment.
In the case of `GymNE`, the default value for this argument
is "max", which means there will be full parallelization,
utilizing all the available CPUs.
actor_config: A dictionary, representing the keyword arguments
to be passed to the options(...) used when creating the
ray actor objects. To be used for explicitly allocating
resources per each actor.
For example, for declaring that each actor is to use a GPU,
one can pass `actor_config=dict(num_gpus=1)`.
Can also be given as None (which is the default),
if no such options are to be passed.
num_subbatches: If `num_subbatches` is None (assuming that
`subbatch_size` is also None), then, when evaluating a
population, the population will be split into n pieces, `n`
being the number of actors, and each actor will evaluate
its assigned piece. If `num_subbatches` is an integer `m`,
then the population will be split into `m` pieces,
and actors will continually accept the next unevaluated
piece as they finish their current tasks.
The arguments `num_subbatches` and `subbatch_size` cannot
be given values other than None at the same time.
subbatch_size: If `subbatch_size` is None (assuming that
`num_subbatches` is also None), then, when evaluating a
population, the population will be split into `n` pieces, `n`
being the number of actors, and each actor will evaluate its
assigned piece. If `subbatch_size` is an integer `m`,
then the population will be split into pieces of size `m`,
and actors will continually accept the next unevaluated
piece as they finish their current tasks.
When there can be significant difference across the solutions
in terms of computational requirements, specifying a
`subbatch_size` can be beneficial, because, while one
actor is busy with a subbatch containing computationally
challenging solutions, other actors can accept more
tasks and save time.
The arguments `num_subbatches` and `subbatch_size` cannot
be given values other than None at the same time.
initial_bounds: Specifies an interval from which the values of the
initial policy parameters will be drawn.
"""
# Store various environment information
self._env_name = env_name
self._env_config = {} if env_config is None else deepcopy(dict(env_config))
self._decrease_rewards_by = 0.0 if decrease_rewards_by is None else float(decrease_rewards_by)
self._observation_normalization = bool(observation_normalization)
self._num_episodes = int(num_episodes)
self._episode_length = None if episode_length is None else int(episode_length)
self._info_keys = dict(cumulative_reward="avg", interaction_count="sum")
self._env: Optional[gym.Env] = None
self._obs_stats: Optional[RunningStat] = None
self._collected_stats: Optional[RunningStat] = None
# Create a temporary environment to read its dimensions
tmp_env = gym.make(self._env_name, **(self._env_config))
# Store the temporary environment's dimensions
self._obs_length = len(tmp_env.observation_space.low)
if isinstance(tmp_env.action_space, gym.spaces.Discrete):
self._act_length = tmp_env.action_space.n
else:
self._act_length = len(tmp_env.action_space.low)
self._obs_shape = tmp_env.observation_space.low.shape
# Validate the space types of the environment
ensure_space_types(tmp_env)
if self._observation_normalization:
self._obs_stats = RunningStat()
self._collected_stats = RunningStat()
else:
self._obs_stats = None
self._collected_stats = None
self._interaction_count: int = 0
self._episode_count: int = 0
super().__init__(
objective_sense="max", # RL is maximization
network=network, # Using the policy as the network
network_args=network_args,
initial_bounds=initial_bounds,
num_actors=num_actors,
actor_config=actor_config,
subbatch_size=subbatch_size,
device="cpu",
)
self.after_eval_hook.append(self._extra_status)
@property
def _network_constants(self) -> dict:
return {"obs_length": self._obs_length, "act_length": self._act_length, "obs_space": self._obs_shape}
def _get_env(self) -> gym.Env:
if self._env is None:
self._env = gym.make(self._env_name, **(self._env_config))
return self._env
def _normalize_observation(self, observation: Iterable, *, update_stats: bool = True) -> Iterable:
observation = np.asarray(observation, dtype="float32")
if self.observation_normalization:
if update_stats:
self._obs_stats.update(observation)
self._collected_stats.update(observation)
return self._obs_stats.normalize(observation)
else:
return observation
def _use_policy(self, observation: Iterable, policy: nn.Module) -> Iterable:
with torch.no_grad():
result = policy(torch.as_tensor(observation, dtype=torch.float32, device="cpu")).numpy()
env = self._get_env()
if isinstance(env.action_space, gym.spaces.Discrete):
result = np.argmax(result)
elif isinstance(env.action_space, gym.spaces.Box):
result = np.clip(result, env.action_space.low, env.action_space.high)
return result
def _prepare(self) -> None:
super()._prepare()
self._get_env()
def _rollout(
self,
*,
policy: nn.Module,
update_stats: bool = True,
visualize: bool = False,
decrease_rewards_by: Optional[float] = None,
) -> dict:
"""Peform a rollout of a network"""
if decrease_rewards_by is None:
decrease_rewards_by = self._decrease_rewards_by
else:
decrease_rewards_by = float(decrease_rewards_by)
reset_module_state(policy)
env = self._get_env()
observation = self._normalize_observation(reset_env(env), update_stats=update_stats)
if visualize:
env.render()
t = 0
cumulative_reward = 0.0
while True:
observation, raw_reward, done, info = take_step_in_env(env, self._use_policy(observation, policy))
reward = raw_reward - decrease_rewards_by
t += 1
if update_stats:
self._interaction_count += 1
if visualize:
env.render()
observation = self._normalize_observation(observation, update_stats=update_stats)
cumulative_reward += reward
if done or ((self._episode_length is not None) and (t >= self._episode_length)):
if update_stats:
self._episode_count += 1
final_info = dict(cumulative_reward=cumulative_reward, interaction_count=t)
for k in self._info_keys:
if k not in final_info:
final_info[k] = info[k]
return final_info
@property
def _nonserialized_attribs(self) -> List[str]:
return super()._nonserialized_attribs + ["_env"]
def run(
self,
policy: nn.Module,
*,
update_stats: bool = False,
visualize: bool = False,
num_episodes: Optional[int] = None,
decrease_rewards_by: Optional[float] = None,
) -> dict:
"""Evaluate the policy parameters on the gym environment."""
if num_episodes is None:
num_episodes = self._num_episodes
try:
policy.eval()
episode_results = [
self._rollout(
policy=policy,
update_stats=update_stats,
visualize=visualize,
decrease_rewards_by=decrease_rewards_by,
)
for _ in range(num_episodes)
]
results = _accumulate_all_across_dicts(episode_results, self._info_keys)
return results
finally:
policy.train()
def visualize(
self,
policy: nn.Module,
*,
update_stats: bool = False,
num_episodes: Optional[int] = 1,
decrease_rewards_by: Optional[float] = None,
) -> dict:
return self.run(
policy=policy,
update_stats=update_stats,
visualize=True,
num_episodes=num_episodes,
decrease_rewards_by=decrease_rewards_by,
)
def _ensure_obsnorm(self):
if not self.observation_normalization:
raise ValueError("This feature can only be used when observation_normalization=True.")
def get_observation_stats(self) -> RunningStat:
"""Get the observation stats"""
self._ensure_obsnorm()
return self._obs_stats
def _make_sync_data_for_actors(self) -> Any:
if self.observation_normalization:
return dict(obs_stats=self.get_observation_stats())
else:
return None
def set_observation_stats(self, rs: RunningStat):
"""Set the observation stats"""
self._ensure_obsnorm()
self._obs_stats.reset()
self._obs_stats.update(rs)
def _use_sync_data_from_main(self, received: dict):
for k, v in received.items():
if k == "obs_stats":
self.set_observation_stats(v)
def pop_observation_stats(self) -> RunningStat:
"""Get and clear the collected observation stats"""
self._ensure_obsnorm()
result = self._collected_stats
self._collected_stats = RunningStat()
return result
def _make_sync_data_for_main(self) -> Any:
result = dict(episode_count=self.episode_count, interaction_count=self.interaction_count)
if self.observation_normalization:
result["obs_stats_delta"] = self.pop_observation_stats()
return result
def update_observation_stats(self, rs: RunningStat):
"""Update the observation stats via another RunningStat instance"""
self._ensure_obsnorm()
self._obs_stats.update(rs)
def _use_sync_data_from_actors(self, received: list):
total_episode_count = 0
total_interaction_count = 0
for data in received:
data: dict
total_episode_count += data["episode_count"]
total_interaction_count += data["interaction_count"]
if self.observation_normalization:
self.update_observation_stats(data["obs_stats_delta"])
self.set_episode_count(total_episode_count)
self.set_interaction_count(total_interaction_count)
def _make_pickle_data_for_main(self) -> dict:
# For when the main Problem object (the non-remote one) gets pickled,
# this function returns the counters of this remote Problem instance,
# to be sent to the main one.
return dict(interaction_count=self.interaction_count, episode_count=self.episode_count)
def _use_pickle_data_from_main(self, state: dict):
# For when a newly unpickled Problem object gets (re)parallelized,
# this function restores the inner states specific to this remote
# worker. In the case of GymNE, those inner states are episode
# and interaction counters.
for k, v in state.items():
if k == "episode_count":
self.set_episode_count(v)
elif k == "interaction_count":
self.set_interaction_count(v)
else:
raise ValueError(f"When restoring the inner state of a remote worker, unrecognized state key: {k}")
def _extra_status(self, batch: SolutionBatch):
return dict(total_interaction_count=self.interaction_count, total_episode_count=self.episode_count)
@property
def observation_normalization(self) -> bool:
"""
Get whether or not observation normalization is enabled.
"""
return self._observation_normalization
def set_episode_count(self, n: int):
"""
Set the episode count manually.
"""
self._episode_count = int(n)
def set_interaction_count(self, n: int):
"""
Set the interaction count manually.
"""
self._interaction_count = int(n)
@property
def interaction_count(self) -> int:
"""
Get the total number of simulator interactions made.
"""
return self._interaction_count
@property
def episode_count(self) -> int:
"""
Get the total number of episodes completed.
"""
return self._episode_count
def _get_local_episode_count(self) -> int:
return self.episode_count
def _get_local_interaction_count(self) -> int:
return self.interaction_count
def _evaluate_network(self, policy: nn.Module) -> Union[float, torch.Tensor]:
result = self.run(
policy,
update_stats=True,
visualize=False,
num_episodes=self._num_episodes,
decrease_rewards_by=self._decrease_rewards_by,
)
return result["cumulative_reward"]
def to_policy(self, x: Iterable, *, trainable_stats: bool = False, clip_actions: bool = True) -> nn.Module:
"""
Convert the given parameter vector to a policy as a PyTorch module.
If the problem is configured to have observation normalization,
the PyTorch module also contains an additional normalization layer.
Args:
x: A sequence of real numbers, containing the parameters
of a policy. Can be a PyTorch tensor, a numpy array,
or a SolutionVector.
trainable_stats: Whether or not the observation stats within
the observation normalization layer are to be stored as
trainable parameters.
clip_actions: Whether or not to add an action clipping layer so
that the generated actions will always be within an
acceptable range for the environment.
Returns:
The policy expressed by the parameters.
"""
policy = [self.make_net(x)]
if self.observation_normalization:
policy.insert(0, ObsNormLayer(self._obs_stats, trainable_stats=trainable_stats))
if clip_actions and isinstance(self._get_env().action_space, gym.spaces.Box):
policy.append(ActClipLayer(self._get_env().action_space))
if len(policy) == 1:
return policy[0]
else:
return nn.Sequential(*policy)
def get_env(self) -> gym.Env:
"""
Get the gym environment stored by this GymNE instance
"""
return self._get_env()
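Below is a minimal usage sketch. The environment name, the network structure string, and the hyperparameters are illustrative ("obs_length" and "act_length" in the structure string are constants provided by GymNE itself); running with multiple actors assumes that ray is available:

```python
from evotorch.algorithms import SNES
from evotorch.logging import StdOutLogger
from evotorch.neuroevolution import GymNE

problem = GymNE(
    env_name="CartPole-v1",
    network="Linear(obs_length, act_length)",
    observation_normalization=True,
    num_actors=4,
)

searcher = SNES(problem, popsize=50, stdev_init=0.1)
_ = StdOutLogger(searcher)
searcher.run(30)
```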
episode_count: int
property
readonly
¶
Get the total number of episodes completed.
interaction_count: int
property
readonly
¶
Get the total number of simulator interactions made.
observation_normalization: bool
property
readonly
¶
Get whether or not observation normalization is enabled.
__init__(self, env_name, network, *, network_args=None, env_config=None, observation_normalization=False, num_episodes=1, episode_length=None, decrease_rewards_by=None, num_actors='max', actor_config=None, num_subbatches=None, subbatch_size=None, initial_bounds=(-1e-05, 1e-05))
special
¶
__init__(...)
: Initialize the GymNE.
Parameters:

Name | Type | Description | Default
---|---|---|---
env_name | str | Name of the `gym` environment. | required
network | Union[str, torch.nn.modules.module.Module, Callable[[], torch.nn.modules.module.Module]] | A network structure string, or a Callable (which can be a class inheriting from `torch.nn.Module`, or a function which returns a `torch.nn.Module` instance), or an instance of `torch.nn.Module`. The object provided here determines the structure of the neural network policy whose parameters will be evolved. A network structure string is a string which can be processed by `evotorch.neuroevolution.net.str_to_net(...)`. Please see the documentation of the function `evotorch.neuroevolution.net.str_to_net(...)` to see what such a neural network structure string looks like. | required
network_args | Optional[dict] | Optionally a dict-like object, storing keyword arguments to be passed to the network while instantiating it. | None
env_config | Optional[collections.abc.Mapping] | Keyword arguments to pass to `gym.make(...)` while creating the `gym` environment. | None
observation_normalization | bool | Whether or not to do online observation normalization. | False
num_episodes | int | Number of episodes over which a single solution will be evaluated. | 1
episode_length | Optional[int] | Maximum amount of simulator interactions allowed in a single episode. If left as None, whether or not an episode is terminated is determined only by the `gym` environment itself. | None
decrease_rewards_by | Optional[float] | Some gym environments are defined in such a way that the agent gets a constant reward for each timestep it survives. This constant reward can also be called "survival bonus". Such a rewarding scheme can lead the evolution to local optima where the agent does nothing but does not die either, just to collect the survival bonuses. To prevent this, it can be desirable to remove the survival bonuses from each reward obtained. If this is the case with the problem at hand, the user can set the argument `decrease_rewards_by` to a positive float number, and that number will be subtracted from each reward. | None
num_actors | Union[int, str] | Number of actors to create for parallelized evaluation of the solutions. One can also set this as "max", which means that an actor will be created on each available CPU. When the parallelization is enabled, each actor will have its own instance of the `gym` environment. In the case of `GymNE`, the default value for this argument is "max", which means there will be full parallelization, utilizing all the available CPUs. | 'max'
actor_config | Optional[dict] | A dictionary, representing the keyword arguments to be passed to the options(...) used when creating the ray actor objects. To be used for explicitly allocating resources per each actor. For example, for declaring that each actor is to use a GPU, one can pass `actor_config=dict(num_gpus=1)`. Can also be given as None (which is the default), if no such options are to be passed. | None
num_subbatches | Optional[int] | If `num_subbatches` is None (assuming that `subbatch_size` is also None), then, when evaluating a population, the population will be split into `n` pieces, `n` being the number of actors, and each actor will evaluate its assigned piece. If `num_subbatches` is an integer `m`, then the population will be split into `m` pieces, and actors will continually accept the next unevaluated piece as they finish their current tasks. The arguments `num_subbatches` and `subbatch_size` cannot be given values other than None at the same time. | None
subbatch_size | Optional[int] | If `subbatch_size` is None (assuming that `num_subbatches` is also None), then, when evaluating a population, the population will be split into `n` pieces, `n` being the number of actors, and each actor will evaluate its assigned piece. If `subbatch_size` is an integer `m`, then the population will be split into pieces of size `m`, and actors will continually accept the next unevaluated piece as they finish their current tasks. When there can be significant difference across the solutions in terms of computational requirements, specifying a `subbatch_size` can be beneficial, because, while one actor is busy with a subbatch containing computationally challenging solutions, other actors can accept more tasks and save time. The arguments `num_subbatches` and `subbatch_size` cannot be given values other than None at the same time. | None
initial_bounds | Union[Iterable[Union[float, Iterable[float], torch.Tensor]], evotorch.core.BoundsPair] | Specifies an interval from which the values of the initial policy parameters will be drawn. | (-1e-05, 1e-05)
Source code in evotorch/neuroevolution/gymne.py
def __init__(
self,
env_name: str,
network: Union[str, nn.Module, Callable[[], nn.Module]],
*,
network_args: Optional[dict] = None,
env_config: Optional[Mapping] = None,
observation_normalization: bool = False,
num_episodes: int = 1,
episode_length: Optional[int] = None,
decrease_rewards_by: Optional[float] = None,
num_actors: Optional[Union[int, str]] = "max",
actor_config: Optional[dict] = None,
num_subbatches: Optional[int] = None,
subbatch_size: Optional[int] = None,
initial_bounds: Optional[BoundsPairLike] = (-0.00001, 0.00001),
):
"""
`__init__(...)`: Initialize the GymNE.
Args:
env_name: Name of the `gym` environment.
network: A network structure string, or a Callable (which can be
a class inheriting from `torch.nn.Module`, or a function
which returns a `torch.nn.Module` instance), or an instance
of `torch.nn.Module`.
The object provided here determines the structure of the
neural network policy whose parameters will be evolved.
A network structure string is a string which can be processed
by `evotorch.neuroevolution.net.str_to_net(...)`.
Please see the documentation of the function
`evotorch.neuroevolution.net.str_to_net(...)` to see what such
a neural network structure string looks like.
network_args: Optionally a dict-like object, storing keyword
arguments to be passed to the network while instantiating it.
env_config: Keyword arguments to pass to `gym.make(...)` while
creating the `gym` environment.
observation_normalization: Whether or not to do online observation
normalization.
num_episodes: Number of episodes over which a single solution will
be evaluated.
episode_length: Maximum amount of simulator interactions allowed
in a single episode. If left as None, whether or not an episode
is terminated is determined only by the `gym` environment
itself.
decrease_rewards_by: Some gym environments are defined in such a way that
the agent gets a constant reward for each timestep
it survives. This constant reward can also be called
"survival bonus". Such a rewarding scheme can lead the
evolution to local optima where the agent does nothing
but does not die either, just to collect the survival
bonuses. To prevent this, it can be desirable to
remove the survival bonuses from each reward obtained.
If this is the case with the problem at hand,
the user can set the argument `decrease_rewards_by`
to a positive float number, and that number will
be subtracted from each reward.
num_actors: Number of actors to create for parallelized
evaluation of the solutions.
One can also set this as "max", which means that
an actor will be created on each available CPU.
When the parallelization is enabled each actor will have its
own instance of the `gym` environment.
In the case of `GymNE`, the default value for this argument
is "max", which means there will be full parallelization,
utilizing all the available CPUs.
actor_config: A dictionary, representing the keyword arguments
to be passed to the options(...) used when creating the
ray actor objects. To be used for explicitly allocating
resources per each actor.
For example, for declaring that each actor is to use a GPU,
one can pass `actor_config=dict(num_gpus=1)`.
Can also be given as None (which is the default),
if no such options are to be passed.
num_subbatches: If `num_subbatches` is None (assuming that
`subbatch_size` is also None), then, when evaluating a
population, the population will be split into n pieces, `n`
being the number of actors, and each actor will evaluate
its assigned piece. If `num_subbatches` is an integer `m`,
then the population will be split into `m` pieces,
and actors will continually accept the next unevaluated
piece as they finish their current tasks.
The arguments `num_subbatches` and `subbatch_size` cannot
be given values other than None at the same time.
subbatch_size: If `subbatch_size` is None (assuming that
`num_subbatches` is also None), then, when evaluating a
population, the population will be split into `n` pieces, `n`
being the number of actors, and each actor will evaluate its
assigned piece. If `subbatch_size` is an integer `m`,
then the population will be split into pieces of size `m`,
and actors will continually accept the next unevaluated
piece as they finish their current tasks.
When there can be significant difference across the solutions
in terms of computational requirements, specifying a
`subbatch_size` can be beneficial, because, while one
actor is busy with a subbatch containing computationally
challenging solutions, other actors can accept more
tasks and save time.
The arguments `num_subbatches` and `subbatch_size` cannot
be given values other than None at the same time.
initial_bounds: Specifies an interval from which the values of the
initial policy parameters will be drawn.
"""
# Store various environment information
self._env_name = env_name
self._env_config = {} if env_config is None else deepcopy(dict(env_config))
self._decrease_rewards_by = 0.0 if decrease_rewards_by is None else float(decrease_rewards_by)
self._observation_normalization = bool(observation_normalization)
self._num_episodes = int(num_episodes)
self._episode_length = None if episode_length is None else int(episode_length)
self._info_keys = dict(cumulative_reward="avg", interaction_count="sum")
self._env: Optional[gym.Env] = None
self._obs_stats: Optional[RunningStat] = None
self._collected_stats: Optional[RunningStat] = None
# Create a temporary environment to read its dimensions
tmp_env = gym.make(self._env_name, **(self._env_config))
# Store the temporary environment's dimensions
self._obs_length = len(tmp_env.observation_space.low)
if isinstance(tmp_env.action_space, gym.spaces.Discrete):
self._act_length = tmp_env.action_space.n
else:
self._act_length = len(tmp_env.action_space.low)
self._obs_shape = tmp_env.observation_space.low.shape
# Validate the space types of the environment
ensure_space_types(tmp_env)
if self._observation_normalization:
self._obs_stats = RunningStat()
self._collected_stats = RunningStat()
else:
self._obs_stats = None
self._collected_stats = None
self._interaction_count: int = 0
self._episode_count: int = 0
super().__init__(
objective_sense="max", # RL is maximization
network=network, # Using the policy as the network
network_args=network_args,
initial_bounds=initial_bounds,
num_actors=num_actors,
actor_config=actor_config,
subbatch_size=subbatch_size,
device="cpu",
)
self.after_eval_hook.append(self._extra_status)
get_env(self)
¶
Get the gym environment stored by this GymNE instance.
get_observation_stats(self)
¶
Get the observation stats.
pop_observation_stats(self)
¶
Get and clear the collected observation stats.
run(self, policy, *, update_stats=False, visualize=False, num_episodes=None, decrease_rewards_by=None)
¶
Evaluate the policy parameters on the gym environment.
Source code in evotorch/neuroevolution/gymne.py
def run(
self,
policy: nn.Module,
*,
update_stats: bool = False,
visualize: bool = False,
num_episodes: Optional[int] = None,
decrease_rewards_by: Optional[float] = None,
) -> dict:
"""Evaluate the policy parameters on the gym environment."""
if num_episodes is None:
num_episodes = self._num_episodes
try:
policy.eval()
episode_results = [
self._rollout(
policy=policy,
update_stats=update_stats,
visualize=visualize,
decrease_rewards_by=decrease_rewards_by,
)
for _ in range(num_episodes)
]
results = _accumulate_all_across_dicts(episode_results, self._info_keys)
return results
finally:
policy.train()
set_episode_count(self, n)
¶
Set the episode count manually.
set_interaction_count(self, n)
¶
Set the interaction count manually.
set_observation_stats(self, rs)
¶
Set the observation stats.
to_policy(self, x, *, trainable_stats=False, clip_actions=True)
¶
Convert the given parameter vector to a policy as a PyTorch module.
If the problem is configured to have observation normalization, the PyTorch module also contains an additional normalization layer.
Parameters:

Name | Type | Description | Default
---|---|---|---
x | Iterable | A sequence of real numbers, containing the parameters of a policy. Can be a PyTorch tensor, a numpy array, or a SolutionVector. | required
trainable_stats | bool | Whether or not the observation stats within the observation normalization layer are to be stored as trainable parameters. | False
clip_actions | bool | Whether or not to add an action clipping layer so that the generated actions will always be within an acceptable range for the environment. | True

Returns:

Type | Description
---|---
Module | The policy expressed by the parameters.
Source code in evotorch/neuroevolution/gymne.py
def to_policy(self, x: Iterable, *, trainable_stats: bool = False, clip_actions: bool = True) -> nn.Module:
"""
Convert the given parameter vector to a policy as a PyTorch module.
If the problem is configured to have observation normalization,
the PyTorch module also contains an additional normalization layer.
Args:
x: A sequence of real numbers, containing the parameters
of a policy. Can be a PyTorch tensor, a numpy array,
or a SolutionVector.
trainable_stats: Whether or not the observation stats within
the observation normalization layer are to be stored as
trainable parameters.
clip_actions: Whether or not to add an action clipping layer so
that the generated actions will always be within an
acceptable range for the environment.
Returns:
The policy expressed by the parameters.
"""
policy = [self.make_net(x)]
if self.observation_normalization:
policy.insert(0, ObsNormLayer(self._obs_stats, trainable_stats=trainable_stats))
if clip_actions and isinstance(self._get_env().action_space, gym.spaces.Box):
policy.append(ActClipLayer(self._get_env().action_space))
if len(policy) == 1:
return policy[0]
else:
return nn.Sequential(*policy)
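As a follow-up sketch, once a distribution-based searcher (e.g. SNES or PGPE) has been run on a GymNE problem, a deployable policy can be obtained from the search distribution's center, which such searchers expose via their status dictionary:

```python
center = searcher.status["center"]
policy = problem.to_policy(center)  # includes obs-norm / action-clip layers

# Test (and render) the policy in the environment:
results = problem.visualize(policy)
print(results["cumulative_reward"])
```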
update_observation_stats(self, rs)
¶
Update the observation stats via another RunningStat instance.
neproblem
¶
This namespace contains the NeuroevolutionProblem
class.
NEProblem (Problem)
¶
Base class for neuro-evolution problems where the goal is to optimize the parameters of a neural network represented as a PyTorch module.
Any problem inheriting from this class is expected to override the method
`_evaluate_network(self, net: torch.nn.Module) -> Union[torch.Tensor, float]`,
where `net` is the neural network to be evaluated, and the return value
is a scalar or a vector (for multi-objective cases) expressing the
fitness value(s).
Alternatively, this class can be directly instantiated in the following way:

```python
def f(module: MyTorchModuleClass) -> Union[float, torch.Tensor, tuple]:
    # Evaluate the given PyTorch module here
    fitness = ...
    return fitness


problem = NEProblem(
    "min", MyTorchModuleClass, f,
    ...
)
```

which specifies that the problem's goal is to minimize the return of the
function `f`.
For multi-objective cases, the fitness returned by `f` is expected as a
1-dimensional tensor. For when the problem has additional evaluation data,
a two-element tuple can be returned by `f` instead, where the first
element is the fitness value(s) and the second element is a 1-dimensional
tensor storing the additional data.
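Below is a runnable version of the sketch above; the network, the fixed input, and the fitness function are illustrative:

```python
import torch
from torch import nn

from evotorch.algorithms import SNES
from evotorch.neuroevolution import NEProblem


def f(module: nn.Module) -> torch.Tensor:
    # Reward networks whose output for a fixed input is close to zero
    x = torch.ones(1, 4)
    with torch.no_grad():
        return torch.sum(module(x) ** 2.0)


problem = NEProblem("min", nn.Linear(4, 2), f)
searcher = SNES(problem, stdev_init=0.1)
searcher.run(20)
```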
Source code in evotorch/neuroevolution/neproblem.py
class NEProblem(Problem):
"""
Base class for neuro-evolution problems where the goal is to optimize the
parameters of a neural network represented as a PyTorch module.
Any problem inheriting from this class is expected to override the method
`_evaluate_network(self, net: torch.nn.Module) -> Union[torch.Tensor, float]`
where `net` is the neural network to be evaluated, and the return value
is a scalar or a vector (for multi-objective cases) expressing the
fitness value(s).
Alternatively, this class can be directly instantiated in the following
way:
```python
def f(module: MyTorchModuleClass) -> Union[float, torch.Tensor, tuple]:
# Evaluate the given PyTorch module here
fitness = ...
return fitness
problem = NEProblem(
"min", MyTorchModuleClass, f,
...
)
```
which specifies that the problem's goal is to minimize the return of the
function `f`.
For multi-objective cases, the fitness returned by `f` is expected as a
1-dimensional tensor. For when the problem has additional evaluation data,
a two-element tuple can be returned by `f` instead, where the first
element is the fitness value(s) and the second element is a 1-dimensional
tensor storing the additional data.
"""
def __init__(
self,
objective_sense: ObjectiveSense,
network: Union[str, nn.Module, Callable[[], nn.Module]],
network_eval_func: Optional[Callable] = None,
*,
network_args: Optional[dict] = None,
initial_bounds: Optional[BoundsPairLike] = (-0.00001, 0.00001),
eval_dtype: Optional[DType] = None,
eval_data_length: int = 0,
seed: Optional[int] = None,
num_actors: Optional[Union[int, str]] = "num_devices",
actor_config: Optional[dict] = None,
num_gpus_per_actor: Optional[Union[int, float, str]] = None,
num_subbatches: Optional[int] = None,
subbatch_size: Optional[int] = None,
device: Optional[Device] = None,
):
"""
`__init__(...)`: Initialize the NEProblem.
Args:
objective_sense: The objective sense, expected as "min" or "max"
for single-objective cases, or as a sequence of strings
(each string being "min" or "max") for multi-objective cases.
network: A network structure string, or a Callable (which can be
a class inheriting from `torch.nn.Module`, or a function
which returns a `torch.nn.Module` instance), or an instance
of `torch.nn.Module`.
The object provided here determines the structure of the
neural network whose parameters will be evolved.
A network structure string is a string which can be processed
by `evotorch.neuroevolution.net.str_to_net(...)`.
Please see the documentation of the function
`evotorch.neuroevolution.net.str_to_net(...)` to see what such
a neural network structure string looks like.
network_eval_func: Optionally a function (or any Callable object)
which receives a PyTorch module as its argument, and returns
either a fitness, or a two-element tuple containing the fitness
and the additional evaluation data. The fitness can be a scalar
(for single-objective cases) or a 1-dimensional tensor (for
multi-objective cases). The additional evaluation data is
expected as a 1-dimensional tensor.
If this argument is left as None, it will be expected that
the method `_evaluate_network(...)` is overridden by the
inheriting class.
network_args: Optionally a dict-like object, storing keyword
arguments to be passed to the network while instantiating it.
initial_bounds: Specifies an interval from which the values of the
initial neural network parameters will be drawn.
eval_dtype: dtype to be used for fitnesses. If not specified, then
`eval_dtype` will be inferred from the dtype of the parameters
of the neural network.
In more details, if the neural network's parameters have a
float dtype, `eval_dtype` will be a compatible float.
Otherwise, it will be "float32".
eval_data_length: Length of the extra evaluation data.
seed: Random number seed. If left as None, this NEProblem instance
will not have its own random generator, and the global random
generator of PyTorch will be used instead.
num_actors: Number of actors to create for parallelized
evaluation of the solutions.
Certain string values are also accepted.
When given as "max" or as "num_cpus", the number of actors
will be equal to the number of all available CPUs in the ray
cluster.
When given as "num_gpus", the number of actors will be
equal to the number of all available GPUs in the ray
cluster, and each actor will be assigned a GPU.
When given as "num_devices", the number of actors will be
equal to the minimum among the number of CPUs and the number
of GPUs available in the cluster (or will be equal to the
number of CPUs if there is no GPU), and each actor will be
assigned a GPU (if available).
If `num_actors` is given as "num_gpus" or "num_devices",
the argument `num_gpus_per_actor` must not be used,
and the `actor_config` dictionary must not contain the
key "num_gpus".
If `num_actors` is given as something other than "num_gpus"
or "num_devices", and if you wish to assign GPUs to each
actor, then please see the argument `num_gpus_per_actor`.
actor_config: A dictionary, representing the keyword arguments
to be passed to the options(...) used when creating the
ray actor objects. To be used for explicitly allocating
resources per each actor.
For example, for declaring that each actor is to use a GPU,
one can pass `actor_config=dict(num_gpus=1)`.
Can also be given as None (which is the default),
if no such options are to be passed.
num_gpus_per_actor: Number of GPUs to be allocated by each
remote actor.
The default behavior is to NOT allocate any GPU at all
(which is the default behavior of the ray library as well).
When given as a number `n`, each actor will be given
`n` GPUs (where `n` can be an integer, or can be a `float`
for fractional allocation).
When given as a string "max", then the available GPUs
across the entire ray cluster (or within the local computer
in the simplest cases) will be equally distributed among
the actors.
When given as a string "all", then each actor will have
access to all the GPUs (this will be achieved by suppressing
the environment variable `CUDA_VISIBLE_DEVICES` for each
actor).
When the problem is not distributed (i.e. when there are
no actors), this argument is expected to be left as None.
num_subbatches: If `num_subbatches` is None (assuming that
`subbatch_size` is also None), then, when evaluating a
population, the population will be split into n pieces, `n`
being the number of actors, and each actor will evaluate
its assigned piece. If `num_subbatches` is an integer `m`,
then the population will be split into `m` pieces,
and actors will continually accept the next unevaluated
piece as they finish their current tasks.
The arguments `num_subbatches` and `subbatch_size` cannot
be given values other than None at the same time.
While using a distributed algorithm, this argument determines
how many sub-batches will be generated, and therefore,
how many gradients will be computed by the remote actors.
subbatch_size: If `subbatch_size` is None (assuming that
`num_subbatches` is also None), then, when evaluating a
population, the population will be split into `n` pieces, `n`
being the number of actors, and each actor will evaluate its
assigned piece. If `subbatch_size` is an integer `m`,
then the population will be split into pieces of size `m`,
and actors will continually accept the next unevaluated
piece as they finish their current tasks.
When there can be significant difference across the solutions
in terms of computational requirements, specifying a
`subbatch_size` can be beneficial, because, while one
actor is busy with a subbatch containing computationally
challenging solutions, other actors can accept more
tasks and save time.
The arguments `num_subbatches` and `subbatch_size` cannot
be given values other than None at the same time.
While using a distributed algorithm, this argument determines
the size of a sub-batch (or sub-population) sampled by a
remote actor for computing a gradient.
In distributed mode, it is expected that the population size
is divisible by `subbatch_size`.
device: Default device in which a new population will be generated
and the neural networks will operate.
If not specified, "cpu" will be used.
"""
# Set the main device of the problem
# Although the operation of setting the main device is done by the main Problem class,
# here we need this at an earlier stage.
if device is None:
device = "cpu"
self._device = torch.device(device)
# Set the network
self._original_network = network
self._network_args = {} if network_args is None else deepcopy(network_args)
if isinstance(self._original_network, nn.Module):
self._original_network = self._original_network.cpu()
# Store the function that will evaluate the network, if available
self._network_eval_func: Optional[Callable] = network_eval_func
self.instantiated_network: nn.Module = None
# Create temporary network
temp_network = self._instantiate_net(self._original_network, device="cpu")
super().__init__(
objective_sense=objective_sense,
initial_bounds=initial_bounds,
bounds=None, # Neuroevolution is an unbounded problem
solution_length=count_parameters(temp_network), # The solution length is inherited from the network passed
dtype=next(temp_network.parameters()).dtype, # The datatype is inherited from the network passed
eval_dtype=eval_dtype,
device=device,
eval_data_length=eval_data_length,
seed=seed,
num_actors=num_actors,
num_gpus_per_actor=num_gpus_per_actor,
actor_config=actor_config,
num_subbatches=num_subbatches,
subbatch_size=subbatch_size,
store_solution_stats=None,
)
@property
def network_device(self) -> Device:
"""The device on which the problem should place data e.g. the network"""
cpu_device = torch.device("cpu")
if self.is_main:
# This is the case where this is the main process (not a remote actor)
if self.device == cpu_device:
# If the main device of the problem is "cpu", then we assume that the network is going to be on the cpu as well
return cpu_device
else:
# If the main device of the problem is some other device, then it is that device into which the network will be put
return self.device
else:
# If this is a remote actor, then the network will be put into the auxiliary device allocated for that actor
return self.aux_device
@property
def _network_constants(self) -> dict:
"""Named constants which can be passed to the network instantiation e.g. input/output dimension. To be overridden by the user for custom fixed constants for a problem"""
return {}
def network_constants(self) -> dict:
"""Named constants which can be passed to the network instantiation e.g. input/output dimension"""
constants = {}
constants.update(self._network_constants)
constants.update(self._network_args)
return constants
@property
def _nonserialized_attribs(self) -> List[str]:
return ["instantiated_network"]
def _instantiate_net(self, network: Union[str, nn.Module, dict], device: Optional[Device] = None) -> nn.Module:
"""Instantiate the network on the target device, to be overridden by the user for custom behaviour
Returns:
instantiated_network (nn.Module): The network instantiated on the target device
"""
# Branching point determines instantiation of network
if isinstance(network, str):
# Passed argument was a string representation of a torch module
instantiated_network = str_to_net(network, **self.network_constants())
elif isinstance(network, nn.Module):
# Passed argument was directly a torch module
instantiated_network = network
else:
# Passed argument was callable yielding network
instantiated_network = network(**self.network_constants())
# Map to device
device = self.network_device if device is None else device
instantiated_network = instantiated_network.to(device)
return instantiated_network
def _prepare(self) -> None:
"""Instantiate the network on the target device, if not already done"""
self.instantiated_network = self._instantiate_net(self._original_network)
# Clear reference to original network
self._original_network = None
def make_net(self, parameters: Iterable) -> nn.Module:
"""
Make a new network filled with the provided parameters.
Args:
parameters: Parameters to be used as weights within the network.
Can be a Solution, or any 1-dimensional Iterable that can be
converted to a PyTorch tensor.
Returns:
A new network, as a `torch.nn.Module` instance.
"""
if isinstance(parameters, Solution):
parameters = parameters.access_values(keep_evals=True)
else:
parameters = self.as_tensor(parameters)
with torch.no_grad():
net = deepcopy(self.parameterize_net(parameters))
return net
def parameterize_net(self, parameters: torch.Tensor) -> nn.Module:
"""Parameterize the network with a given set of parameters.
Args:
parameters (torch.Tensor): The parameters with which to instantiate the network
Returns:
instantiated_network (nn.Module): The network instantiated with the parameters
"""
# Check if network exists
if self.instantiated_network is None:
self.instantiated_network = self._instantiate_net(self._original_network)
network = self.instantiated_network
# Move the parameters if needed
if parameters.device != self.network_device:
parameters = parameters.to(self.network_device)
# Fill the network with the parameters
fill_parameters(network, parameters)
# Return the network
return network
@property
def _grad_device(self) -> Device:
"""
Get the device in which new solutions will be made in distributed mode.
In more details, in distributed mode, each actor creates its own
sub-populations, evaluates them, and computes its own gradient
(all such actor gradients eventually being collected by the
distribution-based search algorithm in the main process).
For some problem types, it can make sense for the remote actors to
create their temporary sub-populations on another device
(e.g. on the GPU that is allocated specifically for them).
For such situations, one is encouraged to override this property
and make it return whatever device is to be used.
In the case of NEProblem, this property returns whatever device
is specified by the property `network_device`.
"""
return self.network_device
def _evaluate_network(self, network: nn.Module) -> Union[float, torch.Tensor, tuple]:
"""
Evaluate a network and return the evaluation result(s).
In the case where the `__init__` of `NEProblem` was not given
a network evaluator function (via the argument `network_eval_func`),
it will be expected that the inheriting class overrides this
method and defines how a network should be evaluated.
Args:
network (nn.Module): The network to evaluate
Returns:
fitness: The network's fitness value(s), as a scalar for
single-objective cases, or as a 1-dimensional tensor
for multi-objective cases. The returned value can also
be a two-element tuple where the first element is the
fitness (as a scalar or as a vector) and the second
element is a 1-dimensional vector storing the extra
evaluation data.
"""
raise NotImplementedError
def _evaluate(self, solution: Solution):
"""
Evaluate a single solution.
This is achieved by parameterising the problem's attribute
named `instantiated_network`, and then evaluating the network
with the method `_evaluate_network(...)`.
Args:
solution (Solution): The solution to evaluate.
"""
parameters = solution.values
if self._network_eval_func is None:
evaluator = self._evaluate_network
else:
evaluator = self._network_eval_func
fitnesses = evaluator(self.parameterize_net(parameters))
if isinstance(fitnesses, tuple):
solution.set_evals(*fitnesses)
else:
solution.set_evals(fitnesses)
network_device: Union[str, torch.device]
property
readonly
¶
The device on which the problem should place data e.g. the network
__init__(self, objective_sense, network, network_eval_func=None, *, network_args=None, initial_bounds=(-1e-05, 1e-05), eval_dtype=None, eval_data_length=0, seed=None, num_actors='num_devices', actor_config=None, num_gpus_per_actor=None, num_subbatches=None, subbatch_size=None, device=None)
special
¶
__init__(...)
: Initialize the NEProblem.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
objective_sense | Union[str, Iterable[str]] | The objective sense, expected as "min" or "max" for single-objective cases, or as a sequence of strings (each string being "min" or "max") for multi-objective cases. | required |
network | Union[str, torch.nn.modules.module.Module, Callable[[], torch.nn.modules.module.Module]] | A network structure string, or a Callable (which can be a class inheriting from `torch.nn.Module`, or a function which returns a `torch.nn.Module` instance), or an instance of `torch.nn.Module`. The object provided here determines the structure of the neural network whose parameters will be evolved. A network structure string is a string which can be processed by `evotorch.neuroevolution.net.str_to_net(...)`. | required |
network_eval_func | Optional[Callable] | Optionally a function (or any Callable object) which receives a PyTorch module as its argument, and returns either a fitness, or a two-element tuple containing the fitness and the additional evaluation data. The fitness can be a scalar (for single-objective cases) or a 1-dimensional tensor (for multi-objective cases). The additional evaluation data is expected as a 1-dimensional tensor. If this argument is left as None, it will be expected that the method `_evaluate_network(...)` is overridden by the inheriting class. | None |
network_args | Optional[dict] | Optionally a dict-like object, storing keyword arguments to be passed to the network while instantiating it. | None |
initial_bounds | Union[Iterable[Union[float, Iterable[float], torch.Tensor]], evotorch.core.BoundsPair] | Specifies an interval from which the values of the initial neural network parameters will be drawn. | (-1e-05, 1e-05) |
eval_dtype | Union[str, torch.dtype, numpy.dtype, Type] | dtype to be used for fitnesses. If not specified, `eval_dtype` will be inferred from the dtype of the parameters of the neural network: if the parameters have a float dtype, `eval_dtype` will be a compatible float; otherwise, it will be "float32". | None |
eval_data_length | int | Length of the extra evaluation data. | 0 |
seed | Optional[int] | Random number seed. If left as None, this NEProblem instance will not have its own random generator, and the global random generator of PyTorch will be used instead. | None |
num_actors | Union[int, str] | Number of actors to create for parallelized evaluation of the solutions. Certain string values are also accepted: "max" or "num_cpus" creates as many actors as there are CPUs in the ray cluster; "num_gpus" creates as many actors as there are GPUs in the cluster, each actor being assigned a GPU; "num_devices" creates as many actors as the minimum of the number of CPUs and the number of GPUs (or as many as the CPUs if there is no GPU), each actor being assigned a GPU if available. If `num_actors` is given as "num_gpus" or "num_devices", the argument `num_gpus_per_actor` must not be used, and the `actor_config` dictionary must not contain the key "num_gpus"; otherwise, to assign GPUs to each actor, see `num_gpus_per_actor`. | 'num_devices' |
actor_config | Optional[dict] | A dictionary, representing the keyword arguments to be passed to the options(...) used when creating the ray actor objects, for explicitly allocating resources per actor. For example, for declaring that each actor is to use a GPU, one can pass `actor_config=dict(num_gpus=1)`. Can also be left as None (the default) if no such options are to be passed. | None |
num_gpus_per_actor | Union[int, float, str] | Number of GPUs to be allocated by each remote actor. The default behavior is to NOT allocate any GPU at all (which is also the default behavior of the ray library). When given as a number `n`, each actor will be given `n` GPUs (where `n` can be an integer, or a float for fractional allocation). When given as "max", the available GPUs across the entire ray cluster will be equally distributed among the actors. When given as "all", each actor will have access to all the GPUs (achieved by suppressing the environment variable CUDA_VISIBLE_DEVICES for each actor). When the problem is not distributed (i.e. when there are no actors), this argument is expected to be left as None. | None |
num_subbatches | Optional[int] | If None (assuming that `subbatch_size` is also None), then, when evaluating a population, the population will be split into n pieces, n being the number of actors, and each actor will evaluate its assigned piece. If an integer m, the population will be split into m pieces, and actors will continually accept the next unevaluated piece as they finish their current tasks. The arguments `num_subbatches` and `subbatch_size` cannot be given values other than None at the same time. In distributed mode, this argument determines how many sub-batches will be generated, and therefore, how many gradients will be computed by the remote actors. | None |
subbatch_size | Optional[int] | If None (assuming that `num_subbatches` is also None), then, when evaluating a population, the population will be split into n pieces, n being the number of actors, and each actor will evaluate its assigned piece. If an integer m, the population will be split into pieces of size m, and actors will continually accept the next unevaluated piece as they finish their current tasks. Specifying a `subbatch_size` can be beneficial when the computational requirements vary significantly across solutions. The arguments `num_subbatches` and `subbatch_size` cannot be given values other than None at the same time. In distributed mode, this argument determines the size of the sub-batch (or sub-population) sampled by a remote actor for computing a gradient, and the population size is expected to be divisible by `subbatch_size`. | None |
device | Union[str, torch.device] | Default device in which a new population will be generated and the neural networks will operate. If not specified, "cpu" will be used. | None |
Source code in evotorch/neuroevolution/neproblem.py
def __init__(
self,
objective_sense: ObjectiveSense,
network: Union[str, nn.Module, Callable[[], nn.Module]],
network_eval_func: Optional[Callable] = None,
*,
network_args: Optional[dict] = None,
initial_bounds: Optional[BoundsPairLike] = (-0.00001, 0.00001),
eval_dtype: Optional[DType] = None,
eval_data_length: int = 0,
seed: Optional[int] = None,
num_actors: Optional[Union[int, str]] = "num_devices",
actor_config: Optional[dict] = None,
num_gpus_per_actor: Optional[Union[int, float, str]] = None,
num_subbatches: Optional[int] = None,
subbatch_size: Optional[int] = None,
device: Optional[Device] = None,
):
"""
`__init__(...)`: Initialize the NEProblem.
Args:
objective_sense: The objective sense, expected as "min" or "max"
for single-objective cases, or as a sequence of strings
(each string being "min" or "max") for multi-objective cases.
network: A network structure string, or a Callable (which can be
a class inheriting from `torch.nn.Module`, or a function
which returns a `torch.nn.Module` instance), or an instance
of `torch.nn.Module`.
The object provided here determines the structure of the
neural network whose parameters will be evolved.
A network structure string is a string which can be processed
by `evotorch.neuroevolution.net.str_to_net(...)`.
Please see the documentation of the function
`evotorch.neuroevolution.net.str_to_net(...)` to see how such
a neural network structure string looks like.
network_eval_func: Optionally a function (or any Callable object)
which receives a PyTorch module as its argument, and returns
either a fitness, or a two-element tuple containing the fitness
and the additional evaluation data. The fitness can be a scalar
(for single-objective cases) or a 1-dimensional tensor (for
multi-objective cases). The additional evaluation data is
expected as a 1-dimensional tensor.
If this argument is left as None, it will be expected that
the method `_evaluate_network(...)` is overriden by the
inheriting class.
network_args: Optionally a dict-like object, storing keyword
arguments to be passed to the network while instantiating it.
initial_bounds: Specifies an interval from which the values of the
initial neural network parameters will be drawn.
eval_dtype: dtype to be used for fitnesses. If not specified, then
`eval_dtype` will be inferred from the dtype of the parameters
of the neural network.
In more details, if the neural network's parameters have a
float dtype, `eval_dtype` will be a compatible float.
Otherwise, it will be "float32".
eval_data_length: Length of the extra evaluation data.
seed: Random number seed. If left as None, this NEProblem instance
will not have its own random generator, and the global random
generator of PyTorch will be used instead.
num_actors: Number of actors to create for parallelized
evaluation of the solutions.
Certain string values are also accepted.
When given as "max" or as "num_cpus", the number of actors
will be equal to the number of all available CPUs in the ray
cluster.
When given as "num_gpus", the number of actors will be
equal to the number of all available GPUs in the ray
cluster, and each actor will be assigned a GPU.
When given as "num_devices", the number of actors will be
equal to the minimum among the number of CPUs and the number
of GPUs available in the cluster (or will be equal to the
number of CPUs if there is no GPU), and each actor will be
assigned a GPU (if available).
If `num_actors` is given as "num_gpus" or "num_devices",
the argument `num_gpus_per_actor` must not be used,
and the `actor_config` dictionary must not contain the
key "num_gpus".
If `num_actors` is given as something other than "num_gpus"
or "num_devices", and if you wish to assign GPUs to each
actor, then please see the argument `num_gpus_per_actor`.
actor_config: A dictionary, representing the keyword arguments
to be passed to the options(...) used when creating the
ray actor objects. To be used for explicitly allocating
resources per each actor.
For example, for declaring that each actor is to use a GPU,
one can pass `actor_config=dict(num_gpus=1)`.
Can also be given as None (which is the default),
if no such options are to be passed.
num_gpus_per_actor: Number of GPUs to be allocated by each
remote actor.
The default behavior is to NOT allocate any GPU at all
(which is the default behavior of the ray library as well).
When given as a number `n`, each actor will be given
`n` GPUs (where `n` can be an integer, or can be a `float`
for fractional allocation).
When given as a string "max", then the available GPUs
across the entire ray cluster (or within the local computer
in the simplest cases) will be equally distributed among
the actors.
When given as a string "all", then each actor will have
access to all the GPUs (this will be achieved by suppressing
the environment variable `CUDA_VISIBLE_DEVICES` for each
actor).
When the problem is not distributed (i.e. when there are
no actors), this argument is expected to be left as None.
num_subbatches: If `num_subbatches` is None (assuming that
`subbatch_size` is also None), then, when evaluating a
population, the population will be split into n pieces, `n`
being the number of actors, and each actor will evaluate
its assigned piece. If `num_subbatches` is an integer `m`,
then the population will be split into `m` pieces,
and actors will continually accept the next unevaluated
piece as they finish their current tasks.
The arguments `num_subbatches` and `subbatch_size` cannot
be given values other than None at the same time.
While using a distributed algorithm, this argument determines
how many sub-batches will be generated, and therefore,
how many gradients will be computed by the remote actors.
subbatch_size: If `subbatch_size` is None (assuming that
`num_subbatches` is also None), then, when evaluating a
population, the population will be split into `n` pieces, `n`
being the number of actors, and each actor will evaluate its
assigned piece. If `subbatch_size` is an integer `m`,
then the population will be split into pieces of size `m`,
and actors will continually accept the next unevaluated
piece as they finish their current tasks.
When there can be significant difference across the solutions
in terms of computational requirements, specifying a
`subbatch_size` can be beneficial, because, while one
actor is busy with a subbatch containing computationally
challenging solutions, other actors can accept more
tasks and save time.
The arguments `num_subbatches` and `subbatch_size` cannot
be given values other than None at the same time.
While using a distributed algorithm, this argument determines
the size of a sub-batch (or sub-population) sampled by a
remote actor for computing a gradient.
In distributed mode, it is expected that the population size
is divisible by `subbatch_size`.
device: Default device in which a new population will be generated
and the neural networks will operate.
If not specified, "cpu" will be used.
"""
# Set the main device of the problem
# Although the operation of setting the main device is done by the main Problem class,
# here we need this at an earlier stage.
if device is None:
device = "cpu"
self._device = torch.device(device)
# Set the network
self._original_network = network
self._network_args = {} if network_args is None else deepcopy(network_args)
if isinstance(self._original_network, nn.Module):
self._original_network = self._original_network.cpu()
# Store the function that will evaluate the network, if available
self._network_eval_func: Optional[Callable] = network_eval_func
self.instantiated_network: nn.Module = None
# Create temporary network
temp_network = self._instantiate_net(self._original_network, device="cpu")
super().__init__(
objective_sense=objective_sense,
initial_bounds=initial_bounds,
bounds=None, # Neuroevolution is an unbounded problem
solution_length=count_parameters(temp_network), # The solution length is inherited from the network passed
dtype=next(temp_network.parameters()).dtype, # The datatype is inherited from the network passed
eval_dtype=eval_dtype,
device=device,
eval_data_length=eval_data_length,
seed=seed,
num_actors=num_actors,
num_gpus_per_actor=num_gpus_per_actor,
actor_config=actor_config,
num_subbatches=num_subbatches,
subbatch_size=subbatch_size,
store_solution_stats=None,
)
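For a concrete picture of how these arguments fit together, the following is a minimal sketch (not taken from the library's own documentation; the network structure string, the fitness function, and the choice num_actors=None are illustrative assumptions) that builds an NEProblem around a network_eval_func:
import torch
from evotorch.neuroevolution import NEProblem

def eval_net(net: torch.nn.Module) -> float:
    # Hypothetical fitness: mean squared output on a fixed probe input
    x = torch.ones(4)
    return float(net(x).pow(2).mean())

problem = NEProblem(
    objective_sense="min",  # minimize the value returned by eval_net
    network="Linear(4, 8) >> Tanh() >> Linear(8, 2)",  # a network structure string
    network_eval_func=eval_net,
    num_actors=None,  # keep the evaluation in the main process for this sketch
)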
make_net(self, parameters)
¶
Make a new network filled with the provided parameters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
parameters | Iterable | Parameters to be used as weights within the network. Can be a Solution, or any 1-dimensional Iterable that can be converted to a PyTorch tensor. | required |
Returns:
Type | Description |
---|---|
Module | A new network, as a `torch.nn.Module` instance. |
Source code in evotorch/neuroevolution/neproblem.py
def make_net(self, parameters: Iterable) -> nn.Module:
"""
Make a new network filled with the provided parameters.
Args:
parameters: Parameters to be used as weights within the network.
Can be a Solution, or any 1-dimensional Iterable that can be
converted to a PyTorch tensor.
Returns:
A new network, as a `torch.nn.Module` instance.
"""
if isinstance(parameters, Solution):
parameters = parameters.access_values(keep_evals=True)
else:
parameters = self.as_tensor(parameters)
with torch.no_grad():
net = deepcopy(self.parameterize_net(parameters))
return net
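A short, self-contained sketch (hedged: the problem definition repeats the hypothetical example given earlier, and a zero vector stands in for an evolved solution):
import torch
from evotorch.neuroevolution import NEProblem

problem = NEProblem(
    objective_sense="min",
    network="Linear(4, 8) >> Tanh() >> Linear(8, 2)",
    network_eval_func=lambda net: float(net(torch.ones(4)).pow(2).mean()),
    num_actors=None,
)
params = torch.zeros(problem.solution_length)  # any 1-D vector of matching length
net = problem.make_net(params)                 # an independent, deep-copied module
with torch.no_grad():
    print(net(torch.randn(4)))                 # forward pass through the new net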
network_constants(self)
¶
Named constants which can be passed to the network instantiation e.g. input/output dimension
parameterize_net(self, parameters)
¶
Parameterize the network with a given set of parameters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
parameters | torch.Tensor | The parameters with which to instantiate the network | required |
Returns:
Type | Description |
---|---|
instantiated_network (nn.Module) | The network instantiated with the parameters |
Source code in evotorch/neuroevolution/neproblem.py
def parameterize_net(self, parameters: torch.Tensor) -> nn.Module:
"""Parameterize the network with a given set of parameters.
Args:
parameters (torch.Tensor): The parameters with which to instantiate the network
Returns:
instantiated_network (nn.Module): The network instantiated with the parameters
"""
# Check if network exists
if self.instantiated_network is None:
self.instantiated_network = self._instantiate_net(self._original_network)
network = self.instantiated_network
# Move the parameters if needed
if parameters.device != self.network_device:
parameters = parameters.to(self.network_device)
# Fill the network with the parameters
fill_parameters(network, parameters)
# Return the network
return network
net
special
¶
Utility classes and functions for neural networks
layers
¶
Various neural network layer types
Apply (Module)
¶
A torch module for applying an arithmetic operator on an input tensor
Source code in evotorch/neuroevolution/net/layers.py
class Apply(nn.Module):
"""A torch module for applying an arithmetic operator on an input tensor"""
def __init__(self, operator: str, argument: float):
"""`__init__(...)`: Initialize the Apply module.
Args:
operator: Must be '+', '-', '*', '/', or '**'.
Indicates which operation will be done
on the input tensor.
argument: Expected as a float, represents
the right-argument of the operation
(the left-argument being the input
tensor).
"""
nn.Module.__init__(self)
self._operator = str(operator)
assert self._operator in ("+", "-", "*", "/", "**")
self._argument = float(argument)
def forward(self, x):
op = self._operator
arg = self._argument
if op == "+":
return x + arg
elif op == "-":
return x - arg
elif op == "*":
return x * arg
elif op == "/":
return x / arg
elif op == "**":
return x**arg
else:
raise ValueError("Unknown operator:" + repr(op))
def extra_repr(self):
return "operator={}, argument={}".format(repr(self._operator), self._argument)
__init__(self, operator, argument)
special
¶
__init__(...)
: Initialize the Apply module.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
operator | str | Must be '+', '-', '*', '/', or '**'. Indicates which operation will be done on the input tensor. | required |
argument | float | Expected as a float, represents the right-argument of the operation (the left-argument being the input tensor). | required |
Source code in evotorch/neuroevolution/net/layers.py
def __init__(self, operator: str, argument: float):
"""`__init__(...)`: Initialize the Apply module.
Args:
operator: Must be '+', '-', '*', '/', or '**'.
Indicates which operation will be done
on the input tensor.
argument: Expected as a float, represents
the right-argument of the operation
(the left-argument being the input
tensor).
"""
nn.Module.__init__(self)
self._operator = str(operator)
assert self._operator in ("+", "-", "*", "/", "**")
self._argument = float(argument)
extra_repr(self)
¶
Set the extra representation of the module
To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.
forward(self, x)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
Although the recipe for forward pass needs to be defined within this function, one should call the `Module` instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
Source code in evotorch/neuroevolution/net/layers.py
Bin (Module)
¶
A small torch module for binning the values of tensors.
In more detail: considering a lower bound value `lb`, an upper bound value `ub`, and an input tensor `x`, each value within `x` closer to `lb` will be converted to `lb`, and each value within `x` closer to `ub` will be converted to `ub`.
Source code in evotorch/neuroevolution/net/layers.py
class Bin(nn.Module):
"""A small torch module for binning the values of tensors.
In more details, considering a lower bound value lb,
an upper bound value ub, and an input tensor x,
each value within x closer to lb will be converted to lb
and each value within x closer to ub will be converted to ub.
"""
def __init__(self, lb: float, ub: float):
"""`__init__(...)`: Initialize the Clip operator.
Args:
lb: Lower bound
ub: Upper bound
"""
nn.Module.__init__(self)
self._lb = float(lb)
self._ub = float(ub)
self._interval_size = self._ub - self._lb
self._shrink_amount = self._interval_size / 2.0
self._shift_amount = (self._ub + self._lb) / 2.0
def forward(self, x: torch.Tensor):
x = x - self._shift_amount
x = x / self._shrink_amount
x = torch.sign(x)
x = x * self._shrink_amount
x = x + self._shift_amount
return x
def extra_repr(self):
return "lb={}, ub={}".format(self._lb, self._ub)
__init__(self, lb, ub)
special
¶
__init__(...)
: Initialize the Bin operator.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
lb | float | Lower bound | required |
ub | float | Upper bound | required |
Source code in evotorch/neuroevolution/net/layers.py
def __init__(self, lb: float, ub: float):
"""`__init__(...)`: Initialize the Clip operator.
Args:
lb: Lower bound
ub: Upper bound
"""
nn.Module.__init__(self)
self._lb = float(lb)
self._ub = float(ub)
self._interval_size = self._ub - self._lb
self._shrink_amount = self._interval_size / 2.0
self._shift_amount = (self._ub + self._lb) / 2.0
extra_repr(self)
¶
Set the extra representation of the module
To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.
forward(self, x)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
Although the recipe for forward pass needs to be defined within this function, one should call the `Module` instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
Clip (Module)
¶
A small torch module for clipping the values of tensors
Source code in evotorch/neuroevolution/net/layers.py
class Clip(nn.Module):
"""A small torch module for clipping the values of tensors"""
def __init__(self, lb: float, ub: float):
"""`__init__(...)`: Initialize the Clip operator.
Args:
lb: Lower bound. Values less than this will be clipped.
ub: Upper bound. Values greater than this will be clipped.
"""
nn.Module.__init__(self)
self._lb = float(lb)
self._ub = float(ub)
def forward(self, x: torch.Tensor):
return x.clamp(self._lb, self._ub)
def extra_repr(self):
return "lb={}, ub={}".format(self._lb, self._ub)
__init__(self, lb, ub)
special
¶
__init__(...)
: Initialize the Clip operator.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
lb | float | Lower bound. Values less than this will be clipped. | required |
ub | float | Upper bound. Values greater than this will be clipped. | required |
Source code in evotorch/neuroevolution/net/layers.py
extra_repr(self)
¶
Set the extra representation of the module
To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.
forward(self, x)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
Although the recipe for forward pass needs to be defined within this function, one should call the `Module` instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
FeedForwardNet (Module)
¶
Representation of a feed forward neural network as a torch Module.
An example initialization of a FeedForwardNet is as follows:
net = FeedForwardNet(4, [(8, 'tanh'), (6, 'tanh')])
which means that we would like to have a network which expects an input vector of length 4 and passes its input through 2 tanh-activated hidden layers (with neuron counts of 8 and 6, respectively). The output of the last hidden layer (of length 6) is the final output vector.
The string representation of the module obtained via the example above is:
FeedForwardNet(
(layer_0): Linear(in_features=4, out_features=8, bias=True)
(actfunc_0): Tanh()
(layer_1): Linear(in_features=8, out_features=6, bias=True)
(actfunc_1): Tanh()
)
Source code in evotorch/neuroevolution/net/layers.py
class FeedForwardNet(nn.Module):
"""
Representation of a feed forward neural network as a torch Module.
An example initialization of a FeedForwardNet is as follows:
net = drt.FeedForwardNet(4, [(8, 'tanh'), (6, 'tanh')])
which means that we would like to have a network which expects an input
vector of length 4 and passes its input through 2 tanh-activated hidden
layers (with neurons count 8 and 6, respectively).
The output of the last hidden layer (of length 6) is the final
output vector.
The string representation of the module obtained via the example above
is:
FeedForwardNet(
(layer_0): Linear(in_features=4, out_features=8, bias=True)
(actfunc_0): Tanh()
(layer_1): Linear(in_features=8, out_features=6, bias=True)
(actfunc_1): Tanh()
)
"""
LengthActTuple = Tuple[int, Union[str, Callable]]
LengthActBiasTuple = Tuple[int, Union[str, Callable], Union[bool]]
def __init__(self, input_size: int, layers: List[Union[LengthActTuple, LengthActBiasTuple]]):
"""`__init__(...)`: Initialize the FeedForward network.
Args:
input_size: Input size of the network, expected as an int.
layers: Expected as a list of tuples,
where each tuple is either of the form
`(layer_size, activation_function)`
or of the form
`(layer_size, activation_function, bias)`
in which
(i) `layer_size` is an int, specifying the number of neurons;
(ii) `activation_function` is None, or a callable object,
or a string containing the name of the activation function
('relu', 'selu', 'elu', 'tanh', 'hardtanh', or 'sigmoid');
(iii) `bias` is a boolean, specifying whether the layer
is to have a bias or not.
When omitted, bias is set to True.
"""
nn.Module.__init__(self)
for i, layer in enumerate(layers):
if len(layer) == 2:
size, actfunc = layer
bias = True
elif len(layer) == 3:
size, actfunc, bias = layer
else:
assert False, "A layer tuple of invalid size is encountered"
setattr(self, "layer_" + str(i), nn.Linear(input_size, size, bias=bias))
if isinstance(actfunc, str):
if actfunc == "relu":
actfunc = nn.ReLU()
elif actfunc == "selu":
actfunc = nn.SELU()
elif actfunc == "elu":
actfunc = nn.ELU()
elif actfunc == "tanh":
actfunc = nn.Tanh()
elif actfunc == "hardtanh":
actfunc = nn.Hardtanh()
elif actfunc == "sigmoid":
actfunc = nn.Sigmoid()
elif actfunc == "round":
actfunc = Round()
else:
raise ValueError("Unknown activation function: " + repr(actfunc))
setattr(self, "actfunc_" + str(i), actfunc)
input_size = size
def forward(self, x):
i = 0
while hasattr(self, "layer_" + str(i)):
x = getattr(self, "layer_" + str(i))(x)
f = getattr(self, "actfunc_" + str(i))
if f is not None:
x = f(x)
i += 1
return x
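A brief, runnable sketch of the example structure described above:
import torch
from evotorch.neuroevolution.net.layers import FeedForwardNet

net = FeedForwardNet(4, [(8, "tanh"), (6, "tanh")])
y = net(torch.randn(4))  # output of the last layer
print(y.shape)           # torch.Size([6])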
__init__(self, input_size, layers)
special
¶
__init__(...)
: Initialize the FeedForward network.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_size | int | Input size of the network, expected as an int. | required |
layers | List[Union[Tuple[int, Union[str, Callable]], Tuple[int, Union[str, Callable], bool]]] | Expected as a list of tuples, where each tuple is either of the form (layer_size, activation_function) or of the form (layer_size, activation_function, bias), in which (i) layer_size is an int specifying the number of neurons, (ii) activation_function is None, a callable object, or a string naming the activation function ('relu', 'selu', 'elu', 'tanh', 'hardtanh', or 'sigmoid'), and (iii) bias is a boolean specifying whether the layer is to have a bias (True when omitted). | required |
Source code in evotorch/neuroevolution/net/layers.py
def __init__(self, input_size: int, layers: List[Union[LengthActTuple, LengthActBiasTuple]]):
"""`__init__(...)`: Initialize the FeedForward network.
Args:
input_size: Input size of the network, expected as an int.
layers: Expected as a list of tuples,
where each tuple is either of the form
`(layer_size, activation_function)`
or of the form
`(layer_size, activation_function, bias)`
in which
(i) `layer_size` is an int, specifying the number of neurons;
(ii) `activation_function` is None, or a callable object,
or a string containing the name of the activation function
('relu', 'selu', 'elu', 'tanh', 'hardtanh', or 'sigmoid');
(iii) `bias` is a boolean, specifying whether the layer
is to have a bias or not.
When omitted, bias is set to True.
"""
nn.Module.__init__(self)
for i, layer in enumerate(layers):
if len(layer) == 2:
size, actfunc = layer
bias = True
elif len(layer) == 3:
size, actfunc, bias = layer
else:
assert False, "A layer tuple of invalid size is encountered"
setattr(self, "layer_" + str(i), nn.Linear(input_size, size, bias=bias))
if isinstance(actfunc, str):
if actfunc == "relu":
actfunc = nn.ReLU()
elif actfunc == "selu":
actfunc = nn.SELU()
elif actfunc == "elu":
actfunc = nn.ELU()
elif actfunc == "tanh":
actfunc = nn.Tanh()
elif actfunc == "hardtanh":
actfunc = nn.Hardtanh()
elif actfunc == "sigmoid":
actfunc = nn.Sigmoid()
elif actfunc == "round":
actfunc = Round()
else:
raise ValueError("Unknown activation function: " + repr(actfunc))
setattr(self, "actfunc_" + str(i), actfunc)
input_size = size
forward(self, x)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
Although the recipe for forward pass needs to be defined within this function, one should call the `Module` instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
LSTMNet (StatefulModule)
¶
Representation of an LSTM layer.
Differently from torch.nn.LSTM, the forward pass function of this class does NOT expect the hidden state, nor does it return the resulting hidden state of the pass. Instead, the hidden states are stored within the module itself.
The forward pass function can take a 1-dimensional tensor of length `input_size`, or it can take a 2-dimensional tensor of size `(batch_size, input_size)`.
Because the instances of this class are stateful, remember to reset() the internal state when needed.
Source code in evotorch/neuroevolution/net/layers.py
class LSTMNet(StatefulModule):
"""Representation of an LSTM layer.
Differently from torch.nn.LSTM, the forward pass function of this class
does NOT expect the hidden state, nor does it return
the resulting hidden state of the pass.
Instead, the hidden states are stored within the module itself.
The forward pass function can take a 1-dimensional tensor of length
input_size, or it can take a 2-dimensional tensor of size
`(batch_size, input_size)`.
Because the instances of this class are stateful,
remember to reset() the internal state when needed.
"""
def __init__(self, **kwargs):
"""
`__init__(...)`: Initialize the LSTM net.
Args:
input_size: The input size, expected as an int.
hidden_size: Number of neurons, expected as an int.
num_layers: Number of layers of the recurrent net.
"""
StatefulModule.__init__(self, nn.LSTM, **kwargs)
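A brief usage sketch (illustrative sizes) highlighting the stateful behavior:
import torch
from evotorch.neuroevolution.net.layers import LSTMNet

net = LSTMNet(input_size=4, hidden_size=8)
for _ in range(3):
    y = net(torch.randn(4))  # shape (8,); the hidden state is carried across calls
net.reset()                  # clear the internal state, e.g. between episodes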
__init__(self, **kwargs)
special
¶
__init__(...)
: Initialize the LSTM net.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_size |  | The input size, expected as an int. | required |
hidden_size |  | Number of neurons, expected as an int. | required |
num_layers |  | Number of layers of the recurrent net. | required |
Source code in evotorch/neuroevolution/net/layers.py
LocomotorNet (Module)
¶
This is a control network which consists of two components: one linear, and one non-linear. The non-linear component is an input-independent set of sinusoidals waves whose amplitudes, frequencies and phases are trainable. Upon execution of a forward pass, the output of the non-linear component is the sum of all these sinusoidal waves. The linear component is a linear layer (optionally with bias) whose weights (and biases) are trainable. The final output of the LocomotorNet at the end of a forward pass is the sum of the linear and the non-linear components.
Note that this is a stateful network, where the only state is the timestep t, which starts from 0 and gets incremented by 1 at the end of each forward pass. The `reset()` method resets t back to 0.
Reference
Mario Srouji, Jian Zhang, Ruslan Salakhutdinov (2018). Structured Control Nets for Deep Reinforcement Learning.
Source code in evotorch/neuroevolution/net/layers.py
class LocomotorNet(nn.Module):
"""LocomotorNet: A locomotion-specific structured control net.
This is a control network which consists of two components:
one linear, and one non-linear. The non-linear component
is an input-independent set of sinusoidals waves whose
amplitudes, frequencies and phases are trainable.
Upon execution of a forward pass, the output of the non-linear
component is the sum of all these sinusoidal waves.
The linear component is a linear layer (optionally with bias)
whose weights (and biases) are trainable.
The final output of the LocomotorNet at the end of a forward pass
is the sum of the linear and the non-linear components.
Note that this is a stateful network, where the only state
is the timestep t, which starts from 0 and gets incremented by 1
at the end of each forward pass. The `reset()` method resets
t back to 0.
Reference:
Mario Srouji, Jian Zhang, Ruslan Salakhutdinov (2018).
Structured Control Nets for Deep Reinforcement Learning.
"""
def __init__(self, *, in_features: int, out_features: int, bias: bool = True, num_sinusoids=16):
"""`__init__(...)`: Initialize the LocomotorNet.
Args:
in_features: Length of the input vector
out_features: Length of the output vector
bias: Whether or not the linear component is to have a bias
num_sinusoids: Number of sinusoidal waves
"""
nn.Module.__init__(self)
self._in_features = in_features
self._out_features = out_features
self._bias = bias
self._num_sinusoids = num_sinusoids
self._linear_component = nn.Linear(
in_features=self._in_features, out_features=self._out_features, bias=self._bias
)
self._amplitudes = nn.ParameterList()
self._frequencies = nn.ParameterList()
self._phases = nn.ParameterList()
for _ in range(self._num_sinusoids):
for paramlist in (self._amplitudes, self._frequencies, self._phases):
paramlist.append(nn.Parameter(torch.randn(self._out_features, dtype=torch.float32)))
self.reset()
def reset(self):
"""Set the timestep t to 0"""
self._t = 0
@property
def t(self) -> int:
"""The current timestep t"""
return self._t
@property
def in_features(self) -> int:
"""Get the length of the input vector"""
return self._in_features
@property
def out_features(self) -> int:
"""Get the length of the output vector"""
return self._out_features
@property
def num_sinusoids(self) -> int:
"""Get the number of sinusoidal waves of the non-linear component"""
return self._num_sinusoids
@property
def bias(self) -> bool:
"""Get whether or not the linear component has bias"""
return self._bias
def forward(self, x: torch.Tensor) -> torch.Tensor:
"""Execute a forward pass"""
u_linear = self._linear_component(x)
t = self._t
u_nonlinear = torch.zeros(self._out_features)
for i in range(self._num_sinusoids):
A = self._amplitudes[i]
w = self._frequencies[i]
phi = self._phases[i]
u_nonlinear = u_nonlinear + (A * torch.sin(w * t + phi))
self._t += 1
return u_linear + u_nonlinear
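A brief usage sketch (illustrative sizes) highlighting the internal timestep:
import torch
from evotorch.neuroevolution.net.layers import LocomotorNet

net = LocomotorNet(in_features=4, out_features=2, num_sinusoids=8)
y0 = net(torch.randn(4))  # computed with t=0; t is then incremented
y1 = net(torch.randn(4))  # computed with t=1
net.reset()               # timestep back to 0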
bias: bool
property
readonly
¶
Get whether or not the linear component has bias
in_features: int
property
readonly
¶
Get the length of the input vector
num_sinusoids: int
property
readonly
¶
Get the number of sinusoidal waves of the non-linear component
out_features: int
property
readonly
¶
Get the length of the output vector
t: int
property
readonly
¶
The current timestep t
__init__(self, *, in_features, out_features, bias=True, num_sinusoids=16)
special
¶
__init__(...)
: Initialize the LocomotorNet.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
in_features | int | Length of the input vector | required |
out_features | int | Length of the output vector | required |
bias | bool | Whether or not the linear component is to have a bias | True |
num_sinusoids |  | Number of sinusoidal waves | 16 |
Source code in evotorch/neuroevolution/net/layers.py
def __init__(self, *, in_features: int, out_features: int, bias: bool = True, num_sinusoids=16):
"""`__init__(...)`: Initialize the LocomotorNet.
Args:
in_features: Length of the input vector
out_features: Length of the output vector
bias: Whether or not the linear component is to have a bias
num_sinusoids: Number of sinusoidal waves
"""
nn.Module.__init__(self)
self._in_features = in_features
self._out_features = out_features
self._bias = bias
self._num_sinusoids = num_sinusoids
self._linear_component = nn.Linear(
in_features=self._in_features, out_features=self._out_features, bias=self._bias
)
self._amplitudes = nn.ParameterList()
self._frequencies = nn.ParameterList()
self._phases = nn.ParameterList()
for _ in range(self._num_sinusoids):
for paramlist in (self._amplitudes, self._frequencies, self._phases):
paramlist.append(nn.Parameter(torch.randn(self._out_features, dtype=torch.float32)))
self.reset()
forward(self, x)
¶
Execute a forward pass
Source code in evotorch/neuroevolution/net/layers.py
def forward(self, x: torch.Tensor) -> torch.Tensor:
"""Execute a forward pass"""
u_linear = self._linear_component(x)
t = self._t
u_nonlinear = torch.zeros(self._out_features)
for i in range(self._num_sinusoids):
A = self._amplitudes[i]
w = self._frequencies[i]
phi = self._phases[i]
u_nonlinear = u_nonlinear + (A * torch.sin(w * t + phi))
self._t += 1
return u_linear + u_nonlinear
reset(self)
¶
Set the timestep t to 0
RecurrentNet (StatefulModule)
¶
Representation of a fully connected recurrent net as a torch Module.
Differently from torch.nn.RNN, the forward pass function of this class does NOT expect the hidden state, nor does it return the resulting hidden state of the pass. Instead, the hidden states are stored within the module itself.
The forward pass function can take a 1-dimensional tensor of length input_size, or it can take a 2-dimensional tensor of size (batch_size, input_size).
Because the instances of this class are stateful, remember to reset() the internal state when needed.
Source code in evotorch/neuroevolution/net/layers.py
class RecurrentNet(StatefulModule):
"""Representation of a fully connected recurrent net as a torch Module.
Differently from torch.nn.RNN, the forward pass function of this class
does NOT expect the hidden state, nor does it return
the resulting hidden state of the pass.
Instead, the hidden states are stored within the module itself.
The forward pass function can take a 1-dimensional tensor of length
input_size, or it can take a 2-dimensional tensor of size
(batch_size, input_size).
Because the instances of this class are stateful,
remember to reset() the internal state when needed.
"""
def __init__(self, **kwargs):
"""
`__init__(...)`: Initialize the recurrent net.
Args:
input_size: The input size, expected as an int.
hidden_size: Number of neurons, expected as an int.
nonlinearity: The activation function,
expected as 'tanh' or 'relu'.
num_layers: Number of layers of the recurrent net.
"""
StatefulModule.__init__(self, nn.RNN, **kwargs)
__init__(self, **kwargs)
special
¶
__init__(...)
: Initialize the recurrent net.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_size |  | The input size, expected as an int. | required |
hidden_size |  | Number of neurons, expected as an int. | required |
nonlinearity |  | The activation function, expected as 'tanh' or 'relu'. | required |
num_layers |  | Number of layers of the recurrent net. | required |
Source code in evotorch/neuroevolution/net/layers.py
def __init__(self, **kwargs):
"""
`__init__(...)`: Initialize the recurrent net.
Args:
input_size: The input size, expected as an int.
hidden_size: Number of neurons, expected as an int.
nonlinearity: The activation function,
expected as 'tanh' or 'relu'.
num_layers: Number of layers of the recurrent net.
"""
StatefulModule.__init__(self, nn.RNN, **kwargs)
Round (Module)
¶
A small torch module for rounding the values of an input tensor
Source code in evotorch/neuroevolution/net/layers.py
class Round(nn.Module):
"""A small torch module for rounding the values of an input tensor"""
def __init__(self, ndigits: int = 0):
nn.Module.__init__(self)
self._ndigits = int(ndigits)
self._q = 10.0**self._ndigits
def forward(self, x):
x = x * self._q
x = torch.round(x)
x = x / self._q
return x
def extra_repr(self):
return "ndigits=" + str(self._ndigits)
extra_repr(self)
¶
Set the extra representation of the module
To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.
forward(self, x)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
Although the recipe for forward pass needs to be defined within this function, one should call the `Module` instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
Slice (Module)
¶
A small torch module for getting the slice of an input tensor
Source code in evotorch/neuroevolution/net/layers.py
class Slice(nn.Module):
"""A small torch module for getting the slice of an input tensor"""
def __init__(self, from_index: int, to_index: int):
"""`__init__(...)`: Initialize the Slice operator.
Args:
from_index: The index from which the slice begins.
to_index: The exclusive index at which the slice ends.
"""
nn.Module.__init__(self)
self._from_index = from_index
self._to_index = to_index
def forward(self, x):
return x[self._from_index : self._to_index]
def extra_repr(self):
return "from_index={}, to_index={}".format(self._from_index, self._to_index)
__init__(self, from_index, to_index)
special
¶
__init__(...)
: Initialize the Slice operator.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
from_index | int | The index from which the slice begins. | required |
to_index | int | The exclusive index at which the slice ends. | required |
Source code in evotorch/neuroevolution/net/layers.py
extra_repr(self)
¶
Set the extra representation of the module
To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.
forward(self, x)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
Although the recipe for forward pass needs to be defined within this function, one should call the `Module` instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
StatefulModule (Module)
¶
Base class for stateful modules. Not to be instantiated directly.
Source code in evotorch/neuroevolution/net/layers.py
class StatefulModule(nn.Module):
"""Base class for stateful modules.
Not to be instantiated directly.
"""
def __init__(self, module_class, **kwargs):
nn.Module.__init__(self)
assert "batch_first" not in kwargs, "The `batch_first` option is not supported"
self._layer = module_class(**kwargs)
self.reset()
@property
def state(self):
"""Get the tensor of the internal state.
If the recurrent network is just initialized or reset,
then there is no state, so, a None is given.
Not having a state means that an initial internal state tensor of
compatible size with the input will be created at the
first usage of this network.
Each element of this initial internal state tensor is 0.
"""
return self._state
def reset(self):
"""Reset the internal state"""
self._state = None
def forward(self, x):
if len(x.shape) == 1:
input_size = x.shape[0]
x = x.view(1, 1, input_size)
batch_size = 1
orgdim = 1
elif len(x.shape) == 2:
batch_size, input_size = x.shape
x = x.view(1, batch_size, input_size)
orgdim = 2
else:
assert False, (
"expected a tensor with 1 or 2 dimensions, " + "but received a tensor of shape " + str(x.shape)
)
if self._state is None:
x, self._state = self._layer(x)
else:
x, self._state = self._layer(x, self._state)
if orgdim == 1:
x = x.view(-1)
elif orgdim == 2:
x = x.view(batch_size, -1)
else:
assert False, "unknown value for orgdim"
return x
@property
def batch_first(self):
"""Return True if the module expects the batch dimension first.
Otherwise, return False.
"""
return self._layer.batch_first
batch_first
property
readonly
¶
Return True if the module expects the batch dimension first. Otherwise, return False.
state
property
readonly
¶
Get the tensor of the internal state. If the recurrent network is just initialized or reset, then there is no state, so, a None is given. Not having a state means that an initial internal state tensor of compatible size with the input will be created at the first usage of this network. Each element of this initial internal state tensor is 0.
forward(self, x)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
Although the recipe for forward pass needs to be defined within this function, one should call the `Module` instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
Source code in evotorch/neuroevolution/net/layers.py
def forward(self, x):
if len(x.shape) == 1:
input_size = x.shape[0]
x = x.view(1, 1, input_size)
batch_size = 1
orgdim = 1
elif len(x.shape) == 2:
batch_size, input_size = x.shape
x = x.view(1, batch_size, input_size)
orgdim = 2
else:
assert False, (
"expected a tensor with 1 or 2 dimensions, " + "but received a tensor of shape " + str(x.shape)
)
if self._state is None:
x, self._state = self._layer(x)
else:
x, self._state = self._layer(x, self._state)
if orgdim == 1:
x = x.view(-1)
elif orgdim == 2:
x = x.view(batch_size, -1)
else:
assert False, "unknown value for orgdim"
return x
reset(self)
¶
Reset the internal state
StructuredControlNet (Module)
¶
Structured Control Net.
This is a control network consisting of two components: (i) a non-linear component, which is a feed-forward network; and (ii) a linear component, which is a linear layer. Both components take the input vector provided to the structured control network. The final output is the sum of the outputs of both components.
Reference
Mario Srouji, Jian Zhang, Ruslan Salakhutdinov (2018). Structured Control Nets for Deep Reinforcement Learning.
Source code in evotorch/neuroevolution/net/layers.py
class StructuredControlNet(nn.Module):
"""Structured Control Net.
This is a control network consisting of two components:
(i) a non-linear component, which is a feed-forward network; and
(ii) a linear component, which is a linear layer.
Both components take the input vector provided to the
structured control network.
The final output is the sum of the outputs of both components.
Reference:
Mario Srouji, Jian Zhang, Ruslan Salakhutdinov (2018).
Structured Control Nets for Deep Reinforcement Learning.
"""
def __init__(
self,
*,
in_features: int,
out_features: int,
num_layers: int,
hidden_size: int,
bias: bool = True,
nonlinearity: Union[str, Callable] = "tanh",
):
"""`__init__(...)`: Initialize the structured control net.
Args:
in_features: Length of the input vector
out_features: Length of the output vector
num_layers: Number of hidden layers for the non-linear component
hidden_size: Number of neurons in a hidden layer of the
non-linear component
bias: Whether or not the linear component is to have bias
nonlinearity: Activation function
"""
nn.Module.__init__(self)
self._in_features = in_features
self._out_features = out_features
self._num_layers = num_layers
self._hidden_size = hidden_size
self._bias = bias
self._nonlinearity = nonlinearity
self._linear_component = nn.Linear(
in_features=self._in_features, out_features=self._out_features, bias=self._bias
)
self._nonlinear_component = FeedForwardNet(
input_size=self._in_features,
layers=(
list((self._hidden_size, self._nonlinearity) for _ in range(self._num_layers))
+ [(self._out_features, self._nonlinearity)]
),
)
def forward(self, x: torch.Tensor) -> torch.Tensor:
"""TODO: documentation"""
return self._linear_component(x) + self._nonlinear_component(x)
@property
def in_features(self):
"""TODO: documentation"""
return self._in_features
@property
def out_features(self):
"""TODO: documentation"""
return self._out_features
@property
def num_layers(self):
"""TODO: documentation"""
return self._num_layers
@property
def hidden_size(self):
"""TODO: documentation"""
return self._hidden_size
@property
def bias(self):
"""TODO: documentation"""
return self._bias
@property
def nonlinearity(self):
"""TODO: documentation"""
return self._nonlinearity
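A brief usage sketch (illustrative sizes):
import torch
from evotorch.neuroevolution.net.layers import StructuredControlNet

net = StructuredControlNet(in_features=4, out_features=2, num_layers=2, hidden_size=16)
y = net(torch.randn(4))  # sum of the linear and non-linear components, shape (2,)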
bias
property
readonly
¶
hidden_size
property
readonly
¶
in_features
property
readonly
¶
nonlinearity
property
readonly
¶
num_layers
property
readonly
¶
out_features
property
readonly
¶
__init__(self, *, in_features, out_features, num_layers, hidden_size, bias=True, nonlinearity='tanh')
special
¶
__init__(...)
: Initialize the structured control net.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
in_features | int | Length of the input vector | required |
out_features | int | Length of the output vector | required |
num_layers | int | Number of hidden layers for the non-linear component | required |
hidden_size | int | Number of neurons in a hidden layer of the non-linear component | required |
bias | bool | Whether or not the linear component is to have bias | True |
nonlinearity | Union[str, Callable] | Activation function | 'tanh' |
Source code in evotorch/neuroevolution/net/layers.py
def __init__(
self,
*,
in_features: int,
out_features: int,
num_layers: int,
hidden_size: int,
bias: bool = True,
nonlinearity: Union[str, Callable] = "tanh",
):
"""`__init__(...)`: Initialize the structured control net.
Args:
in_features: Length of the input vector
out_features: Length of the output vector
num_layers: Number of hidden layers for the non-linear component
hidden_size: Number of neurons in a hidden layer of the
non-linear component
bias: Whether or not the linear component is to have bias
nonlinearity: Activation function
"""
nn.Module.__init__(self)
self._in_features = in_features
self._out_features = out_features
self._num_layers = num_layers
self._hidden_size = hidden_size
self._bias = bias
self._nonlinearity = nonlinearity
self._linear_component = nn.Linear(
in_features=self._in_features, out_features=self._out_features, bias=self._bias
)
self._nonlinear_component = FeedForwardNet(
input_size=self._in_features,
layers=(
list((self._hidden_size, self._nonlinearity) for _ in range(self._num_layers))
+ [(self._out_features, self._nonlinearity)]
),
)
forward(self, x)
¶
reset_module_state(net)
¶
Reset a torch module's state by calling its reset() method.
If the module is a torch.nn.Sequential, then the function applies itself recursively to the submodules of the Sequential net. If the module does not have a reset() method, nothing happens.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
net | Module | The torch module whose state will be reset. | required |
Source code in evotorch/neuroevolution/net/layers.py
def reset_module_state(net: nn.Module):
"""
Reset a torch module's state by calling its reset() method.
If the module is a torch.nn.Sequential, then the function
applies itself recursively to the submodules of the Sequential net.
If the module does not have a reset() method, nothing happens.
Args:
net: The torch module whose state will be reset.
"""
if hasattr(net, "reset"):
net.reset()
elif isinstance(net, nn.Sequential):
for i_module in range(len(net)):
reset_module_state(net[i_module])
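A brief usage sketch (the composed policy is a made-up example) showing the recursive reset:
import torch
from torch import nn
from evotorch.neuroevolution.net.layers import LSTMNet, reset_module_state

policy = nn.Sequential(nn.Linear(4, 4), LSTMNet(input_size=4, hidden_size=4))
_ = policy(torch.randn(4))  # the LSTMNet inside now holds a hidden state
reset_module_state(policy)  # recurses into the Sequential and resets the LSTMNet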
misc
¶
Utilities for reading and for writing neural network parameters
count_parameters(net)
¶
Get the number of parameters of the network.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
net | Module | The torch module whose parameters will be counted. | required |
Returns:
Type | Description |
---|---|
int | The number of parameters, as an integer. |
Source code in evotorch/neuroevolution/net/misc.py
fill_parameters(net, vector)
¶
Fill the parameters of a torch module (net) from a vector.
No gradient information is kept.
The vector's length must be exactly the same as the number of parameters of the PyTorch module.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
net | Module | The torch module whose parameter values will be filled. | required |
vector | Tensor | A 1-D torch tensor which stores the parameter values. | required |
Source code in evotorch/neuroevolution/net/misc.py
@torch.no_grad()
def fill_parameters(net: nn.Module, vector: torch.Tensor):
"""Fill the parameters of a torch module (net) from a vector.
No gradient information is kept.
The vector's length must be exactly the same with the number
of parameters of the PyTorch module.
Args:
net: The torch module whose parameter values will be filled.
vector: A 1-D torch tensor which stores the parameter values.
"""
address = 0
for p in net.parameters():
d = p.data.view(-1)
n = len(d)
d[:] = torch.as_tensor(vector[address : address + n], device=d.device)
address += n
if address != len(vector):
raise IndexError("The parameter vector is larger than expected")
parameter_vector(net, *, device=None)
¶
Get all the parameters of a torch module (net) into a vector
No gradient information is kept.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
net | Module | The torch module whose parameters will be extracted. | required |
device | Union[str, torch.device] | The device in which the parameter vector will be constructed. If the network has parameters across multiple devices, you can specify this argument so that the concatenation of all the parameters will be successful. | None |
Returns:
Type | Description |
---|---|
Tensor | The parameters of the module in a 1-D tensor. |
Source code in evotorch/neuroevolution/net/misc.py
@torch.no_grad()
def parameter_vector(net: nn.Module, *, device: Optional[Device] = None) -> torch.Tensor:
"""Get all the parameters of a torch module (net) into a vector
No gradient information is kept.
Args:
net: The torch module whose parameters will be extracted.
device: The device in which the parameter vector will be constructed.
If the network has parameter across multiple devices,
you can specify this argument so that concatenation of all the
parameters will be successful.
Returns:
The parameters of the module in a 1-D tensor.
"""
dev_kwarg = {} if device is None else {"device": device}
all_vectors = []
for p in net.parameters():
all_vectors.append(torch.as_tensor(p.data.view(-1), **dev_kwarg))
return torch.cat(all_vectors)
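A brief round-trip sketch combining parameter_vector with fill_parameters (the Linear module is an illustrative stand-in):
import torch
from torch import nn
from evotorch.neuroevolution.net.misc import fill_parameters, parameter_vector

net = nn.Linear(3, 2)
vec = parameter_vector(net)      # 1-D tensor of length 8 (6 weights + 2 biases)
fill_parameters(net, vec * 0.0)  # write zeros back into the module's parameters
print(parameter_vector(net).abs().sum())  # tensor(0.)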
parser
¶
Utilities for parsing string representations of neural net policies
NetParsingError (Exception)
¶
Representation of a parsing error
Source code in evotorch/neuroevolution/net/parser.py
class NetParsingError(Exception):
    """
    Representation of a parsing error
    """

    def __init__(
        self,
        message: str,
        lineno: Optional[int] = None,
        col_offset: Optional[int] = None,
        original_error: Optional[Exception] = None,
    ):
        """
        `__init__(...)`: Initialize the NetParsingError.
        Args:
            message: Error message, as string.
            lineno: Erroneous line number in the string representation of the
                neural network structure.
            col_offset: Erroneous column number in the string representation
                of the neural network structure.
            original_error: If another error caused this parsing error,
                that original error can be attached to this `NetParsingError`
                instance via this argument.
        """
        super().__init__()
        self.message = message
        self.lineno = lineno
        self.col_offset = col_offset
        self.original_error = original_error

    def _to_string(self) -> str:
        parts = []
        parts.append(type(self).__name__)
        if self.lineno is not None:
            parts.append(" at line(")
            parts.append(str(self.lineno - 1))
            parts.append(")")
        if self.col_offset is not None:
            parts.append(" at column(")
            parts.append(str(self.col_offset + 1))
            parts.append(")")
        parts.append(": ")
        parts.append(self.message)
        return "".join(parts)

    def __str__(self) -> str:
        return self._to_string()

    def __repr__(self) -> str:
        return self._to_string()
__init__(self, message, lineno=None, col_offset=None, original_error=None)
special
¶
__init__(...)
: Initialize the NetParsingError.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
message | str | Error message, as string. | required |
lineno | Optional[int] | Erroneous line number in the string representation of the neural network structure. | None |
col_offset | Optional[int] | Erroneous column number in the string representation of the neural network structure. | None |
original_error | Optional[Exception] | If another error caused this parsing error, that original error can be attached to this `NetParsingError` instance via this argument. | None |
Source code in evotorch/neuroevolution/net/parser.py
def __init__(
    self,
    message: str,
    lineno: Optional[int] = None,
    col_offset: Optional[int] = None,
    original_error: Optional[Exception] = None,
):
    """
    `__init__(...)`: Initialize the NetParsingError.
    Args:
        message: Error message, as string.
        lineno: Erroneous line number in the string representation of the
            neural network structure.
        col_offset: Erroneous column number in the string representation
            of the neural network structure.
        original_error: If another error caused this parsing error,
            that original error can be attached to this `NetParsingError`
            instance via this argument.
    """
    super().__init__()
    self.message = message
    self.lineno = lineno
    self.col_offset = col_offset
    self.original_error = original_error
str_to_net(s, **constants)
¶
Read a string representation of a neural net structure,
and return a `torch.nn.Module` instance out of it.
Let us imagine that one wants to describe the following neural network structure:

from torch import nn

net = nn.Sequential(
    nn.Linear(8, 16),
    nn.Tanh(),
    nn.Linear(16, 4, bias=False),
    nn.ReLU()
)

By using `str_to_net(...)` one can construct the same module via:

from evotorch.neuroevolution.net import str_to_net

net = str_to_net(
    'Linear(8, 16) >> Tanh() >> Linear(16, 4, bias=False) >> ReLU()'
)

The string can also be multi-line:

net = str_to_net(
    '''
    Linear(8, 16)
    >> Tanh()
    >> Linear(16, 4, bias=False)
    >> ReLU()
    '''
)

One can also define constants for using them in strings:

net = str_to_net(
    '''
    Linear(input_size, hidden_size)
    >> Tanh()
    >> Linear(hidden_size, output_size, bias=False)
    >> ReLU()
    ''',
    input_size=8,
    hidden_size=16,
    output_size=4
)

In the neural net structure string, when one refers to a module type,
say, `Linear`, the name `Linear` is first searched for in the namespace
`evotorch.neuroevolution.net.layers`, and then in the namespace `torch.nn`.
In the case of `Linear`, the searched name exists in `torch.nn`, and
therefore, the layer type to be instantiated is accepted as `torch.nn.Linear`.
If, instead of `Linear`, one had used the name, say, `StructuredControlNet`,
then the layer type to be instantiated would be
`evotorch.neuroevolution.net.layers.StructuredControlNet`.
Notes regarding usage with `evotorch.neuroevolution.GymNE`:
While instantiating a `GymNE`, one can specify a neural net structure string
as the policy. Therefore, while filling the policy string for a `GymNE`, all
the rules mentioned above apply. Additionally, while using `str_to_net(...)`
internally, `GymNE` defines these extra constants: `obs_length` (length of
the observation vector), `act_length` (length of the action vector for
continuous-action environments, or the number of actions for discrete-action
environments), and `obs_shape` (shape of the observation as a tuple, assuming
that the observation space is of type `gym.spaces.Box`, usable within the
string like `obs_shape[0]`, `obs_shape[1]`, etc., or simply `obs_shape` to
refer to the entire tuple).
Therefore, while using with `GymNE`, one can define a single-hidden-layered
policy via this string:

'Linear(obs_length, 16) >> Tanh() >> Linear(16, act_length) >> Tanh()'

(where one might choose to omit the last `Tanh()`, as `GymNE` will clip the
output of the final layer to conform with the action boundaries of the
environment, which one might think of as a type of hard-tanh anyway).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
s | str | The string which expresses the neural net structure. | required |
Returns:
Type | Description |
---|---|
Module | The PyTorch module of the specified structure. |
Source code in evotorch/neuroevolution/net/parser.py
def str_to_net(s: str, **constants) -> nn.Module:
    """
    Read a string representation of a neural net structure,
    and return a `torch.nn.Module` instance out of it.
    Let us imagine that one wants to describe the following
    neural network structure:
    ```python
    from torch import nn

    net = nn.Sequential(
        nn.Linear(8, 16),
        nn.Tanh(),
        nn.Linear(16, 4, bias=False),
        nn.ReLU()
    )
    ```
    By using `str_to_net(...)` one can construct the same
    module via:
    ```python
    from evotorch.neuroevolution.net import str_to_net

    net = str_to_net(
        'Linear(8, 16) >> Tanh() >> Linear(16, 4, bias=False) >> ReLU()'
    )
    ```
    The string can also be multi-line:
    ```python
    net = str_to_net(
        '''
        Linear(8, 16)
        >> Tanh()
        >> Linear(16, 4, bias=False)
        >> ReLU()
        '''
    )
    ```
    One can also define constants for using them in strings:
    ```python
    net = str_to_net(
        '''
        Linear(input_size, hidden_size)
        >> Tanh()
        >> Linear(hidden_size, output_size, bias=False)
        >> ReLU()
        ''',
        input_size=8,
        hidden_size=16,
        output_size=4
    )
    ```
    In the neural net structure string, when one refers to a module type,
    say, `Linear`, first the name `Linear` is searched for in the namespace
    `evotorch.neuroevolution.net.layers`, and then in the namespace `torch.nn`.
    In the case of `Linear`, the searched name exists in `torch.nn`,
    and therefore, the layer type to be instantiated is accepted as
    `torch.nn.Linear`.
    Instead of `Linear`, if one had used the name, say,
    `StructuredControlNet`, then, the layer type to be instantiated
    would be `evotorch.neuroevolution.net.layers.StructuredControlNet`.
    **Notes regarding usage with `evotorch.neuroevolution.GymNE`:**
    While instantiating a `GymNE`, one can specify a neural net
    structure string as the policy. Therefore, while filling the policy
    string for a `GymNE`, all these rules mentioned above apply.
    Additionally, while using `str_to_net(...)` internally,
    `GymNE` defines these extra constants:
    `obs_length` (length of the observation vector),
    `act_length` (length of the action vector for continuous-action
    environments, or number of actions for discrete-action
    environments), and `obs_shape` (shape of the observation as a tuple,
    assuming that the observation space is of type `gym.spaces.Box`,
    usable within the string like `obs_shape[0]`, `obs_shape[1]`, etc.,
    or simply `obs_shape` to refer to the entire tuple).
    Therefore, while using with `GymNE`, one can define a
    single-hidden-layered policy via this string:
    ```python
    'Linear(obs_length, 16) >> Tanh() >> Linear(16, act_length) >> Tanh()'
    ```
    (where one might choose to omit the last `Tanh()`, as `GymNE`
    will clip the output of the final layer to conform with the
    action boundaries of the environment, which one might think of as a
    type of hard-tanh anyway).
    Args:
        s: The string which expresses the neural net structure.
    Returns:
        The PyTorch module of the specified structure.
    """
    s = f"(\n{s}\n)"
    return _process_expr(ast.parse(s, mode="eval").body, constants=constants)
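As an illustrative sketch (the structure string and the layer sizes here are arbitrary), the module returned by `str_to_net(...)` is an ordinary `torch.nn.Module` and can be called directly:
```python
import torch

from evotorch.neuroevolution.net import str_to_net

# Parse a structure string into a regular torch.nn.Module
policy = str_to_net("Linear(6, 32) >> Tanh() >> Linear(32, 3)")

# Use it like any other module
obs = torch.randn(6)
print(policy(obs).shape)  # torch.Size([3])
```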
rl
¶
This namespace contains various RL-specific utilities.
ActClipLayer (Module)
¶
Source code in evotorch/neuroevolution/net/rl.py
forward(self, x)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
    Although the recipe for forward pass needs to be defined within
    this function, one should call the `Module` instance afterwards
    instead of this since the former takes care of running the
    registered hooks while the latter silently ignores them.
ObsNormLayer (Module)
¶
Observation normalization layer for a policy network
Source code in evotorch/neuroevolution/net/rl.py
class ObsNormLayer(nn.Module):
    """Observation normalization layer for a policy network"""

    def __init__(self, stats: RunningStat, trainable_stats: bool):
        """`__init__(...)`: Initialize the observation normalization layer
        Args:
            stats: The RunningStat object storing the mean and stdev of
                all of the observations.
            trainable_stats: Whether or not the normalization data
                are to be stored as trainable parameters.
        """
        nn.Module.__init__(self)
        mean = torch.tensor(stats.mean, dtype=torch.float32)
        stdev = torch.tensor(stats.stdev, dtype=torch.float32)
        if trainable_stats:
            self.obs_mean = nn.Parameter(mean)
            self.obs_stdev = nn.Parameter(stdev)
        else:
            self.obs_mean = mean
            self.obs_stdev = stdev

    def forward(self, x):
        x = x - self.obs_mean
        x = x / self.obs_stdev
        return x
__init__(self, stats, trainable_stats)
special
¶
__init__(...)
: Initialize the observation normalization layer
Parameters:
Name | Type | Description | Default |
---|---|---|---|
stats | RunningStat | The RunningStat object storing the mean and stdev of all of the observations. | required |
trainable_stats | bool | Whether or not the normalization data are to be stored as trainable parameters. | required |
Source code in evotorch/neuroevolution/net/rl.py
def __init__(self, stats: RunningStat, trainable_stats: bool):
    """`__init__(...)`: Initialize the observation normalization layer
    Args:
        stats: The RunningStat object storing the mean and stdev of
            all of the observations.
        trainable_stats: Whether or not the normalization data
            are to be stored as trainable parameters.
    """
    nn.Module.__init__(self)
    mean = torch.tensor(stats.mean, dtype=torch.float32)
    stdev = torch.tensor(stats.stdev, dtype=torch.float32)
    if trainable_stats:
        self.obs_mean = nn.Parameter(mean)
        self.obs_stdev = nn.Parameter(stdev)
    else:
        self.obs_mean = mean
        self.obs_stdev = stdev
forward(self, x)
¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
.. note::
    Although the recipe for forward pass needs to be defined within
    this function, one should call the `Module` instance afterwards
    instead of this since the former takes care of running the
    registered hooks while the latter silently ignores them.
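A minimal usage sketch, assuming `ObsNormLayer` and `RunningStat` are imported from the module paths shown above (`evotorch.neuroevolution.net.rl` and `evotorch.neuroevolution.net.runningstat`), and using random data in place of real observations:
```python
import numpy as np
import torch

from evotorch.neuroevolution.net.rl import ObsNormLayer
from evotorch.neuroevolution.net.runningstat import RunningStat

stats = RunningStat()
for _ in range(100):
    stats.update(np.random.randn(4))  # pretend these are real observations

norm = ObsNormLayer(stats, trainable_stats=False)
obs = torch.randn(4)
normalized = norm(obs)  # computes (obs - mean) / stdev
```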
reset_env(env)
¶
Reset a gym environment.
For gym 1.0, the plan is to have a `reset(...)` method which returns
a two-element tuple `(observation, info)`, where `info` is an object
providing any additional information regarding the initial state of
the agent. However, the old (pre 1.0) gym API (and some environments
which were written with old gym compatibility in mind) has (or have)
a `reset(...)` method which returns a single object that is the
initial observation.
With the assumption that the observation space of the environment
is NOT a tuple, this function can work with both pre-1.0 and (hopefully)
after-1.0 versions of gym, and always returns the initial observation.
Please do not use this function on environments whose observation
spaces are tuples, because then this function cannot distinguish between
environments whose `reset(...)` methods return a tuple and environments
whose `reset(...)` methods return a single observation object that
happens to be a tuple.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
env | Env | The gym environment which will be reset. | required |
Returns:
Type | Description |
---|---|
Iterable | The initial observation |
Source code in evotorch/neuroevolution/net/rl.py
def reset_env(env: gym.Env) -> Iterable:
    """
    Reset a gym environment.
    For gym 1.0, the plan is to have a `reset(...)` method which returns
    a two-element tuple `(observation, info)`, where `info` is an object
    providing any additional information regarding the initial state of
    the agent. However, the old (pre 1.0) gym API (and some environments
    which were written with old gym compatibility in mind) has (or have)
    a `reset(...)` method which returns a single object that is the
    initial observation.
    With the assumption that the observation space of the environment
    is NOT a tuple, this function can work with both pre-1.0 and (hopefully)
    after-1.0 versions of gym, and always returns the initial observation.
    Please do not use this function on environments whose observation
    spaces are tuples, because then this function cannot distinguish between
    environments whose `reset(...)` methods return a tuple and environments
    whose `reset(...)` methods return a single observation object that
    happens to be a tuple.
    Args:
        env: The gym environment which will be reset.
    Returns:
        The initial observation
    """
    result = env.reset()
    if isinstance(result, tuple) and (len(result) == 2):
        result = result[0]
    return result
take_step_in_env(env, action)
¶
Take a step in the gym environment. Taking a step means performing the action provided via the arguments.
For gym 1.0, the plan is to have a `step(...)` method which returns a
5-element tuple containing `observation`, `reward`, `terminated`,
`truncated`, `info`, where `terminated` is a boolean indicating whether
or not the episode is terminated because of the actions taken within the
environment, and `truncated` is a boolean indicating whether or not the
episode is finished because the time limit is reached.
However, the old (pre 1.0) gym API (and some environments which were
written with old gym compatibility in mind) has (or have) a `step(...)`
method which returns 4 elements: `observation`, `reward`, `done`, `info`,
where `done` is a boolean indicating whether or not the episode is
"done", either because of termination or because of truncation.
This function can work with both pre-1.0 and (hopefully) after-1.0
versions of gym, and always returns the 4-element tuple as its result.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
env | Env | The gym environment in which the given action will be performed. | required |
Returns:
Type | Description |
---|---|
tuple | A tuple in the form `(observation, reward, done, info)`, where `observation` is the observation received after performing the action, `reward` is the amount of reward gained, `done` is a boolean value indicating whether or not the episode has ended, and `info` is additional information (usually as a dictionary). |
Source code in evotorch/neuroevolution/net/rl.py
def take_step_in_env(env: gym.Env, action: Iterable) -> tuple:
    """
    Take a step in the gym environment.
    Taking a step means performing the action provided via the arguments.
    For gym 1.0, the plan is to have a `step(...)` method which returns a
    5-element tuple containing `observation`, `reward`, `terminated`,
    `truncated`, `info`, where `terminated` is a boolean indicating whether
    or not the episode is terminated because of the actions taken within the
    environment, and `truncated` is a boolean indicating whether or not the
    episode is finished because the time limit is reached.
    However, the old (pre 1.0) gym API (and some environments which were
    written with old gym compatibility in mind) has (or have) a `step(...)`
    method which returns 4 elements: `observation`, `reward`, `done`, `info`,
    where `done` is a boolean indicating whether or not the episode is
    "done", either because of termination or because of truncation.
    This function can work with both pre-1.0 and (hopefully) after-1.0
    versions of gym, and always returns the 4-element tuple as its result.
    Args:
        env: The gym environment in which the given action will be performed.
    Returns:
        A tuple in the form `(observation, reward, done, info)` where
        `observation` is the observation received after performing the action,
        `reward` is the amount of reward gained,
        `done` is a boolean value indicating whether or not the episode has
        ended, and
        `info` is additional information (usually as a dictionary).
    """
    result = env.step(action)
    if isinstance(result, tuple):
        n = len(result)
        if n == 4:
            observation, reward, done, info = result
        elif n == 5:
            observation, reward, terminated, truncated, info = result
            done = terminated or truncated
        else:
            raise ValueError(
                f"The result of the `step(...)` method of the gym environment"
                f" was expected as a tuple of length 4 or 5."
                f" However, the received result is {repr(result)}, which is"
                f" of length {len(result)}."
            )
    else:
        raise TypeError(
            f"The result of the `step(...)` method of the gym environment"
            f" was expected as a tuple of length 4 or 5."
            f" However, the received result is {repr(result)}, which is"
            f" of type {type(result)}."
        )
    return observation, reward, done, info
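A minimal episode-rollout sketch combining `reset_env(...)` and `take_step_in_env(...)`; the environment name and the random policy are illustrative only:
```python
import gym

from evotorch.neuroevolution.net.rl import reset_env, take_step_in_env

env = gym.make("CartPole-v1")
observation = reset_env(env)  # initial observation, whatever the gym version
done = False
total_reward = 0.0
while not done:
    action = env.action_space.sample()  # random policy, for illustration only
    observation, reward, done, info = take_step_in_env(env, action)
    total_reward += reward
print(total_reward)
```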
runningstat
¶
RunningStat
¶
Tool for efficiently computing the mean and stdev of arrays. The arrays themselves are not stored separately; instead, they are accumulated.
Source code in evotorch/neuroevolution/net/runningstat.py
class RunningStat:
    """
    Tool for efficiently computing the mean and stdev of arrays.
    The arrays themselves are not stored separately;
    instead, they are accumulated.
    """

    def __init__(self):
        """
        ``__init__(...)``: Initialize the RunningStat.
        In the beginning, the number of arrays is 0,
        and the sum and the sum of squares are set as NaN.
        """
        self.reset()

    def reset(self):
        """
        Reset the RunningStat to its initial state.
        """
        self._sum = float("nan")
        self._sumsq = float("nan")
        self._count = 0

    def _increment(self, s, ssq, c):
        if self._count == 0:
            self._sum = np.array(s, dtype="float32")
            self._sumsq = np.array(ssq, dtype="float32")
        else:
            self._sum += s
            self._sumsq += ssq
        self._count += c

    @property
    def count(self) -> int:
        """
        Get the number of arrays accumulated.
        """
        return self._count

    @property
    def sum(self) -> np.ndarray:
        """
        Get the sum of all accumulated arrays.
        """
        return self._sum

    @property
    def sum_of_squares(self) -> np.ndarray:
        """
        Get the sum of squares of all accumulated arrays.
        """
        return self._sumsq

    @property
    def mean(self) -> np.ndarray:
        """
        Get the mean of all accumulated arrays.
        """
        return self._sum / self._count

    @property
    def stdev(self) -> np.ndarray:
        """
        Get the standard deviation of all accumulated arrays.
        """
        # Var[x] = E[x^2] - E[x]^2, clipped from below at 1e-2 so that
        # a later division by the stdev cannot blow up.
        return np.sqrt(np.maximum(self._sumsq / self._count - np.square(self.mean), 1e-2))

    def update(self, x: Union[np.ndarray, "RunningStat"]):
        """
        Accumulate more data into the RunningStat object.
        If the argument is an array, that array is added
        as one more data element.
        If the argument is another RunningStat instance,
        all the stats accumulated by that RunningStat object
        are added into this RunningStat object.
        """
        if isinstance(x, RunningStat):
            if x.count > 0:
                self._increment(x.sum, x.sum_of_squares, x.count)
        else:
            self._increment(x, np.square(x), 1)

    def normalize(self, x: Union[np.ndarray, list]) -> np.ndarray:
        """
        Normalize the array x according to the accumulated stats.
        """
        x = np.array(x, dtype="float32")
        x -= self.mean
        x /= self.stdev
        return x

    def __copy__(self):
        return deepcopy(self)

    def __get_repr(self):
        return "<RunningStat, count: " + str(self._count) + ">"

    def __str__(self):
        return self.__get_repr()

    def __repr__(self):
        return self.__get_repr()
count: int
property
readonly
¶
Get the number of arrays accumulated.
mean: ndarray
property
readonly
¶
Get the mean of all accumulated arrays.
stdev: ndarray
property
readonly
¶
Get the standard deviation of all accumulated arrays.
sum: ndarray
property
readonly
¶
Get the sum of all accumulated arrays.
sum_of_squares: ndarray
property
readonly
¶
Get the sum of squares of all accumulated arrays.
__init__(self)
special
¶
__init__(...)
: Initialize the RunningStat.
In the beginning, the number of arrays is 0, and the sum and the sum of squares are set as NaN.
Source code in evotorch/neuroevolution/net/runningstat.py
def __init__(self):
    """
    ``__init__(...)``: Initialize the RunningStat.
    In the beginning, the number of arrays is 0,
    and the sum and the sum of squares are set as NaN.
    """
    self.reset()
normalize(self, x)
¶
Normalize the array x according to the accumulated stats.
reset(self)
¶
Reset the RunningStat to its initial state.
update(self, x)
¶
Accumulate more data into the RunningStat object. If the argument is an array, that array is added as one more data element. If the argument is another RunningStat instance, all the stats accumulated by that RunningStat object are added into this RunningStat object.
Source code in evotorch/neuroevolution/net/runningstat.py
def update(self, x: Union[np.ndarray, "RunningStat"]):
    """
    Accumulate more data into the RunningStat object.
    If the argument is an array, that array is added
    as one more data element.
    If the argument is another RunningStat instance,
    all the stats accumulated by that RunningStat object
    are added into this RunningStat object.
    """
    if isinstance(x, RunningStat):
        if x.count > 0:
            self._increment(x.sum, x.sum_of_squares, x.count)
    else:
        self._increment(x, np.square(x), 1)
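A minimal usage sketch of `RunningStat`, using random data in place of real observations. Only the running sum and sum of squares are stored, so the mean is `sum / count` and the stdev is `sqrt(max(sumsq / count - mean**2, 1e-2))`:
```python
import numpy as np

from evotorch.neuroevolution.net.runningstat import RunningStat

stats = RunningStat()
for _ in range(1000):
    stats.update(np.random.randn(8))  # accumulate one array at a time

print(stats.count)       # 1000
print(stats.mean.shape)  # (8,)
z = stats.normalize(np.random.randn(8))  # (x - mean) / stdev
```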
supervisedne
¶
SupervisedNE (NEProblem)
¶
Representation of a neuro-evolution problem where the goal is to minimize a loss function in a supervised learning setting.
A supervised learning problem can be defined via subclassing this class
and overriding the methods `_loss(y_hat, y)` (which is to define how the
loss is computed) and `_make_dataloader()` (which is to define how a new
DataLoader is created).
Alternatively, this class can be directly instantiated as follows:

def my_loss_function(output_of_network, desired_output):
    loss = ...  # compute the loss here
    return loss

problem = SupervisedNE(
    my_dataset,
    MyTorchModuleClass,
    my_loss_function,
    minibatch_size=...,
    ...
)
Source code in evotorch/neuroevolution/supervisedne.py
class SupervisedNE(NEProblem):
    """
    Representation of a neuro-evolution problem where the goal is to minimize
    a loss function in a supervised learning setting.
    A supervised learning problem can be defined via subclassing this class
    and overriding the methods
    `_loss(y_hat, y)` (which is to define how the loss is computed)
    and `_make_dataloader()` (which is to define how a new DataLoader is
    created).
    Alternatively, this class can be directly instantiated as follows:
    ```python
    def my_loss_function(output_of_network, desired_output):
        loss = ...  # compute the loss here
        return loss

    problem = SupervisedNE(
        my_dataset,
        MyTorchModuleClass,
        my_loss_function,
        minibatch_size=...,
        ...
    )
    ```
    """

    def __init__(
        self,
        dataset: Dataset,
        network: Union[str, nn.Module, Callable[[], nn.Module]],
        loss_func: Optional[Callable] = None,
        *,
        network_args: Optional[dict] = None,
        initial_bounds: Optional[BoundsPairLike] = (-0.00001, 0.00001),
        minibatch_size: Optional[int] = None,
        num_minibatches: Optional[int] = None,
        num_actors: Optional[Union[int, str]] = "num_devices",
        common_minibatch: bool = True,
        num_gpus_per_actor: Optional[Union[int, float, str]] = None,
        actor_config: Optional[dict] = None,
        num_subbatches: Optional[int] = None,
        subbatch_size: Optional[int] = None,
        device: Optional[Device] = None,
    ):
        """
        `__init__(...)`: Initialize the SupervisedNE.
        Args:
            dataset: The Dataset from which the minibatches will be pulled.
            network: A network structure string, or a Callable (which can be
                a class inheriting from `torch.nn.Module`, or a function
                which returns a `torch.nn.Module` instance), or an instance
                of `torch.nn.Module`.
                The object provided here determines the structure of the
                neural network whose parameters will be evolved.
                A network structure string is a string which can be processed
                by `evotorch.neuroevolution.net.str_to_net(...)`.
                Please see the documentation of the function
                `evotorch.neuroevolution.net.str_to_net(...)` to see how such
                a neural network structure string looks like.
            loss_func: Optionally a function (or a Callable object) which
                receives `y_hat` (the output generated by the neural network)
                and `y` (the desired output), and returns the loss as a
                scalar.
                This argument can also be left as None, in which case it will
                be expected that the method `_loss(self, y_hat, y)` is
                overriden by the inheriting class.
            network_args: Optionally a dict-like object, storing keyword
                arguments to be passed to the network while instantiating it.
            initial_bounds: Specifies an interval from which the values of the
                initial neural network parameters will be drawn.
            minibatch_size: Optionally an integer, describing the size of a
                minibatch when pulling data from the dataset.
                Can also be left as None, in which case it will be expected
                that the inheriting class overrides the method
                `_make_dataloader()` and defines how a new DataLoader is to be
                made.
            num_minibatches: An integer, specifying over how many minibatches
                will a single neural network be evaluated.
                If not specified, it will be assumed that the desired number
                of minibatches per network evaluation is 1.
            num_actors: Number of actors to create for parallelized
                evaluation of the solutions.
                Certain string values are also accepted.
                When given as "max" or as "num_cpus", the number of actors
                will be equal to the number of all available CPUs in the ray
                cluster.
                When given as "num_gpus", the number of actors will be
                equal to the number of all available GPUs in the ray
                cluster, and each actor will be assigned a GPU.
                When given as "num_devices", the number of actors will be
                equal to the minimum among the number of CPUs and the number
                of GPUs available in the cluster (or will be equal to the
                number of CPUs if there is no GPU), and each actor will be
                assigned a GPU (if available).
                If `num_actors` is given as "num_gpus" or "num_devices",
                the argument `num_gpus_per_actor` must not be used,
                and the `actor_config` dictionary must not contain the
                key "num_gpus".
                If `num_actors` is given as something other than "num_gpus"
                or "num_devices", and if you wish to assign GPUs to each
                actor, then please see the argument `num_gpus_per_actor`.
            common_minibatch: Whether or not the same minibatches will be
                used when evaluating the solutions.
            actor_config: A dictionary, representing the keyword arguments
                to be passed to the options(...) used when creating the
                ray actor objects. To be used for explicitly allocating
                resources per each actor.
                For example, for declaring that each actor is to use a GPU,
                one can pass `actor_config=dict(num_gpus=1)`.
                Can also be given as None (which is the default),
                if no such options are to be passed.
            num_gpus_per_actor: Number of GPUs to be allocated by each
                remote actor.
                The default behavior is to NOT allocate any GPU at all
                (which is the default behavior of the ray library as well).
                When given as a number `n`, each actor will be given
                `n` GPUs (where `n` can be an integer, or can be a `float`
                for fractional allocation).
                When given as a string "max", then the available GPUs
                across the entire ray cluster (or within the local computer
                in the simplest cases) will be equally distributed among
                the actors.
                When given as a string "all", then each actor will have
                access to all the GPUs (this will be achieved by suppressing
                the environment variable `CUDA_VISIBLE_DEVICES` for each
                actor).
                When the problem is not distributed (i.e. when there are
                no actors), this argument is expected to be left as None.
            num_subbatches: If `num_subbatches` is None (assuming that
                `subbatch_size` is also None), then, when evaluating a
                population, the population will be split into `n` pieces, `n`
                being the number of actors, and each actor will evaluate
                its assigned piece. If `num_subbatches` is an integer `m`,
                then the population will be split into `m` pieces,
                and actors will continually accept the next unevaluated
                piece as they finish their current tasks.
                The arguments `num_subbatches` and `subbatch_size` cannot
                be given values other than None at the same time.
                While using a distributed algorithm, this argument determines
                how many sub-batches will be generated, and therefore,
                how many gradients will be computed by the remote actors.
            subbatch_size: If `subbatch_size` is None (assuming that
                `num_subbatches` is also None), then, when evaluating a
                population, the population will be split into `n` pieces, `n`
                being the number of actors, and each actor will evaluate its
                assigned piece. If `subbatch_size` is an integer `m`,
                then the population will be split into pieces of size `m`,
                and actors will continually accept the next unevaluated
                piece as they finish their current tasks.
                When there can be significant difference across the solutions
                in terms of computational requirements, specifying a
                `subbatch_size` can be beneficial, because, while one
                actor is busy with a subbatch containing computationally
                challenging solutions, other actors can accept more
                tasks and save time.
                The arguments `num_subbatches` and `subbatch_size` cannot
                be given values other than None at the same time.
                While using a distributed algorithm, this argument determines
                the size of a sub-batch (or sub-population) sampled by a
                remote actor for computing a gradient.
                In distributed mode, it is expected that the population size
                is divisible by `subbatch_size`.
            device: Default device in which a new population will be generated
                and the neural networks will operate.
                If not specified, "cpu" will be used.
        """
        super().__init__(
            objective_sense="min",
            network=network,
            network_args=network_args,
            initial_bounds=initial_bounds,
            num_actors=num_actors,
            num_gpus_per_actor=num_gpus_per_actor,
            actor_config=actor_config,
            num_subbatches=num_subbatches,
            subbatch_size=subbatch_size,
            device=device,
        )
        self.dataset = dataset
        self.dataloader: DataLoader = None
        self._loss_func = loss_func
        self._minibatch_size = None if minibatch_size is None else int(minibatch_size)
        self._num_minibatches = 1 if num_minibatches is None else int(num_minibatches)
        self._common_minibatch = common_minibatch
        self._current_minibatches: Optional[list] = None

    def _make_dataloader(self) -> DataLoader:
        """
        Make a new DataLoader.
        This method, in its default state, does not contain an implementation.
        In the case where the `__init__` of `SupervisedNE` is not provided
        with a minibatch size, it will be expected that this method is
        overriden by the inheriting class and that the operation of creating
        a new DataLoader is defined here.
        Returns:
            The new DataLoader.
        """
        raise NotImplementedError

    def make_dataloader(self) -> DataLoader:
        """
        Make a new DataLoader.
        If the `__init__` of `SupervisedNE` was provided with a minibatch size
        via the argument `minibatch_size`, then a new DataLoader will be made
        with that minibatch size.
        Otherwise, it will be expected that the method `_make_dataloader(...)`
        was overriden to contain details regarding how the DataLoader should be
        created, and that method will be executed.
        Returns:
            The created DataLoader.
        """
        if self._minibatch_size is None:
            return self._make_dataloader()
        else:
            return DataLoader(self.dataset, shuffle=True, batch_size=self._minibatch_size)

    def _evaluate_using_minibatch(self, network: nn.Module, batch: Any) -> Union[float, torch.Tensor]:
        """
        Pass a minibatch through a network, and compute the loss.
        Args:
            network: The network using which the loss will be computed.
            batch: The minibatch that will be used as data.
        Returns:
            The loss.
        """
        with torch.no_grad():
            x, y = batch
            yhat = network(x)
            return self.loss(yhat, y)

    def _loss(self, y_hat: Any, y: Any) -> Union[float, torch.Tensor]:
        """
        The loss function.
        This method, in its default state, does not contain an implementation.
        In the case where `__init__` of `SupervisedNE` class was not given
        a loss function via the argument `loss_func`, it will be expected
        that this method is overriden by the inheriting class and that the
        operation of computing the loss is defined here.
        Args:
            y_hat: The output estimated by the network
            y: The desired output
        Returns:
            A scalar, representing the loss
        """
        raise NotImplementedError

    def loss(self, y_hat: Any, y: Any) -> Union[float, torch.Tensor]:
        """
        Run the loss function and return the loss.
        If the `__init__` of `SupervisedNE` class was given a loss
        function via the argument `loss_func`, then that loss function
        will be used. Otherwise, it will be expected that the method
        `_loss(...)` is overriden with a loss definition, and that method
        will be used to compute the loss.
        The computed loss will be returned.
        Args:
            y_hat: The output estimated by the network
            y: The desired output
        Returns:
            A scalar, representing the loss
        """
        if self._loss_func is None:
            return self._loss(y_hat, y)
        else:
            return self._loss_func(y_hat, y)

    def _prepare(self) -> None:
        self.dataloader = self.make_dataloader()

    def get_minibatch(self) -> Any:
        """
        Get the next minibatch from the DataLoader.
        """
        if self.dataloader is None:
            self._prepare()
        try:
            batch = next(self.dataloader_iterator)
            if batch is None:
                self.dataloader_iterator = iter(self.dataloader)
                batch = self.get_minibatch()
        except Exception:
            self.dataloader_iterator = iter(self.dataloader)
            batch = self.get_minibatch()
        # Move batch to device of network
        return [var.to(self.network_device) for var in batch]

    def _evaluate_network(self, network: nn.Module) -> torch.Tensor:
        loss = 0.0
        for batch_idx in range(self._num_minibatches):
            if not self._common_minibatch:
                self._current_minibatch = self.get_minibatch()
            else:
                self._current_minibatch = self._current_minibatches[batch_idx]
            loss += self._evaluate_using_minibatch(network, self._current_minibatch) / self._num_minibatches
        return loss

    def _evaluate_batch(self, batch: SolutionBatch):
        if self._common_minibatch:
            # If using a common data batch, generate them now and use them
            # for the entire batch of solutions
            self._current_minibatches = [self.get_minibatch() for _ in range(self._num_minibatches)]
        return super()._evaluate_batch(batch)
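As an illustrative sketch of direct instantiation (the toy dataset, the network structure string, and the hyperparameter choices below are arbitrary, and `num_actors=None` is used here just to keep evaluation in the main process):
```python
import torch
from torch import nn
from torch.utils.data import TensorDataset

from evotorch.neuroevolution import SupervisedNE

# A toy regression dataset: y is the sum of the inputs
X = torch.randn(256, 4)
Y = X.sum(dim=-1, keepdim=True)
dataset = TensorDataset(X, Y)

problem = SupervisedNE(
    dataset,
    "Linear(4, 16) >> Tanh() >> Linear(16, 1)",  # network structure string
    nn.functional.mse_loss,  # loss_func(y_hat, y) -> scalar
    minibatch_size=32,
    num_actors=None,  # no parallelization
)
```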
__init__(self, dataset, network, loss_func=None, *, network_args=None, initial_bounds=(-1e-05, 1e-05), minibatch_size=None, num_minibatches=None, num_actors='num_devices', common_minibatch=True, num_gpus_per_actor=None, actor_config=None, num_subbatches=None, subbatch_size=None, device=None)
special
¶
__init__(...)
: Initialize the SupervisedNE.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | Dataset | The Dataset from which the minibatches will be pulled. | required |
network | Union[str, torch.nn.modules.module.Module, Callable[[], torch.nn.modules.module.Module]] | A network structure string, or a Callable (which can be a class inheriting from `torch.nn.Module`, or a function which returns a `torch.nn.Module` instance), or an instance of `torch.nn.Module`. The object provided here determines the structure of the neural network whose parameters will be evolved. A network structure string is a string which can be processed by `evotorch.neuroevolution.net.str_to_net(...)`; please see the documentation of that function to see how such a string looks like. | required |
loss_func | Optional[Callable] | Optionally a function (or a Callable object) which receives `y_hat` (the output generated by the neural network) and `y` (the desired output), and returns the loss as a scalar. Can also be left as None, in which case it will be expected that the method `_loss(self, y_hat, y)` is overriden by the inheriting class. | None |
network_args | Optional[dict] | Optionally a dict-like object, storing keyword arguments to be passed to the network while instantiating it. | None |
initial_bounds | Union[Iterable[Union[float, Iterable[float], torch.Tensor]], evotorch.core.BoundsPair] | Specifies an interval from which the values of the initial neural network parameters will be drawn. | (-1e-05, 1e-05) |
minibatch_size | Optional[int] | Optionally an integer, describing the size of a minibatch when pulling data from the dataset. Can also be left as None, in which case it will be expected that the inheriting class overrides the method `_make_dataloader()` and defines how a new DataLoader is to be made. | None |
num_minibatches | Optional[int] | An integer, specifying over how many minibatches a single neural network will be evaluated. If not specified, it will be assumed that the desired number of minibatches per network evaluation is 1. | None |
num_actors | Union[int, str] | Number of actors to create for parallelized evaluation of the solutions. Certain string values are also accepted: "max" or "num_cpus" means the number of all available CPUs in the ray cluster; "num_gpus" means the number of all available GPUs in the ray cluster, with a GPU assigned to each actor; "num_devices" means the minimum among the number of CPUs and the number of GPUs available in the cluster (or the number of CPUs if there is no GPU), with a GPU assigned to each actor if available. If `num_actors` is "num_gpus" or "num_devices", the argument `num_gpus_per_actor` must not be used, and the `actor_config` dictionary must not contain the key "num_gpus". Otherwise, to assign GPUs to each actor, please see the argument `num_gpus_per_actor`. | 'num_devices' |
common_minibatch | bool | Whether or not the same minibatches will be used when evaluating the solutions. | True |
actor_config | Optional[dict] | A dictionary, representing the keyword arguments to be passed to the options(...) used when creating the ray actor objects. To be used for explicitly allocating resources per each actor. For example, for declaring that each actor is to use a GPU, one can pass `actor_config=dict(num_gpus=1)`. Can also be given as None (which is the default), if no such options are to be passed. | None |
num_gpus_per_actor | Union[int, float, str] | Number of GPUs to be allocated by each remote actor. The default behavior is to NOT allocate any GPU at all (which is the default behavior of the ray library as well). When given as a number `n`, each actor will be given `n` GPUs (where `n` can be an integer, or a `float` for fractional allocation). When given as "max", the available GPUs across the entire ray cluster (or within the local computer in the simplest cases) will be equally distributed among the actors. When given as "all", each actor will have access to all the GPUs (achieved by suppressing the environment variable `CUDA_VISIBLE_DEVICES` for each actor). When the problem is not distributed (i.e. when there are no actors), this argument is expected to be left as None. | None |
num_subbatches | Optional[int] | If `num_subbatches` is None (assuming that `subbatch_size` is also None), then, when evaluating a population, the population will be split into `n` pieces, `n` being the number of actors, and each actor will evaluate its assigned piece. If `num_subbatches` is an integer `m`, then the population will be split into `m` pieces, and actors will continually accept the next unevaluated piece as they finish their current tasks. The arguments `num_subbatches` and `subbatch_size` cannot be given values other than None at the same time. While using a distributed algorithm, this argument determines how many sub-batches will be generated, and therefore, how many gradients will be computed by the remote actors. | None |
subbatch_size | Optional[int] | If `subbatch_size` is None (assuming that `num_subbatches` is also None), then, when evaluating a population, the population will be split into `n` pieces, `n` being the number of actors, and each actor will evaluate its assigned piece. If `subbatch_size` is an integer `m`, then the population will be split into pieces of size `m`, and actors will continually accept the next unevaluated piece as they finish their current tasks. When the solutions differ significantly in their computational requirements, specifying a `subbatch_size` can be beneficial, because, while one actor is busy with a subbatch containing computationally challenging solutions, other actors can accept more tasks and save time. The arguments `num_subbatches` and `subbatch_size` cannot be given values other than None at the same time. While using a distributed algorithm, this argument determines the size of a sub-batch (or sub-population) sampled by a remote actor for computing a gradient. In distributed mode, it is expected that the population size is divisible by `subbatch_size`. | None |
device | Union[str, torch.device] | Default device in which a new population will be generated and the neural networks will operate. If not specified, "cpu" will be used. | None |
Source code in evotorch/neuroevolution/supervisedne.py
def __init__(
    self,
    dataset: Dataset,
    network: Union[str, nn.Module, Callable[[], nn.Module]],
    loss_func: Optional[Callable] = None,
    *,
    network_args: Optional[dict] = None,
    initial_bounds: Optional[BoundsPairLike] = (-0.00001, 0.00001),
    minibatch_size: Optional[int] = None,
    num_minibatches: Optional[int] = None,
    num_actors: Optional[Union[int, str]] = "num_devices",
    common_minibatch: bool = True,
    num_gpus_per_actor: Optional[Union[int, float, str]] = None,
    actor_config: Optional[dict] = None,
    num_subbatches: Optional[int] = None,
    subbatch_size: Optional[int] = None,
    device: Optional[Device] = None,
):
    """
    `__init__(...)`: Initialize the SupervisedNE.
    Args:
        dataset: The Dataset from which the minibatches will be pulled.
        network: A network structure string, or a Callable (which can be
            a class inheriting from `torch.nn.Module`, or a function
            which returns a `torch.nn.Module` instance), or an instance
            of `torch.nn.Module`.
            The object provided here determines the structure of the
            neural network whose parameters will be evolved.
            A network structure string is a string which can be processed
            by `evotorch.neuroevolution.net.str_to_net(...)`.
            Please see the documentation of the function
            `evotorch.neuroevolution.net.str_to_net(...)` to see how such
            a neural network structure string looks like.
        loss_func: Optionally a function (or a Callable object) which
            receives `y_hat` (the output generated by the neural network)
            and `y` (the desired output), and returns the loss as a
            scalar.
            This argument can also be left as None, in which case it will
            be expected that the method `_loss(self, y_hat, y)` is
            overriden by the inheriting class.
        network_args: Optionally a dict-like object, storing keyword
            arguments to be passed to the network while instantiating it.
        initial_bounds: Specifies an interval from which the values of the
            initial neural network parameters will be drawn.
        minibatch_size: Optionally an integer, describing the size of a
            minibatch when pulling data from the dataset.
            Can also be left as None, in which case it will be expected
            that the inheriting class overrides the method
            `_make_dataloader()` and defines how a new DataLoader is to be
            made.
        num_minibatches: An integer, specifying over how many minibatches
            will a single neural network be evaluated.
            If not specified, it will be assumed that the desired number
            of minibatches per network evaluation is 1.
        num_actors: Number of actors to create for parallelized
            evaluation of the solutions.
            Certain string values are also accepted.
            When given as "max" or as "num_cpus", the number of actors
            will be equal to the number of all available CPUs in the ray
            cluster.
            When given as "num_gpus", the number of actors will be
            equal to the number of all available GPUs in the ray
            cluster, and each actor will be assigned a GPU.
            When given as "num_devices", the number of actors will be
            equal to the minimum among the number of CPUs and the number
            of GPUs available in the cluster (or will be equal to the
            number of CPUs if there is no GPU), and each actor will be
            assigned a GPU (if available).
            If `num_actors` is given as "num_gpus" or "num_devices",
            the argument `num_gpus_per_actor` must not be used,
            and the `actor_config` dictionary must not contain the
            key "num_gpus".
            If `num_actors` is given as something other than "num_gpus"
            or "num_devices", and if you wish to assign GPUs to each
            actor, then please see the argument `num_gpus_per_actor`.
        common_minibatch: Whether or not the same minibatches will be
            used when evaluating the solutions.
        actor_config: A dictionary, representing the keyword arguments
            to be passed to the options(...) used when creating the
            ray actor objects. To be used for explicitly allocating
            resources per each actor.
            For example, for declaring that each actor is to use a GPU,
            one can pass `actor_config=dict(num_gpus=1)`.
            Can also be given as None (which is the default),
            if no such options are to be passed.
        num_gpus_per_actor: Number of GPUs to be allocated by each
            remote actor.
            The default behavior is to NOT allocate any GPU at all
            (which is the default behavior of the ray library as well).
            When given as a number `n`, each actor will be given
            `n` GPUs (where `n` can be an integer, or can be a `float`
            for fractional allocation).
            When given as a string "max", then the available GPUs
            across the entire ray cluster (or within the local computer
            in the simplest cases) will be equally distributed among
            the actors.
            When given as a string "all", then each actor will have
            access to all the GPUs (this will be achieved by suppressing
            the environment variable `CUDA_VISIBLE_DEVICES` for each
            actor).
            When the problem is not distributed (i.e. when there are
            no actors), this argument is expected to be left as None.
        num_subbatches: If `num_subbatches` is None (assuming that
            `subbatch_size` is also None), then, when evaluating a
            population, the population will be split into `n` pieces, `n`
            being the number of actors, and each actor will evaluate
            its assigned piece. If `num_subbatches` is an integer `m`,
            then the population will be split into `m` pieces,
            and actors will continually accept the next unevaluated
            piece as they finish their current tasks.
            The arguments `num_subbatches` and `subbatch_size` cannot
            be given values other than None at the same time.
            While using a distributed algorithm, this argument determines
            how many sub-batches will be generated, and therefore,
            how many gradients will be computed by the remote actors.
        subbatch_size: If `subbatch_size` is None (assuming that
            `num_subbatches` is also None), then, when evaluating a
            population, the population will be split into `n` pieces, `n`
            being the number of actors, and each actor will evaluate its
            assigned piece. If `subbatch_size` is an integer `m`,
            then the population will be split into pieces of size `m`,
            and actors will continually accept the next unevaluated
            piece as they finish their current tasks.
            When there can be significant difference across the solutions
            in terms of computational requirements, specifying a
            `subbatch_size` can be beneficial, because, while one
            actor is busy with a subbatch containing computationally
            challenging solutions, other actors can accept more
            tasks and save time.
            The arguments `num_subbatches` and `subbatch_size` cannot
            be given values other than None at the same time.
            While using a distributed algorithm, this argument determines
            the size of a sub-batch (or sub-population) sampled by a
            remote actor for computing a gradient.
            In distributed mode, it is expected that the population size
            is divisible by `subbatch_size`.
        device: Default device in which a new population will be generated
            and the neural networks will operate.
            If not specified, "cpu" will be used.
    """
    super().__init__(
        objective_sense="min",
        network=network,
        network_args=network_args,
        initial_bounds=initial_bounds,
        num_actors=num_actors,
        num_gpus_per_actor=num_gpus_per_actor,
        actor_config=actor_config,
        num_subbatches=num_subbatches,
        subbatch_size=subbatch_size,
        device=device,
    )
    self.dataset = dataset
    self.dataloader: DataLoader = None
    self._loss_func = loss_func
    self._minibatch_size = None if minibatch_size is None else int(minibatch_size)
    self._num_minibatches = 1 if num_minibatches is None else int(num_minibatches)
    self._common_minibatch = common_minibatch
    self._current_minibatches: Optional[list] = None
get_minibatch(self)
¶
Get the next minibatch from the DataLoader.
Source code in evotorch/neuroevolution/supervisedne.py
def get_minibatch(self) -> Any:
    """
    Get the next minibatch from the DataLoader.
    """
    if self.dataloader is None:
        self._prepare()
    try:
        batch = next(self.dataloader_iterator)
        if batch is None:
            self.dataloader_iterator = iter(self.dataloader)
            batch = self.get_minibatch()
    except Exception:
        self.dataloader_iterator = iter(self.dataloader)
        batch = self.get_minibatch()
    # Move batch to device of network
    return [var.to(self.network_device) for var in batch]
loss(self, y_hat, y)
¶
Run the loss function and return the loss.
If the `__init__` of the `SupervisedNE` class was given a loss function
via the argument `loss_func`, then that loss function will be used.
Otherwise, it will be expected that the method `_loss(...)` is overriden
with a loss definition, and that method will be used to compute the loss.
The computed loss will be returned.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
y_hat | Any | The output estimated by the network | required |
y | Any | The desired output | required |
Returns:
Type | Description |
---|---|
Union[float, torch.Tensor] | A scalar, representing the loss |
Source code in evotorch/neuroevolution/supervisedne.py
def loss(self, y_hat: Any, y: Any) -> Union[float, torch.Tensor]:
    """
    Run the loss function and return the loss.
    If the `__init__` of `SupervisedNE` class was given a loss
    function via the argument `loss_func`, then that loss function
    will be used. Otherwise, it will be expected that the method
    `_loss(...)` is overriden with a loss definition, and that method
    will be used to compute the loss.
    The computed loss will be returned.
    Args:
        y_hat: The output estimated by the network
        y: The desired output
    Returns:
        A scalar, representing the loss
    """
    if self._loss_func is None:
        return self._loss(y_hat, y)
    else:
        return self._loss_func(y_hat, y)
make_dataloader(self)
¶
Make a new DataLoader.
If the `__init__` of `SupervisedNE` was provided with a minibatch size
via the argument `minibatch_size`, then a new DataLoader will be made
with that minibatch size.
Otherwise, it will be expected that the method `_make_dataloader(...)`
was overriden to contain details regarding how the DataLoader should be
created, and that method will be executed.
Returns:
Type | Description |
---|---|
DataLoader | The created DataLoader. |
Source code in evotorch/neuroevolution/supervisedne.py
def make_dataloader(self) -> DataLoader:
    """
    Make a new DataLoader.
    If the `__init__` of `SupervisedNE` was provided with a minibatch size
    via the argument `minibatch_size`, then a new DataLoader will be made
    with that minibatch size.
    Otherwise, it will be expected that the method `_make_dataloader(...)`
    was overriden to contain details regarding how the DataLoader should be
    created, and that method will be executed.
    Returns:
        The created DataLoader.
    """
    if self._minibatch_size is None:
        return self._make_dataloader()
    else:
        return DataLoader(self.dataset, shuffle=True, batch_size=self._minibatch_size)
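When no `minibatch_size` is given, one overrides `_make_dataloader()` instead. A minimal sketch (the subclass name and the DataLoader settings are illustrative only):
```python
from torch.utils.data import DataLoader

from evotorch.neuroevolution import SupervisedNE

class MySupervisedNE(SupervisedNE):
    """SupervisedNE variant that builds its own DataLoader."""

    # With no `minibatch_size` passed to `__init__`, `make_dataloader()`
    # falls back to this method, so the DataLoader is defined here.
    def _make_dataloader(self) -> DataLoader:
        return DataLoader(self.dataset, batch_size=64, shuffle=True, drop_last=True)
```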
operators
special
¶
This module provides various common operators to be used within evolutionary algorithms.
Each operator is provided as a separate class, which is to be instantiated in this form:
op = OperatorName(
    problem,  # where `problem` is a Problem instance
    hyperparameter1=...,
    hyperparameter2=...,
    ...
)

Each operator has its `__call__(...)` method overriden so that it can be
used like a function. For example, if the operator `op` instantiated above
were a mutation operator, it would be used like this:

# Apply mutation on a SolutionBatch
mutated_solution = op(my_solution_batch)

Please see the documentation of the provided operator classes for details
about how to instantiate them, and how to call them.
A common usage for the operators provided here is to use them with a
genetic algorithm. More specifically, the SteadyStateGA algorithm provided
within the namespace `evotorch.algorithms` needs to be configured so that
it knows which cross-over operator and which mutation operator it should
apply on the solutions. The way this is done is as follows:

import evotorch.algorithms as dra
import evotorch.operators as dro

problem = ...  # initialize the Problem

ga = dra.SteadyStateGA(problem, popsize=...)

# Configure the genetic algorithm to use
# simulated binary cross-over
ga.use(
    dro.SimulatedBinaryCrossOver(
        problem,
        tournament_size=...,
        cross_over_rate=...,
        eta=...
    )
)

# Configure the genetic algorithm to use
# Gaussian mutation
ga.use(
    dro.GaussianMutation(
        problem,
        stdev=...
    )
)
base
¶
Base classes for various operators
CopyingOperator (Operator)
¶
Base class for operators which do not do in-place modifications.
This class does not add any functionality to the Operator class.
Instead, the annotations of the `__call__(...)` method are updated
so that it is clear that a new SolutionBatch is returned.
One is expected to override the definition of the method `_do(...)`
in an inheriting subclass to define a custom `CopyingOperator`.
From outside, a subclass of `CopyingOperator` is meant to be called
like a function, as follows:

my_new_batch = my_copying_operator_instance(my_batch)
Source code in evotorch/operators/base.py
class CopyingOperator(Operator):
    """
    Base class for operators which do not do in-place modifications.
    This class does not add any functionality to the Operator class.
    Instead, the annotations of the `__call__(...)` method are updated
    so that it is clear that a new SolutionBatch is returned.
    One is expected to override the definition of the method `_do(...)`
    in an inheriting subclass to define a custom `CopyingOperator`.
    From outside, a subclass of `CopyingOperator` is meant to be called
    like a function, as follows:

        my_new_batch = my_copying_operator_instance(my_batch)
    """

    def __init__(self, problem: Problem):
        """
        `__init__(...)`: Initialize the CopyingOperator.
        Args:
            problem: The problem object which is being worked on.
        """
        super().__init__(problem)

    def __call__(self, batch: SolutionBatch) -> SolutionBatch:
        return self._do(batch)

    def _do(self, batch: SolutionBatch) -> SolutionBatch:
        """The actual definition of the operation on the batch.
        Expected to be overriden by a subclass.
        """
        raise NotImplementedError
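As an illustrative sketch of a custom `CopyingOperator` (the operator name and the noise logic are hypothetical; it assumes `CopyingOperator` is importable from `evotorch.operators`, and uses `SolutionBatch.access_values()` to obtain a writable view of the copied batch's decision values):
```python
from copy import deepcopy

import torch

from evotorch import Problem, SolutionBatch
from evotorch.operators import CopyingOperator

class AddUniformNoise(CopyingOperator):
    """Hypothetical operator: return a noisy copy of the given batch."""

    def __init__(self, problem: Problem, *, amount: float):
        super().__init__(problem)
        self._amount = float(amount)

    def _do(self, batch: SolutionBatch) -> SolutionBatch:
        result = deepcopy(batch)  # leave the original batch untouched
        data = result.access_values()  # writable view of the decision values
        data += (torch.rand_like(data) - 0.5) * 2 * self._amount
        return result
```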
__init__(self, problem)
special
¶
__init__(...)
: Initialize the CopyingOperator.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
problem | Problem | The problem object which is being worked on. | required |
CrossOver (CopyingOperator)
¶
Base class for any CrossOver operator.
One is expected to override the definition of the method
`_do_cross_over(...)` in an inheriting subclass to define a
custom `CrossOver`.
From outside, a `CrossOver` instance is meant to be called like this:

child_solution_batch = my_cross_over_instance(population_batch)

which causes the `CrossOver` instance to select parents from the
`population_batch`, recombine their values according to what is
instructed in `_do_cross_over(...)`, and return the newly made solutions
in a `SolutionBatch`.
Source code in evotorch/operators/base.py
class CrossOver(CopyingOperator):
"""
Base class for any CrossOver operator.
One is expected to override the definition of the method
`_do_cross_over(...)` in an inheriting subclass to define a
custom `CrossOver`.
From outside, a `CrossOver` instance is meant to be called like this:
child_solution_batch = my_cross_over_instance(population_batch)
which causes the `CrossOver` instance to select parents from the
`population_batch`, recombine their values according to what is
instructed in `_do_cross_over(...)`, and return the newly made solutions
in a `SolutionBatch`.
"""
def __init__(
self,
problem: Problem,
*,
tournament_size: int,
obj_index: Optional[int] = None,
num_children: Optional[int] = None,
cross_over_rate: Optional[float] = None,
):
"""
`__init__(...)`: Initialize the CrossOver.
Args:
problem: The problem object which is being worked on.
tournament_size: Size of the tournament which will be used for
doing selection.
obj_index: Index of the objective according to which the selection
will be done.
If `obj_index` is None and the problem is single-objective,
then the selection will be done according to that single
objective.
If `obj_index` is None and the problem is multi-objective,
then the selection will be done according to pareto-dominance
and crowding criteria, as done in NSGA-II.
If `obj_index` is an integer `i`, then the selection will be
done according to the i-th objective only, even when the
problem is multi-objective.
num_children: How many children to generate.
Expected as an even number.
Cannot be used together with `cross_over_rate`.
cross_over_rate: Rate of the cross-over operations in comparison
with the population size.
1.0 means that the number of generated children will be equal
to the original population size.
Cannot be used together with `num_children`.
"""
super().__init__(problem)
self._obj_index = None if obj_index is None else problem.normalize_obj_index(obj_index)
self._tournament_size = int(tournament_size)
if num_children is not None and cross_over_rate is not None:
raise ValueError(
"Received both `num_children` and `cross_over_rate` as values other than None."
" It was expected to receive both of them as None, or one of them as None,"
" but not both of them as values other than None."
)
self._num_children = None if num_children is None else int(num_children)
self._cross_over_rate = None if cross_over_rate is None else float(cross_over_rate)
def _compute_num_tournaments(self, batch: SolutionBatch) -> int:
if self._num_children is None and self._cross_over_rate is None:
# return len(batch) * 2
result = len(batch)
if (result % 2) != 0:
result += 1
return result
elif self._num_children is not None:
if (self._num_children % 2) != 0:
raise ValueError(
f"The initialization argument `num_children` was expected as an even number."
f" However, it was found as an odd number: {self._num_children}"
)
return self._num_children
elif self._cross_over_rate is not None:
f = len(batch) * self._cross_over_rate
result1 = math.ceil(f)
result2 = math.floor(f)
if result1 == result2:
result = result1
if (result % 2) != 0:
result += 1
else:
if (result1 % 2) == 0:
result = result1
else:
result = result2
return result
else:
assert False, "Exection should not have reached this point"
@property
def obj_index(self) -> Optional[int]:
"""The objective index according to which the selection will be done"""
return self._obj_index
@torch.no_grad()
def _do_tournament(self, batch: SolutionBatch) -> tuple:
# Compute the required number of tournaments
num_tournaments = self._compute_num_tournaments(batch)
if self._problem.is_multi_objective and self._obj_index is None:
# If the problem is multi-objective, and an objective index is not specified,
# then we do a multi-objective-specific cross-over
# At first, pareto-sort the solutions
fronts, ranks = batch.arg_pareto_sort()
# In NSGA-II-inspired pareto-sorting, smallest rank means the best front.
# Right now, we want the opposite: we want the solutions in the best front
# to have rank values which are numerically highest.
# The following line re-arranges the rank values such that the solutions
# in the best front have their ranks equal to len(fronts), and the ones
# in the worst front have their ranks equal to 1.
ranks = torch.as_tensor(len(fronts) - ranks, dtype=self._problem.eval_dtype, device=batch.device)
# Because the ranks are computed front the fronts indices, we expect many
# solutions to end up with the same rank values.
# To ensure that a randomized selection will be made when comparing two
# solutions with the same rank, we add random noise to the ranks
# (between 0.0 and 0.1).
ranks += self._problem.make_uniform(len(batch), dtype=self._problem.eval_dtype, device=batch.device) * 0.1
else:
# Rank the solutions. Worst gets -0.5, best gets 0.5
ranks = batch.utility(self._obj_index, ranking_method="centered")
# Get the internal values tensor of the solution batch
indata = batch._data
# Get a tensor of random integers in the shape (num_tournaments, tournament_size)
tournament_indices = self.problem.make_randint(
(num_tournaments, self._tournament_size), n=len(batch), device=indata.device
)
tournament_ranks = ranks[tournament_indices]
# Imagine tournament size is 2, and the solutions are [ worst, bad, best, good ].
# So, what we have is (0.2s are actually 0.166666...):
#
# ranks = [ -0.5, -0.2, 0.5, 0.2 ]
#
# tournament tournament
# indices ranks
#
# 0, 1 -0.5, -0.2
# 2, 3 0.5, 0.2
# 1, 0 -0.2, -0.5
# 3, 2 0.2, 0.5
# 1, 2 -0.2, 0.5
# 0, 3 -0.5, 0.2
# 2, 0 0.5, -0.5
# 3, 1 0.2, -0.2
#
# According to tournament_indices, there are 8 tournaments.
# In tournament 0 (topmost row), parent0 and parent1 compete.
# In tournament 1 (next row), parent2 and parent3 compete; and so on.
# tournament_ranks tells us:
# In tournament 0, left-candidate has rank -0.5, and right-candidate has -0.2.
# In tournament 1, left-candidate has rank 0.5, and right-candidate has 0.2; and so on.
tournament_rows = torch.arange(0, num_tournaments, device=indata.device)
parents = tournament_indices[tournament_rows, torch.argmax(tournament_ranks, dim=-1)]
# Continuing from the [ worst, bad, best, good ] example, we end up with:
#
# T T
# tournament tournament tournament argmax parents
# rows indices ranks dim=-1
#
# 0 0, 1 -0.5, -0.2 1 1
# 1 2, 3 0.5, 0.2 0 2
# 2 1, 0 -0.2, -0.5 0 1
# 3 3, 2 0.2, 0.5 1 2
# 4 1, 2 -0.2, 0.5 1 2
# 5 0, 3 -0.5, 0.2 1 3
# 6 2, 0 0.5, -0.5 0 2
# 7 3, 1 0.2, -0.2 0 3
#
# where tournament_rows represents row indices in tournament_indices tensor (from 0 to 7).
# argmax() tells us who won the competition (0: left-candidate won, 1: right-candidate won).
#
# tournament_rows and argmax() together give us the row and column of the winner in tensor
# tournament_indices, which in turn gives us the index of the winner solution in the batch.
# We split the parents array at the middle
split_point = int(len(parents) / 2)
parents1 = indata[parents][:split_point]
parents2 = indata[parents][split_point:]
# We now have:
#
# parents1 parents2
# =============== ===============
# values of sln 1 values of sln 2 (solution1 is to generate a child with solution2)
# values of sln 2 values of sln 3 (solution2 is to generate a child with solution3)
# values of sln 1 values of sln 2 (solution1 is to generate another child with solution2)
# values of sln 2 values of sln 3 (solution2 is to generate another child with solution3)
#
# With this, the tournament selection phase is over.
return parents1, parents2
def _do_cross_over(
self,
parents1: Union[torch.Tensor, ObjectArray],
parents2: Union[torch.Tensor, ObjectArray],
) -> SolutionBatch:
"""
The actual definition of the cross-over operation.
This is a protected method, meant to be overridden by the inheriting
subclass.
The arguments passed to this function are the decision values of the
first and the second halves of the selected parents, both as PyTorch
tensors or as `ObjectArray`s.
In the overriding function, for each integer i, one is expected to
recombine the values of the i-th row of `parents1` with the values of
the i-th row of `parents2` twice (twice because each pairing is
expected to generate two children).
After that, one is expected to generate a SolutionBatch and place
all the recombination results into the values of that new batch.
Args:
parents1: The decision values of the first half of the
selected parents.
parents2: The decision values of the second half of the
selected parents.
Returns:
A new SolutionBatch which contains the recombination
of the parents.
"""
raise NotImplementedError
def _make_children_batch(self, child_values: Union[torch.Tensor, ObjectArray]) -> SolutionBatch:
result = SolutionBatch(self.problem, device=child_values.device, empty=True, popsize=child_values.shape[0])
result._data = child_values
return result
def _do(self, batch: SolutionBatch) -> SolutionBatch:
parents1, parents2 = self._do_tournament(batch)
if len(parents1) != len(parents2):
raise ValueError(
f"_do_tournament() returned parents1 and parents2 with incompatible sizes. "
f"len(parents1): {len(parents1)}; len(parents2): {len(parents2)}."
)
return self._do_cross_over(parents1, parents2)
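As a concrete illustration of the contract described by `_do_cross_over(...)` above, here is a minimal, hypothetical subclass (not part of the library) implementing a uniform cross-over for tensor-valued problems. It assumes `CrossOver` is importable from `evotorch.operators`; each pairing produces two complementary children, as the base class requires:

```python
import torch
from evotorch import SolutionBatch
from evotorch.operators import CrossOver


class UniformCrossOver(CrossOver):
    """Hypothetical example: swap each gene between the parents with probability 0.5."""

    @torch.no_grad()
    def _do_cross_over(self, parents1: torch.Tensor, parents2: torch.Tensor) -> SolutionBatch:
        # For each pairing, draw a boolean mask deciding which parent donates each gene
        mask = self.problem.make_uniform_shaped_like(parents1) < 0.5
        children1 = torch.where(mask, parents1, parents2)
        children2 = torch.where(mask, parents2, parents1)
        # Each pairing yields two children, so the result has 2N rows
        return self._make_children_batch(torch.cat([children1, children2], dim=0))
```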
obj_index: Optional[int]
property
readonly
¶
The objective index according to which the selection will be done
__init__(self, problem, *, tournament_size, obj_index=None, num_children=None, cross_over_rate=None)
special
¶
__init__(...)
: Initialize the CrossOver.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
problem | Problem | The problem object which is being worked on. | required |
tournament_size | int | Size of the tournament which will be used for doing selection. | required |
obj_index | Optional[int] | Index of the objective according to which the selection will be done. If `obj_index` is None and the problem is single-objective, the selection is done according to that single objective. If `obj_index` is None and the problem is multi-objective, the selection is done according to pareto-dominance and crowding criteria, as done in NSGA-II. If `obj_index` is an integer `i`, the selection is done according to the i-th objective only, even when the problem is multi-objective. | None |
num_children | Optional[int] | How many children to generate. Expected as an even number. Cannot be used together with `cross_over_rate`. | None |
cross_over_rate | Optional[float] | Rate of the cross-over operations in comparison with the population size. 1.0 means that the number of generated children will be equal to the original population size. Cannot be used together with `num_children`. | None |
Source code in evotorch/operators/base.py
def __init__(
self,
problem: Problem,
*,
tournament_size: int,
obj_index: Optional[int] = None,
num_children: Optional[int] = None,
cross_over_rate: Optional[float] = None,
):
"""
`__init__(...)`: Initialize the CrossOver.
Args:
problem: The problem object which is being worked on.
tournament_size: Size of the tournament which will be used for
doing selection.
obj_index: Index of the objective according to which the selection
will be done.
If `obj_index` is None and the problem is single-objective,
then the selection will be done according to that single
objective.
If `obj_index` is None and the problem is multi-objective,
then the selection will be done according to pareto-dominance
and crowding criteria, as done in NSGA-II.
If `obj_index` is an integer `i`, then the selection will be
done according to the i-th objective only, even when the
problem is multi-objective.
num_children: How many children to generate.
Expected as an even number.
Cannot be used together with `cross_over_rate`.
cross_over_rate: Rate of the cross-over operations in comparison
with the population size.
1.0 means that the number of generated children will be equal
to the original population size.
Cannot be used together with `num_children`.
"""
super().__init__(problem)
self._obj_index = None if obj_index is None else problem.normalize_obj_index(obj_index)
self._tournament_size = int(tournament_size)
if num_children is not None and cross_over_rate is not None:
raise ValueError(
"Received both `num_children` and `cross_over_rate` as values other than None."
" It was expected to receive both of them as None, or one of them as None,"
" but not both of them as values other than None."
)
self._num_children = None if num_children is None else int(num_children)
self._cross_over_rate = None if cross_over_rate is None else float(cross_over_rate)
Operator
¶
Base class for various operations on SolutionBatch objects.
Some subclasses of Operator may be operating on the batches in-place, while some others may generate new batches, leaving the original batches untouched.
One is expected to override the definition of the method `_do(...)` in an inheriting subclass to define a custom `Operator`.
From outside, a subclass of Operator is meant to be called like a function. In more details, operators which apply in-place modifications are meant to be called like this:
my_operator_instance(my_batch)
Operators which return a new batch are meant to be called like this:
my_new_batch = my_operator_instance(my_batch)
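For instance, a minimal in-place operator could look like the following sketch (a hypothetical `Clamp` operator, not part of the library), which overrides `_do(...)` and modifies the batch's decision values directly:

```python
from evotorch import SolutionBatch
from evotorch.operators import Operator


class Clamp(Operator):
    """Hypothetical in-place operator: clamp all decision values into [-1, 1]."""

    def _do(self, batch: SolutionBatch):
        # access_values() gives a writable view; modifying it mutates the batch in-place
        batch.access_values().clamp_(-1.0, 1.0)
```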
Source code in evotorch/operators/base.py
class Operator:
"""Base class for various operations on SolutionBatch objects.
Some subclasses of Operator may be operating on the batches in-place,
while some others may generate new batches, leaving the original batches
untouched.
One is expected to override the definition of the method `_do(...)`
in an inheriting subclass to define a custom `Operator`.
From outside, a subclass of Operator is meant to be called like
a function. In more details, operators which apply in-place modifications
are meant to be called like this:
my_operator_instance(my_batch)
Operators which return a new batch are meant to be called like this:
my_new_batch = my_operator_instance(my_batch)
"""
def __init__(self, problem: Problem):
"""
`__init__(...)`: Initialize the Operator.
Args:
problem: The problem object which is being worked on.
"""
if not isinstance(problem, Problem):
raise TypeError(f"Expected a Problem object, but received {repr(problem)}")
self._problem = problem
self._lb = clone(self._problem.lower_bounds)
self._ub = clone(self._problem.upper_bounds)
@property
def problem(self) -> Problem:
"""Get the problem to which this cross-over operator is bound"""
return self._problem
@property
def dtype(self) -> DType:
"""Get the dtype of the bound problem.
If the problem does not work with SolutionVectors and
therefore it does not have a dtype, None is returned.
"""
return self.problem.dtype
@torch.no_grad()
def _respect_bounds(self, x: torch.Tensor) -> torch.Tensor:
"""
Make sure that a given PyTorch tensor respects the problem's bounds.
This is a protected method which might be used by the
inheriting subclasses to ensure that the result of their
various operations are clipped properly to respect the
boundaries set by the problem object.
Note that this function might return the tensor itself
if the problem is not bounded.
Args:
x: The PyTorch tensor to be clipped.
Returns:
The clipped tensor.
"""
if self._lb is not None:
self._lb = torch.as_tensor(self._lb, dtype=x.dtype, device=x.device)
x = torch.max(self._lb, x)
if self._ub is not None:
self._ub = torch.as_tensor(self._ub, dtype=x.dtype, device=x.device)
x = torch.min(self._ub, x)
return x
def __call__(self, batch: SolutionBatch):
"""
Apply the operator on the given batch.
"""
if not isinstance(batch, SolutionBatch):
raise TypeError(
f"The operation {self.__class__.__name__} can only work on"
f" SolutionBatch objects, but it received an object of type"
f" {repr(type(batch))}."
)
self._do(batch)
def _do(self, batch: SolutionBatch):
"""
The actual definition of the operation on the batch.
Expected to be overridden by a subclass.
"""
raise NotImplementedError
dtype: Union[str, torch.dtype, numpy.dtype, Type]
property
readonly
¶
Get the dtype of the bound problem. If the problem does not work with SolutionVectors and therefore it does not have a dtype, None is returned.
problem: Problem
property
readonly
¶
Get the problem to which this cross-over operator is bound
__call__(self, batch)
special
¶
Apply the operator on the given batch.
Source code in evotorch/operators/base.py
def __call__(self, batch: SolutionBatch):
"""
Apply the operator on the given batch.
"""
if not isinstance(batch, SolutionBatch):
raise TypeError(
f"The operation {self.__class__.__name__} can only work on"
f" SolutionBatch objects, but it received an object of type"
f" {repr(type(batch))}."
)
self._do(batch)
__init__(self, problem)
special
¶
__init__(...)
: Initialize the Operator.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
problem | Problem | The problem object which is being worked on. | required |
Source code in evotorch/operators/base.py
def __init__(self, problem: Problem):
"""
`__init__(...)`: Initialize the Operator.
Args:
problem: The problem object which is being worked on.
"""
if not isinstance(problem, Problem):
raise TypeError(f"Expected a Problem object, but received {repr(problem)}")
self._problem = problem
self._lb = clone(self._problem.lower_bounds)
self._ub = clone(self._problem.upper_bounds)
SingleObjOperator (Operator)
¶
Base class for all the operators which focus on only one objective.
One is expected to override the definition of the method `_do(...)` in an inheriting subclass to define a custom `SingleObjOperator`.
Source code in evotorch/operators/base.py
class SingleObjOperator(Operator):
"""
Base class for all the operators which focus on only one objective.
One is expected to override the definition of the method `_do(...)`
in an inheriting subclass to define a custom `SingleObjOperator`.
"""
def __init__(self, problem: Problem, obj_index: Optional[int] = None):
"""
Initialize the SingleObjOperator.
Args:
problem: The problem object which is being worked on.
obj_index: Index of the objective to focus on.
Can be given as None if the problem is single-objective.
"""
super().__init__(problem)
self._obj_index: int = problem.normalize_obj_index(obj_index)
@property
def obj_index(self) -> int:
"""Index of the objective on which this operator is to be applied"""
return self._obj_index
obj_index: int
property
readonly
¶
Index of the objective on which this operator is to be applied
__init__(self, problem, obj_index=None)
special
¶
Initialize the SingleObjOperator.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
problem | Problem | The problem object which is being worked on. | required |
obj_index | Optional[int] | Index of the objective to focus on. Can be given as None if the problem is single-objective. | None |
Source code in evotorch/operators/base.py
def __init__(self, problem: Problem, obj_index: Optional[int] = None):
"""
Initialize the SingleObjOperator.
Args:
problem: The problem object which is being worked on.
obj_index: Index of the objective to focus on.
Can be given as None if the problem is single-objective.
"""
super().__init__(problem)
self._obj_index: int = problem.normalize_obj_index(obj_index)
real
¶
This module contains operators defined to work with problems whose `dtype`s are real numbers (e.g. `torch.float32`).
CosynePermutation (CopyingOperator)
¶
Representation of permutation operation on a SolutionBatch.
For each decision variable index, a permutation operation across all or a subset of solutions is performed. The result is returned in a new SolutionBatch. The original SolutionBatch remains unmodified.
Reference:
F.Gomez, J.Schmidhuber, R.Miikkulainen (2008).
Accelerated Neural Evolution through Cooperatively Coevolved Synapses
Journal of Machine Learning Research 9, 937-965
Source code in evotorch/operators/real.py
class CosynePermutation(CopyingOperator):
"""
Representation of permutation operation on a SolutionBatch.
For each decision variable index, a permutation operation across
all or a subset of solutions, is performed.
The result is returned on a new SolutionBatch.
The original SolutionBatch remains unmodified.
Reference:
F.Gomez, J.Schmidhuber, R.Miikkulainen (2008).
Accelerated Neural Evolution through Cooperatively Coevolved Synapses
Journal of Machine Learning Research 9, 937-965
"""
def __init__(self, problem: Problem, obj_index: Optional[int] = None, *, permute_all: bool = False):
"""
`__init__(...)`: Initialize the CosynePermutation.
Args:
problem: The problem object to work on.
obj_index: The index of the objective according to which the
candidates for permutation will be selected.
Can be left as None if the problem is single-objective,
or if `permute_all` is given as True (in which case there
will be no candidate selection as the entire population will
be subject to permutation).
permute_all: Whether or not to apply permutation on the entire
population, instead of using a selective permutation.
"""
if permute_all:
if obj_index is not None:
raise ValueError(
"When `permute_all` is given as True (which seems to be the case)"
" `obj_index` is expected as None,"
" because the operator is independent of any objective and any fitness in this mode."
" However, `permute_all` was found to be something other than None."
)
self._obj_index = None
else:
self._obj_index = problem.normalize_obj_index(obj_index)
super().__init__(problem)
self._permute_all = bool(permute_all)
@property
def obj_index(self) -> Optional[int]:
"""Objective index according to which the operator will run.
If `permute_all` was given as True, objectives are irrelevant, in which case
`obj_index` is returned as None.
If `permute_all` was given as False, the relevant `obj_index` is provided
as an integer.
"""
return self._obj_index
@torch.no_grad()
def _do(self, batch: SolutionBatch) -> SolutionBatch:
indata = batch._data
if not self._permute_all:
n = batch.solution_length
ranks = batch.utility(self._obj_index, ranking_method="centered")
# fitnesses = batch.evals[:, self._obj_index].clone().reshape(-1)
# ranks = rank(
# fitnesses, ranking_method="centered", higher_is_better=(self.problem.senses[self.obj_index] == "max")
# )
prob_permute = (1 - (ranks + 0.5).pow(1 / float(n))).unsqueeze(1).expand(len(batch), batch.solution_length)
else:
prob_permute = torch.ones_like(indata)
perm_mask = self.problem.make_uniform_shaped_like(prob_permute) <= prob_permute
perm_mask_sorted = torch.sort(perm_mask.to(torch.long), descending=True, dim=0)[0].to(
torch.bool
) # Sort permutations
perm_rand = self.problem.make_uniform_shaped_like(prob_permute)
perm_rand[torch.logical_not(perm_mask)] = 1.0
permutations = torch.argsort(perm_rand, dim=0) # Generate permutations
perm_sort = (
torch.arange(0, perm_mask.shape[0], device=indata.device).unsqueeze(-1).repeat(1, perm_mask.shape[1])
)
perm_sort[torch.logical_not(perm_mask)] += perm_mask.shape[0] + 1
perm_sort = torch.sort(perm_sort, dim=0)[0] # Generate the origin of permutations
_, permutation_columns = torch.nonzero(perm_mask_sorted, as_tuple=True)
permutation_origin_indices = perm_sort[perm_mask_sorted]
permutation_target_indices = permutations[perm_mask_sorted]
newbatch = SolutionBatch(like=batch, empty=True)
newdata = newbatch._data
newdata[:] = indata[:]
newdata[permutation_origin_indices, permutation_columns] = newdata[
permutation_target_indices, permutation_columns
]
return newbatch
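The probability of a solution being selected for permutation, as used in `_do(...)` above, decays with the solution's centered rank. The following standalone sketch (solution length and rank values chosen arbitrarily) evaluates the formula for a few ranks:

```python
import torch

# prob_permute = 1 - (rank + 0.5) ** (1 / n), with centered ranks in [-0.5, 0.5]
n = 10                                    # solution length
ranks = torch.tensor([-0.5, 0.0, 0.5])    # worst, middle, best solution
prob_permute = 1 - (ranks + 0.5).pow(1 / float(n))
print(prob_permute)  # ~[1.000, 0.067, 0.000]: worse solutions are permuted more often
```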
obj_index: Optional[int]
property
readonly
¶
Objective index according to which the operator will run.
If `permute_all` was given as True, objectives are irrelevant, in which case `obj_index` is returned as None.
If `permute_all` was given as False, the relevant `obj_index` is provided as an integer.
__init__(self, problem, obj_index=None, *, permute_all=False)
special
¶
__init__(...)
: Initialize the CosynePermutation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
problem | Problem | The problem object to work on. | required |
obj_index | Optional[int] | The index of the objective according to which the candidates for permutation will be selected. Can be left as None if the problem is single-objective, or if `permute_all` is given as True (in which case there will be no candidate selection, as the entire population will be subject to permutation). | None |
permute_all | bool | Whether or not to apply permutation on the entire population, instead of using a selective permutation. | False |
Source code in evotorch/operators/real.py
def __init__(self, problem: Problem, obj_index: Optional[int] = None, *, permute_all: bool = False):
"""
`__init__(...)`: Initialize the CosynePermutation.
Args:
problem: The problem object to work on.
obj_index: The index of the objective according to which the
candidates for permutation will be selected.
Can be left as None if the problem is single-objective,
or if `permute_all` is given as True (in which case there
will be no candidate selection as the entire population will
be subject to permutation).
permute_all: Whether or not to apply permutation on the entire
population, instead of using a selective permutation.
"""
if permute_all:
if obj_index is not None:
raise ValueError(
"When `permute_all` is given as True (which seems to be the case)"
" `obj_index` is expected as None,"
" because the operator is independent of any objective and any fitness in this mode."
" However, `permute_all` was found to be something other than None."
)
self._obj_index = None
else:
self._obj_index = problem.normalize_obj_index(obj_index)
super().__init__(problem)
self._permute_all = bool(permute_all)
GaussianMutation (CopyingOperator)
¶
Gaussian mutation operator.
Follows the algorithm description in:
Sean Luke, 2013, Essentials of Metaheuristics, Lulu, second edition
available for free at http://cs.gmu.edu/~sean/book/metaheuristics/
Source code in evotorch/operators/real.py
class GaussianMutation(CopyingOperator):
"""
Gaussian mutation operator.
Follows the algorithm description in:
Sean Luke, 2013, Essentials of Metaheuristics, Lulu, second edition
available for free at http://cs.gmu.edu/~sean/book/metaheuristics/
"""
def __init__(self, problem: Problem, *, stdev: float, mutation_probability: Optional[float] = 1.0):
"""
`__init__(...)`: Initialize the GaussianMutation.
Args:
problem: The problem object to work with.
stdev: The standard deviation of the Gaussian noise to apply on
each decision variable.
mutation_probability: The probability of mutation, for each
decision variable.
By default, the value of this argument is 1.0, which means
that all of the decision variables will be affected by the
mutation.
"""
super().__init__(problem)
self._mutation_probability = float(mutation_probability)
self._stdev = float(stdev)
@torch.no_grad()
def _do(self, batch: SolutionBatch) -> SolutionBatch:
result = deepcopy(batch)
data = result.access_values()
mutation_matrix = self.problem.make_uniform_shaped_like(data) <= self._mutation_probability
data[mutation_matrix] += self._stdev * self.problem.make_gaussian_shaped_like(data[mutation_matrix])
data[:] = self._respect_bounds(data)
return result
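A minimal usage sketch follows. The sphere objective and all hyperparameter values below are illustrative assumptions, not library defaults:

```python
import torch
from evotorch import Problem
from evotorch.operators import GaussianMutation

# A toy single-objective minimization problem (sphere function)
problem = Problem("min", lambda x: torch.sum(x**2), solution_length=10, initial_bounds=(-1.0, 1.0))
mutation = GaussianMutation(problem, stdev=0.1, mutation_probability=0.5)

population = problem.generate_batch(20)  # a SolutionBatch of 20 solutions
mutated = mutation(population)           # CopyingOperator: the original batch stays unmodified
```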
__init__(self, problem, *, stdev, mutation_probability=1.0)
special
¶
__init__(...)
: Initialize the GaussianMutation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
problem | Problem | The problem object to work with. | required |
stdev | float | The standard deviation of the Gaussian noise to apply on each decision variable. | required |
mutation_probability | Optional[float] | The probability of mutation, for each decision variable. By default, the value of this argument is 1.0, which means that all of the decision variables will be affected by the mutation. | 1.0 |
Source code in evotorch/operators/real.py
def __init__(self, problem: Problem, *, stdev: float, mutation_probability: Optional[float] = 1.0):
"""
`__init__(...)`: Initialize the GaussianMutation.
Args:
problem: The problem object to work with.
stdev: The standard deviation of the Gaussian noise to apply on
each decision variable.
mutation_probability: The probability of mutation, for each
decision variable.
By default, the value of this argument is 1.0, which means
that all of the decision variables will be affected by the
mutation.
"""
super().__init__(problem)
self._mutation_probability = float(mutation_probability)
self._stdev = float(stdev)
OnePointCrossOver (CrossOver)
¶
Representation of a one-point cross-over operator.
When this operator is applied on a SolutionBatch, a tournament selection technique is used for selecting parent solutions from the batch, and then those parent solutions are mated via cutting from a random position and recombining. The result of these recombination operations is a new SolutionBatch, containing the children solutions. The original SolutionBatch stays unmodified.
Source code in evotorch/operators/real.py
class OnePointCrossOver(CrossOver):
"""
Representation of a one-point cross-over operator.
When this operator is applied on a SolutionBatch,
a tournament selection technique is used for selecting
parent solutions from the batch, and then those parent
solutions are mated via cutting from a random position
and recombining. The result of these recombination
operations is a new SolutionBatch, containing the children
solutions. The original SolutionBatch stays unmodified.
"""
def __init__(
self,
problem: Problem,
*,
tournament_size: int,
obj_index: Optional[int] = None,
num_children: Optional[int] = None,
cross_over_rate: Optional[float] = None,
):
"""
`__init__(...)`: Initialize the OnePointCrossOver.
Args:
problem: The problem object to work on.
tournament_size: What is the size (or length) of a tournament
when selecting a parent candidate from a population
obj_index: Objective index according to which the selection
will be done.
num_children: Optionally a number of children to produce by the
cross-over operation.
Not to be used together with `cross_over_rate`.
If `num_children` and `cross_over_rate` are both None,
then the number of children is equal to the number
of solutions received.
cross_over_rate: Optionally expected as a real number between
0.0 and 1.0. Specifies the number of cross-over operations
to perform. 1.0 means `1.0 * len(solution_batch)` amount of
cross overs will be performed, resulting in
`2.0 * len(solution_batch)` amount of children.
Not to be used together with `num_children`.
If `num_children` and `cross_over_rate` are both None,
then the number of children is equal to the number
of solutions received.
"""
super().__init__(
problem,
tournament_size=tournament_size,
obj_index=obj_index,
num_children=num_children,
cross_over_rate=cross_over_rate,
)
@torch.no_grad()
def _do_cross_over(self, parents1: torch.Tensor, parents2: torch.Tensor) -> SolutionBatch:
# What we expect here is this:
#
# parents1 parents2
# ========== ==========
# parents1[0] parents2[0]
# parents1[1] parents2[1]
# ... ...
# parents1[N] parents2[N]
#
# where parents1 and parents2 are 2D tensors, each containing values of N solutions.
# For each row i, we will apply cross-over on parents1[i] and parents2[i].
# From each cross-over, we will obtain 2 children.
# This means, there are N pairings, and 2N children.
num_pairings = parents1.shape[0]
# num_children = num_pairings * 2
device = parents1[0].device
dtype = parents1[0].dtype
solution_length = len(parents1[0])
# For each pairing, generate a gene index at which the parent solutions will be cut and recombined
crossover_point = self.problem.make_randint((num_pairings, 1), n=(solution_length - 1), device=device) + 1
# For each pairing, generate all gene indices (i.e. [0, 1, 2, ...] for each pairing)
gene_indices = (
torch.arange(0, solution_length, device=device).unsqueeze(0).expand(num_pairings, solution_length)
)
# Make a mask for crossing over. (0: take the value from one parent, 1: take the value from the other parent)
# For gene indices less than crossover_point of that pairing, the mask takes the value 0.
# Otherwise, the mask takes the value 1.
crossover_mask = (gene_indices >= crossover_point).to(dtype)
# Using the mask, generate two children.
children1 = crossover_mask * parents1 + (1 - crossover_mask) * parents2
children2 = crossover_mask * parents2 + (1 - crossover_mask) * parents1
# Combine the children tensors in one big tensor
children = torch.cat([children1, children2], dim=0)
# Write the children solutions into a new SolutionBatch, and return the new batch
result = self._make_children_batch(children)
return result
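The masking trick used in `_do_cross_over(...)` above can be seen in isolation on plain tensors (the values and cut points below are arbitrary; the operator draws the cut points randomly):

```python
import torch

parents1 = torch.tensor([[1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0]])
parents2 = torch.tensor([[9.0, 9.0, 9.0, 9.0], [8.0, 8.0, 8.0, 8.0]])
crossover_point = torch.tensor([[2], [1]])            # cut position per pairing
gene_indices = torch.arange(4).unsqueeze(0).expand(2, 4)
mask = (gene_indices >= crossover_point).to(torch.float32)
children1 = mask * parents1 + (1 - mask) * parents2   # [[9, 9, 1, 1], [8, 2, 2, 2]]
children2 = mask * parents2 + (1 - mask) * parents1   # [[1, 1, 9, 9], [2, 8, 8, 8]]
```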
__init__(self, problem, *, tournament_size, obj_index=None, num_children=None, cross_over_rate=None)
special
¶
__init__(...)
: Initialize the OnePointCrossOver.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
problem | Problem | The problem object to work on. | required |
tournament_size | int | The size (or length) of a tournament when selecting a parent candidate from a population. | required |
obj_index | Optional[int] | Objective index according to which the selection will be done. | None |
num_children | Optional[int] | Optionally a number of children to produce by the cross-over operation. Not to be used together with `cross_over_rate`. If `num_children` and `cross_over_rate` are both None, then the number of children is equal to the number of solutions received. | None |
cross_over_rate | Optional[float] | Optionally expected as a real number between 0.0 and 1.0. Specifies the number of cross-over operations to perform. 1.0 means `1.0 * len(solution_batch)` cross-overs will be performed, resulting in `2.0 * len(solution_batch)` children. Not to be used together with `num_children`. If `num_children` and `cross_over_rate` are both None, then the number of children is equal to the number of solutions received. | None |
Source code in evotorch/operators/real.py
def __init__(
self,
problem: Problem,
*,
tournament_size: int,
obj_index: Optional[int] = None,
num_children: Optional[int] = None,
cross_over_rate: Optional[float] = None,
):
"""
`__init__(...)`: Initialize the OnePointCrossOver.
Args:
problem: The problem object to work on.
tournament_size: What is the size (or length) of a tournament
when selecting a parent candidate from a population
obj_index: Objective index according to which the selection
will be done.
num_children: Optionally a number of children to produce by the
cross-over operation.
Not to be used together with `cross_over_rate`.
If `num_children` and `cross_over_rate` are both None,
then the number of children is equal to the number
of solutions received.
cross_over_rate: Optionally expected as a real number between
0.0 and 1.0. Specifies the number of cross-over operations
to perform. 1.0 means `1.0 * len(solution_batch)` amount of
cross overs will be performed, resulting in
`2.0 * len(solution_batch)` amount of children.
Not to be used together with `num_children`.
If `num_children` and `cross_over_rate` are both None,
then the number of children is equal to the number
of solutions received.
"""
super().__init__(
problem,
tournament_size=tournament_size,
obj_index=obj_index,
num_children=num_children,
cross_over_rate=cross_over_rate,
)
SimulatedBinaryCrossOver (CrossOver)
¶
Representation of a simulated binary cross-over (SBX).
When this operator is applied on a SolutionBatch, a tournament selection technique is used for selecting parent solutions from the batch, and then those parent solutions are mated via SBX. The generated children solutions are given in a new SolutionBatch. The original SolutionBatch stays unmodified.
Reference:
Kalyanmoy Deb, Hans-Georg Beyer (2001).
Self-Adaptive Genetic Algorithms with Simulated Binary Crossover.
Source code in evotorch/operators/real.py
class SimulatedBinaryCrossOver(CrossOver):
"""
Representation of a simulated binary cross-over (SBX).
When this operator is applied on a SolutionBatch,
a tournament selection technique is used for selecting
parent solutions from the batch, and then those parent
solutions are mated via SBX. The generated children
solutions are given in a new SolutionBatch.
The original SolutionBatch stays unmodified.
Reference:
Kalyanmoy Deb, Hans-Georg Beyer (2001).
Self-Adaptive Genetic Algorithms with Simulated Binary Crossover.
"""
def __init__(
self,
problem: Problem,
*,
tournament_size: int,
eta: float,
obj_index: Optional[int] = None,
num_children: Optional[int] = None,
cross_over_rate: Optional[float] = None,
):
"""
`__init__(...)`: Initialize the SimulatedBinaryCrossOver.
Args:
problem: Problem object to work with.
tournament_size: What is the size (or length) of a tournament
when selecting a parent candidate from a population.
eta: The crowding index, expected as a float.
Bigger eta values result in children closer
to their parents.
obj_index: Objective index according to which the selection
will be done.
num_children: Optionally a number of children to produce by the
cross-over operation.
Not to be used together with `cross_over_rate`.
If `num_children` and `cross_over_rate` are both None,
then the number of children is equal to the number
of solutions received.
cross_over_rate: Optionally expected as a real number between
0.0 and 1.0. Specifies the number of cross-over operations
to perform. 1.0 means `1.0 * len(solution_batch)` amount of
cross overs will be performed, resulting in
`2.0 * len(solution_batch)` amount of children.
Not to be used together with `num_children`.
If `num_children` and `cross_over_rate` are both None,
then the number of children is equal to the number
of solutions received.
"""
super().__init__(
problem,
tournament_size=int(tournament_size),
obj_index=obj_index,
num_children=num_children,
cross_over_rate=cross_over_rate,
)
self._eta = float(eta)
def _do_cross_over(self, parents1: torch.Tensor, parents2: torch.Tensor) -> SolutionBatch:
# Generate u_i values which determine the spread
u = self.problem.make_uniform_shaped_like(parents1)
# Compute beta_i values from u_i values as the actual spread per dimension
betas = (2 * u).pow(1.0 / (self._eta + 1.0)) # Compute all values for u_i < 0.5 first
betas[u > 0.5] = (1.0 / (2 * (1.0 - u[u > 0.5]))).pow(
1.0 / (self._eta + 1.0)
) # Replace the values for u_i >= 0.5
children1 = 0.5 * (
(1 + betas) * parents1 + (1 - betas) * parents2
) # Create the first set of children from the beta values
children2 = 0.5 * (
(1 + betas) * parents2 + (1 - betas) * parents1
) # Create the second set of children as a mirror of the first set of children
# Combine the children tensors in one big tensor
children = torch.cat([children1, children2], dim=0)
# Respect the lower and upper bounds defined by the problem object
children = self._respect_bounds(children)
# Write the children solutions into a new SolutionBatch, and return the new batch
result = self._make_children_batch(children)
return result
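To make the spread computation in `_do_cross_over(...)` concrete, the sketch below evaluates the beta values for a few `u` samples (the eta value is chosen arbitrarily):

```python
import torch

eta = 2.0
u = torch.tensor([0.1, 0.5, 0.9])
betas = (2 * u).pow(1.0 / (eta + 1.0))                             # branch for u <= 0.5
betas[u > 0.5] = (1.0 / (2 * (1.0 - u[u > 0.5]))).pow(1.0 / (eta + 1.0))
print(betas)  # ~[0.585, 1.000, 1.710]
# beta < 1: children contract toward the parents' mean;
# beta > 1: children expand beyond the parents.
```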
__init__(self, problem, *, tournament_size, eta, obj_index=None, num_children=None, cross_over_rate=None)
special
¶
__init__(...)
: Initialize the SimulatedBinaryCrossOver.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
problem | Problem | Problem object to work with. | required |
tournament_size | int | The size (or length) of a tournament when selecting a parent candidate from a population. | required |
eta | float | The crowding index, expected as a float. Bigger eta values result in children closer to their parents. | required |
obj_index | Optional[int] | Objective index according to which the selection will be done. | None |
num_children | Optional[int] | Optionally a number of children to produce by the cross-over operation. Not to be used together with `cross_over_rate`. If `num_children` and `cross_over_rate` are both None, then the number of children is equal to the number of solutions received. | None |
cross_over_rate | Optional[float] | Optionally expected as a real number between 0.0 and 1.0. Specifies the number of cross-over operations to perform. 1.0 means `1.0 * len(solution_batch)` cross-overs will be performed, resulting in `2.0 * len(solution_batch)` children. Not to be used together with `num_children`. If `num_children` and `cross_over_rate` are both None, then the number of children is equal to the number of solutions received. | None |
Source code in evotorch/operators/real.py
def __init__(
self,
problem: Problem,
*,
tournament_size: int,
eta: float,
obj_index: Optional[int] = None,
num_children: Optional[int] = None,
cross_over_rate: Optional[float] = None,
):
"""
`__init__(...)`: Initialize the SimulatedBinaryCrossOver.
Args:
problem: Problem object to work with.
tournament_size: What is the size (or length) of a tournament
when selecting a parent candidate from a population.
eta: The crowding index, expected as a float.
Bigger eta values result in children closer
to their parents.
obj_index: Objective index according to which the selection
will be done.
num_children: Optionally a number of children to produce by the
cross-over operation.
Not to be used together with `cross_over_rate`.
If `num_children` and `cross_over_rate` are both None,
then the number of children is equal to the number
of solutions received.
cross_over_rate: Optionally expected as a real number between
0.0 and 1.0. Specifies the number of cross-over operations
to perform. 1.0 means `1.0 * len(solution_batch)` amount of
cross overs will be performed, resulting in
`2.0 * len(solution_batch)` amount of children.
Not to be used together with `num_children`.
If `num_children` and `cross_over_rate` are both None,
then the number of children is equal to the number
of solutions received.
"""
super().__init__(
problem,
tournament_size=int(tournament_size),
obj_index=obj_index,
num_children=num_children,
cross_over_rate=cross_over_rate,
)
self._eta = float(eta)
sequence
¶
This module contains operators for problems whose solutions contain variable-length sequences (list-like objects).
CutAndSplice (CrossOver)
¶
Cut & Splice operator for variable-length solutions.
This class serves as a cross-over operator to be used on problems
with their dtype
s set as object
, and with their solutions
initialized to contain variable-length sequences (list-like objects).
Reference:
David E. Goldberg, Bradley Korb, Kalyanmoy Deb (1989).
Messy Genetic Algorithms: Motivation, Analysis, and First Results.
Complex Systems 3, 493-530.
Source code in evotorch/operators/sequence.py
class CutAndSplice(CrossOver):
"""Cut & Splice operator for variable-length solutions.
This class serves as a cross-over operator to be used on problems
with their `dtype`s set as `object`, and with their solutions
initialized to contain variable-length sequences (list-like objects).
Reference:
David E. Goldberg, Bradley Korb, Kalyanmoy Deb (1989).
Messy Genetic Algorithms: Motivation, Analysis, and First Results.
Complex Systems 3, 493-530.
"""
def _cut_and_splice(
self,
parents1: ObjectArray,
parents2: ObjectArray,
children1: SolutionBatch,
children2: SolutionBatch,
row_index: int,
):
parvals1 = parents1[row_index]
parvals2 = parents2[row_index]
length1 = len(parvals1)
length2 = len(parvals2)
cutpoint1 = int(self.problem.make_randint(tuple(), n=length1))
cutpoint2 = int(self.problem.make_randint(tuple(), n=length2))
childvals1 = parvals1[:cutpoint1]
childvals1.extend(parvals2[cutpoint2:])
childvals2 = parvals2[:cutpoint2]
childvals2.extend(parvals1[cutpoint1:])
children1.access_values(keep_evals=True)[row_index] = childvals1
children2.access_values(keep_evals=True)[row_index] = childvals2
def _do_cross_over(self, parents1: ObjectArray, parents2: ObjectArray) -> SolutionBatch:
n = len(parents1)
children1 = SolutionBatch(self.problem, popsize=n, empty=True)
children2 = SolutionBatch(self.problem, popsize=n, empty=True)
for i in range(n):
self._cut_and_splice(parents1, parents2, children1, children2, i)
return children1.concat(children2)
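The recombination performed by `_cut_and_splice(...)` can be sketched on two plain Python lists. The cut points below are hard-coded where the operator would draw them randomly via `make_randint(...)`:

```python
parent1 = [1, 2, 3, 4, 5]
parent2 = ["a", "b", "c"]
cutpoint1, cutpoint2 = 2, 1                          # normally drawn at random
child1 = parent1[:cutpoint1] + parent2[cutpoint2:]   # [1, 2, 'b', 'c']
child2 = parent2[:cutpoint2] + parent1[cutpoint1:]   # ['a', 3, 4, 5]
# Note: the children may differ in length from both parents,
# which is the point of this operator for variable-length solutions.
```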
optimizers
¶
Optimizers (like Adam or ClipUp) to be used with distribution-based search algorithms.
Adam (TorchOptimizer)
¶
The Adam optimizer.
Reference:
Kingma, D. P. and J. Ba (2015).
Adam: A method for stochastic optimization.
In Proceedings of 3rd International Conference on Learning Representations.
Source code in evotorch/optimizers.py
class Adam(TorchOptimizer):
"""
The Adam optimizer.
Reference:
Kingma, D. P. and J. Ba (2015).
Adam: A method for stochastic optimization.
In Proceedings of 3rd International Conference on Learning Representations.
"""
def __init__(
self,
*,
solution_length: int,
dtype: DType,
device: Device = "cpu",
stepsize: Optional[float] = None,
beta1: Optional[float] = None,
beta2: Optional[float] = None,
epsilon: Optional[float] = None,
amsgrad: Optional[bool] = None,
):
"""
`__init__(...)`: Initialize the Adam optimizer.
Args:
solution_length: Length of a solution of the problem which is
being worked on.
dtype: The dtype of the problem which is being worked on.
device: The device on which the solutions are kept.
stepsize: The step size (i.e. the learning rate) employed
by the optimizer.
beta1: The beta1 hyperparameter. None means the default.
beta2: The beta2 hyperparameter. None means the default.
epsilon: The epsilon hyperparameter. None means the default.
amsgrad: Whether or not to use the amsgrad behavior.
None means the default behavior.
See `torch.optim.Adam` for details.
"""
config = {}
if stepsize is not None:
config["lr"] = float(stepsize)
if beta1 is None and beta2 is None:
pass # nothing to do
elif beta1 is not None and beta2 is not None:
config["betas"] = (float(beta1), float(beta2))
else:
raise ValueError(
"The arguments beta1 and beta2 were expected"
" as both None, or as both real numbers."
" However, one of them was encountered as None and"
" the other was encountered as something other than None."
)
if epsilon is not None:
config["eps"] = float(epsilon)
if amsgrad is not None:
config["amsgrad"] = bool(amsgrad)
super().__init__(torch.optim.Adam, solution_length=solution_length, dtype=dtype, device=device, config=config)
__init__(self, *, solution_length, dtype, device='cpu', stepsize=None, beta1=None, beta2=None, epsilon=None, amsgrad=None)
special
¶
__init__(...)
: Initialize the Adam optimizer.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
solution_length | int | Length of a solution of the problem which is being worked on. | required |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | The dtype of the problem which is being worked on. | required |
device | Union[str, torch.device] | The device on which the solutions are kept. | 'cpu' |
stepsize | Optional[float] | The step size (i.e. the learning rate) employed by the optimizer. | None |
beta1 | Optional[float] | The beta1 hyperparameter. None means the default. | None |
beta2 | Optional[float] | The beta2 hyperparameter. None means the default. | None |
epsilon | Optional[float] | The epsilon hyperparameter. None means the default. | None |
amsgrad | Optional[bool] | Whether or not to use the amsgrad behavior. None means the default behavior. See `torch.optim.Adam` for details. | None |
Source code in evotorch/optimizers.py
def __init__(
self,
*,
solution_length: int,
dtype: DType,
device: Device = "cpu",
stepsize: Optional[float] = None,
beta1: Optional[float] = None,
beta2: Optional[float] = None,
epsilon: Optional[float] = None,
amsgrad: Optional[bool] = None,
):
"""
`__init__(...)`: Initialize the Adam optimizer.
Args:
solution_length: Length of a solution of the problem which is
being worked on.
dtype: The dtype of the problem which is being worked on.
device: The device on which the solutions are kept.
stepsize: The step size (i.e. the learning rate) employed
by the optimizer.
beta1: The beta1 hyperparameter. None means the default.
beta2: The beta2 hyperparameter. None means the default.
epsilon: The epsilon hyperparameter. None means the default.
amsgrad: Whether or not to use the amsgrad behavior.
None means the default behavior.
See `torch.optim.Adam` for details.
"""
config = {}
if stepsize is not None:
config["lr"] = float(stepsize)
if beta1 is None and beta2 is None:
pass # nothing to do
elif beta1 is not None and beta2 is not None:
config["betas"] = (float(beta1), float(beta2))
else:
raise ValueError(
"The arguments beta1 and beta2 were expected"
" as both None, or as both real numbers."
" However, one of them was encountered as None and"
" the other was encountered as something other than None."
)
if epsilon is not None:
config["eps"] = float(epsilon)
if amsgrad is not None:
config["amsgrad"] = bool(amsgrad)
super().__init__(torch.optim.Adam, solution_length=solution_length, dtype=dtype, device=device, config=config)
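A minimal usage sketch of this wrapper (the gradient values below are placeholders for a gradient estimated by a distribution-based search algorithm):

```python
import torch
from evotorch.optimizers import Adam

opt = Adam(solution_length=5, dtype="float32", stepsize=0.01)
grad_estimate = torch.randn(5)      # e.g. a gradient estimated by an ES algorithm
step = opt.ascent(grad_estimate)    # the step to add to the search distribution's center
```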
ClipUp
¶
The ClipUp optimizer.
Although this optimizer has the very same interface as SGD and Adam, it is not a PyTorch optimizer. Therefore, it does not inherit from TorchOptimizer.
Reference:
Toklu, N. E., Liskowski, P., & Srivastava, R. K. (2020, September).
ClipUp: A Simple and Powerful Optimizer for Distribution-Based Policy Evolution.
In International Conference on Parallel Problem Solving from Nature (pp. 515-527).
Springer, Cham.
Source code in evotorch/optimizers.py
class ClipUp:
"""
The ClipUp optimizer.
Although this optimizer has the very same interface as SGD and Adam,
it is not a PyTorch optimizer. Therefore, it does not inherit from
TorchOptimizer.
Reference:
Toklu, N. E., Liskowski, P., & Srivastava, R. K. (2020, September).
ClipUp: A Simple and Powerful Optimizer for Distribution-Based Policy Evolution.
In International Conference on Parallel Problem Solving from Nature (pp. 515-527).
Springer, Cham.
"""
def __init__(
self,
*,
solution_length: int,
dtype: DType,
stepsize: float,
momentum: float = 0.9,
max_speed: Optional[float] = None,
device: Device = "cpu",
):
"""
`__init__(...)`: Initialize the ClipUp optimizer.
Args:
solution_length: Length of a solution of the problem which is
being worked on.
dtype: The dtype of the problem which is being worked on.
stepsize: The step size (i.e. the learning rate) employed
by the optimizer.
momentum: The momentum coefficient.
max_speed: The maximum speed. If given as None, the
`max_speed` will be taken as two times the stepsize.
device: The device on which the solutions are kept.
"""
stepsize = float(stepsize)
momentum = float(momentum)
if max_speed is None:
max_speed = stepsize * 2.0
else:
max_speed = float(max_speed)
solution_length = int(solution_length)
if stepsize < 0.0:
raise ValueError(f"Invalid stepsize: {stepsize}")
if momentum < 0.0 or momentum > 1.0:
raise ValueError(f"Invalid momentum: {momentum}")
if max_speed < 0.0:
raise ValueError(f"Invalid max_speed: {max_speed}")
self._stepsize = stepsize
self._momentum = momentum
self._max_speed = max_speed
self._velocity: Optional[torch.Tensor] = torch.zeros(
solution_length, dtype=to_torch_dtype(dtype), device=device
)
self._dtype = to_torch_dtype(dtype)
self._device = device
@staticmethod
def _clip(x: torch.Tensor, limit: float) -> torch.Tensor:
with torch.no_grad():
normx = torch.norm(x)
if normx > limit:
ratio = limit / normx
return x * ratio
else:
return x
@torch.no_grad()
def ascent(self, globalg: RealOrVector, *, cloned_result: bool = True) -> torch.Tensor:
"""
Compute the ascent, i.e. the step to follow.
Args:
globalg: The estimated gradient.
cloned_result: If `cloned_result` is True, then the result is a
copy, guaranteed not to be the view of any other tensor
internal to the TorchOptimizer class.
If `cloned_result` is False, then the result is not a copy.
Use `cloned_result=False` only when you are sure that your
algorithm will never do direct modification on the ascent
vector it receives.
Important: if you set `cloned_result=False`, and do in-place
modifications on the returned result of `ascent(...)`, then
the internal velocity of ClipUp will be corrupted!
Returns:
The ascent vector, representing the step to follow.
"""
globalg = ensure_tensor_length_and_dtype(
globalg,
len(self._velocity),
dtype=self._dtype,
device=self._device,
about=f"{type(self).__name__}.ascent",
)
grad = (globalg / torch.norm(globalg)) * self._stepsize
self._velocity = self._clip((self._momentum * self._velocity) + grad, self._max_speed)
result = self._velocity
if cloned_result:
result = result.clone()
return result
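The core of `ascent(...)` above is: normalize the gradient to a fixed-length step, add it to a momentum-decayed velocity, and clip the velocity's norm. A standalone sketch on plain tensors, with arbitrary hyperparameter values:

```python
import torch

stepsize, momentum, max_speed = 0.15, 0.9, 0.3
velocity = torch.zeros(3)
gradient = torch.tensor([3.0, 4.0, 0.0])
step = (gradient / torch.norm(gradient)) * stepsize  # the step's length is always `stepsize`
velocity = momentum * velocity + step
speed = torch.norm(velocity)
if speed > max_speed:                                # clip the velocity's norm
    velocity = velocity * (max_speed / speed)
```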
__init__(self, *, solution_length, dtype, stepsize, momentum=0.9, max_speed=None, device='cpu')
special
¶
__init__(...)
: Initialize the ClipUp optimizer.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
solution_length | int | Length of a solution of the problem which is being worked on. | required |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | The dtype of the problem which is being worked on. | required |
stepsize | float | The step size (i.e. the learning rate) employed by the optimizer. | required |
momentum | float | The momentum coefficient. | 0.9 |
max_speed | Optional[float] | The maximum speed. If given as None, `max_speed` will be taken as two times the stepsize. | None |
device | Union[str, torch.device] | The device on which the solutions are kept. | 'cpu' |
Source code in evotorch/optimizers.py
def __init__(
self,
*,
solution_length: int,
dtype: DType,
stepsize: float,
momentum: float = 0.9,
max_speed: Optional[float] = None,
device: Device = "cpu",
):
"""
`__init__(...)`: Initialize the ClipUp optimizer.
Args:
solution_length: Length of a solution of the problem which is
being worked on.
dtype: The dtype of the problem which is being worked on.
stepsize: The step size (i.e. the learning rate) employed
by the optimizer.
momentum: The momentum coefficient.
max_speed: The maximum speed. If given as None, the
`max_speed` will be taken as two times the stepsize.
device: The device on which the solutions are kept.
"""
stepsize = float(stepsize)
momentum = float(momentum)
if max_speed is None:
max_speed = stepsize * 2.0
else:
max_speed = float(max_speed)
solution_length = int(solution_length)
if stepsize < 0.0:
raise ValueError(f"Invalid stepsize: {stepsize}")
if momentum < 0.0 or momentum > 1.0:
raise ValueError(f"Invalid momentum: {momentum}")
if max_speed < 0.0:
raise ValueError(f"Invalid max_speed: {max_speed}")
self._stepsize = stepsize
self._momentum = momentum
self._max_speed = max_speed
self._velocity: Optional[torch.Tensor] = torch.zeros(
solution_length, dtype=to_torch_dtype(dtype), device=device
)
self._dtype = to_torch_dtype(dtype)
self._device = device
ascent(self, globalg, *, cloned_result=True)
¶
Compute the ascent, i.e. the step to follow.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
globalg | Union[float, Iterable[float], torch.Tensor] | The estimated gradient. | required |
cloned_result | bool | If True, the result is a copy, guaranteed not to be the view of any other tensor internal to the optimizer. If False, the result is not a copy. Use `cloned_result=False` only when you are sure that your algorithm will never do direct modification on the ascent vector it receives; otherwise, the internal velocity of ClipUp will be corrupted. | True |
Returns:
Type | Description |
---|---|
Tensor | The ascent vector, representing the step to follow. |
Source code in evotorch/optimizers.py
@torch.no_grad()
def ascent(self, globalg: RealOrVector, *, cloned_result: bool = True) -> torch.Tensor:
"""
Compute the ascent, i.e. the step to follow.
Args:
globalg: The estimated gradient.
cloned_result: If `cloned_result` is True, then the result is a
copy, guaranteed not to be the view of any other tensor
internal to the TorchOptimizer class.
If `cloned_result` is False, then the result is not a copy.
Use `cloned_result=False` only when you are sure that your
algorithm will never do direct modification on the ascent
vector it receives.
Important: if you set `cloned_result=False`, and do in-place
modifications on the returned result of `ascent(...)`, then
the internal velocity of ClipUp will be corrupted!
Returns:
The ascent vector, representing the step to follow.
"""
globalg = ensure_tensor_length_and_dtype(
globalg,
len(self._velocity),
dtype=self._dtype,
device=self._device,
about=f"{type(self).__name__}.ascent",
)
grad = (globalg / torch.norm(globalg)) * self._stepsize
self._velocity = self._clip((self._momentum * self._velocity) + grad, self._max_speed)
result = self._velocity
if cloned_result:
result = result.clone()
return result
SGD (TorchOptimizer)
¶
The SGD optimizer.
Reference regarding the momentum behavior:
Polyak, B. T. (1964).
Some methods of speeding up the convergence of iteration methods.
USSR Computational Mathematics and Mathematical Physics, 4(5):1–17.
Reference regarding the Nesterov behavior:
Yurii Nesterov (1983).
A method for unconstrained convex minimization problem with the rate of convergence O(1/k^2).
Doklady AN SSSR (translated as Soviet Math. Dokl.), 269:543-547.
Source code in evotorch/optimizers.py
class SGD(TorchOptimizer):
"""
The SGD optimizer.
Reference regarding the momentum behavior:
Polyak, B. T. (1964).
Some methods of speeding up the convergence of iteration methods.
USSR Computational Mathematics and Mathematical Physics, 4(5):1–17.
Reference regarding the Nesterov behavior:
Yurii Nesterov (1983).
A method for unconstrained convex minimization problem with the rate of convergence O(1/k^2).
Doklady AN SSSR (translated as Soviet Math. Dokl.), 269:543-547.
"""
def __init__(
self,
*,
solution_length: int,
dtype: DType,
stepsize: float,
device: Device = "cpu",
momentum: Optional[float] = None,
dampening: Optional[bool] = None,
nesterov: Optional[bool] = None,
):
"""
`__init__(...)`: Initialize the SGD optimizer.
Args:
solution_length: Length of a solution of the problem which is
being worked on.
dtype: The dtype of the problem which is being worked on.
stepsize: The step size (i.e. the learning rate) employed
by the optimizer.
device: The device on which the solutions are kept.
momentum: The momentum coefficient. None means the default.
dampening: Whether or not to activate the dampening behavior.
None means the default.
See `torch.optim.SGD` for details.
nesterov: Whether or not to activate the nesterov behavior.
None means the default.
See `torch.optim.SGD` for details.
"""
config = {}
config["lr"] = float(stepsize)
if momentum is not None:
config["momentum"] = float(momentum)
if dampening is not None:
config["dampening"] = float(dampening)
if nesterov is not None:
config["nesterov"] = bool(nesterov)
super().__init__(torch.optim.SGD, solution_length=solution_length, dtype=dtype, device=device, config=config)
__init__(self, *, solution_length, dtype, stepsize, device='cpu', momentum=None, dampening=None, nesterov=None)
special
¶
__init__(...)
: Initialize the SGD optimizer.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
solution_length | int | Length of a solution of the problem which is being worked on. | required |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | The dtype of the problem which is being worked on. | required |
stepsize | float | The step size (i.e. the learning rate) employed by the optimizer. | required |
device | Union[str, torch.device] | The device on which the solutions are kept. | 'cpu' |
momentum | Optional[float] | The momentum coefficient. None means the default. | None |
dampening | Optional[bool] | Whether or not to activate the dampening behavior. None means the default. See `torch.optim.SGD` for details. | None |
nesterov | Optional[bool] | Whether or not to activate the nesterov behavior. None means the default. See `torch.optim.SGD` for details. | None |
Source code in evotorch/optimizers.py
def __init__(
self,
*,
solution_length: int,
dtype: DType,
stepsize: float,
device: Device = "cpu",
momentum: Optional[float] = None,
dampening: Optional[bool] = None,
nesterov: Optional[bool] = None,
):
"""
`__init__(...)`: Initialize the SGD optimizer.
Args:
solution_length: Length of a solution of the problem which is
being worked on.
dtype: The dtype of the problem which is being worked on.
stepsize: The step size (i.e. the learning rate) employed
by the optimizer.
device: The device on which the solutions are kept.
momentum: The momentum coefficient. None means the default.
dampening: Whether or not to activate the dampening behavior.
None means the default.
See `torch.optim.SGD` for details.
nesterov: Whether or not to activate the nesterov behavior.
None means the default.
See `torch.optim.SGD` for details.
"""
config = {}
config["lr"] = float(stepsize)
if momentum is not None:
config["momentum"] = float(momentum)
if dampening is not None:
config["dampening"] = float(dampening)
if nesterov is not None:
config["nesterov"] = bool(nesterov)
super().__init__(torch.optim.SGD, solution_length=solution_length, dtype=dtype, device=device, config=config)
TorchOptimizer
¶
Base class for using a PyTorch optimizer
Source code in evotorch/optimizers.py
class TorchOptimizer:
"""
Base class for using a PyTorch optimizer
"""
def __init__(
self,
torch_optimizer: Type,
*,
config: dict,
solution_length: int,
dtype: DType,
device: Device = "cpu",
):
"""
`__init__(...)`: Initialize the TorchOptimizer.
Args:
torch_optimizer: The class which represents a PyTorch optimizer.
config: The configuration dictionary to be passed to the optimizer
as keyword arguments.
solution_length: Length of a solution of the problem on which the
optimizer will work.
dtype: The dtype of the problem.
device: The device on which the solutions are kept.
"""
self._data = torch.empty(int(solution_length), dtype=to_torch_dtype(dtype), device=device)
self._optim = torch_optimizer([self._data], **config)
@torch.no_grad()
def ascent(self, globalg: RealOrVector, *, cloned_result: bool = True) -> torch.Tensor:
"""
Compute the ascent, i.e. the step to follow.
Args:
globalg: The estimated gradient.
cloned_result: If `cloned_result` is True, then the result is a
copy, guaranteed not to be the view of any other tensor
internal to the TorchOptimizer class.
If `cloned_result` is False, then the result is not a copy.
Use `cloned_result=False` only when you are sure that your
algorithm will never do direct modification on the ascent
vector it receives.
Returns:
The ascent vector, representing the step to follow.
"""
globalg = ensure_tensor_length_and_dtype(
globalg,
len(self._data),
dtype=self._data.dtype,
device=self._data.device,
about=f"{type(self).__name__}.ascent",
)
self._data.zero_()
self._data.grad = globalg
self._optim.step()
result = -1.0 * self._data
return result
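Note the trick in `ascent(...)` above: the wrapped optimizer minimizes, so starting from an all-zeros parameter vector and stepping against the supplied gradient lands at minus the intended step; negating the parameters recovers the ascent direction. A standalone demonstration on plain tensors:

```python
import torch

params = torch.zeros(3)
params.grad = torch.tensor([1.0, -2.0, 0.5])  # pretend this is the estimated gradient
torch.optim.SGD([params], lr=0.1).step()      # params is now [-0.1, 0.2, -0.05]
ascent = -1.0 * params                        # [0.1, -0.2, 0.05]: the step to follow
```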
__init__(self, torch_optimizer, *, config, solution_length, dtype, device='cpu')
special
¶
__init__(...)
: Initialize the TorchOptimizer.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
torch_optimizer | Type | The class which represents a PyTorch optimizer. | required |
config | dict | The configuration dictionary to be passed to the optimizer as keyword arguments. | required |
solution_length | int | Length of a solution of the problem on which the optimizer will work. | required |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | The dtype of the problem. | required |
device | Union[str, torch.device] | The device on which the solutions are kept. | 'cpu' |
Source code in evotorch/optimizers.py
def __init__(
self,
torch_optimizer: Type,
*,
config: dict,
solution_length: int,
dtype: DType,
device: Device = "cpu",
):
"""
`__init__(...)`: Initialize the TorchOptimizer.
Args:
torch_optimizer: The class which represents a PyTorch optimizer.
config: The configuration dictionary to be passed to the optimizer
as keyword arguments.
solution_length: Length of a solution of the problem on which the
optimizer will work.
dtype: The dtype of the problem.
device: The device on which the solutions are kept.
"""
self._data = torch.empty(int(solution_length), dtype=to_torch_dtype(dtype), device=device)
self._optim = torch_optimizer([self._data], **config)
ascent(self, globalg, *, cloned_result=True)
¶
Compute the ascent, i.e. the step to follow.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
globalg | Union[float, Iterable[float], torch.Tensor] | The estimated gradient. | required |
cloned_result | bool | If True, the result is a copy, guaranteed not to be the view of any other tensor internal to the TorchOptimizer class. If False, the result is not a copy. Use `cloned_result=False` only when you are sure that your algorithm will never do direct modification on the ascent vector it receives. | True |
Returns:
Type | Description |
---|---|
Tensor | The ascent vector, representing the step to follow. |
Source code in evotorch/optimizers.py
@torch.no_grad()
def ascent(self, globalg: RealOrVector, *, cloned_result: bool = True) -> torch.Tensor:
"""
Compute the ascent, i.e. the step to follow.
Args:
globalg: The estimated gradient.
cloned_result: If `cloned_result` is True, then the result is a
copy, guaranteed not to be the view of any other tensor
internal to the TorchOptimizer class.
If `cloned_result` is False, then the result is not a copy.
Use `cloned_result=False` only when you are sure that your
algorithm will never do direct modification on the ascent
vector it receives.
Returns:
The ascent vector, representing the step to follow.
"""
globalg = ensure_tensor_length_and_dtype(
globalg,
len(self._data),
dtype=self._data.dtype,
device=self._data.device,
about=f"{type(self).__name__}.ascent",
)
self._data.zero_()
self._data.grad = globalg
self._optim.step()
result = -1.0 * self._data
return result
get_optimizer_class(s, optimizer_config=None)
¶
Get the optimizer class from the given string.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
s | str | A string referring to the optimizer class. "clipsgd", "clipsga", and "clipup" refer to ClipUp. "adam" refers to Adam. "sgd" or "sga" refers to SGD. | required |
optimizer_config | Optional[dict] | A dictionary containing the configurations to be passed to the optimizer. If this argument is not None, then, instead of the class being referred to, a dynamically generated factory function will be returned, which will pass these configurations to the actual class upon being called. | None |
Returns:
Type | Description |
---|---|
Callable | The class, or a factory function instantiating that class. |
Source code in evotorch/optimizers.py
def get_optimizer_class(s: str, optimizer_config: Optional[dict] = None) -> Callable:
"""
Get the optimizer class from the given string.
Args:
s: A string, referring to the optimizer class.
"clipsgd", "clipsga", "clipup" refers to ClipUp.
"adam" refers to Adam.
"sgd" or "sga" refers to SGD.
optimizer_config: A dictionary containing the configurations to be
passed to the optimizer. If this argument is not None,
then, instead of the class being referred to, a dynamically
generated factory function will be returned, which will pass
these configurations to the actual class upon being called.
Returns:
The class, or a factory function instantiating that class.
"""
if s in ("clipsgd", "clipsga", "clipup"):
cls = ClipUp
elif s == "adam":
cls = Adam
elif s in ("sgd", "sga"):
cls = SGD
else:
raise ValueError(f"Unknown optimizer: {repr(s)}")
if optimizer_config is None:
return cls
else:
def f(*args, **kwargs):
nonlocal cls, optimizer_config
conf = {}
conf.update(optimizer_config)
conf.update(kwargs)
return cls(*args, **conf)
return f
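For illustration, the following sketch resolves a class and builds a configured factory; note that the configuration key "stepsize" is an assumption here for Adam's step size, not something this function validates:
cls = get_optimizer_class("clipup")  # the ClipUp class itself
factory = get_optimizer_class("adam", {"stepsize": 0.1})  # a factory function
# Calling factory(...) forwards {"stepsize": 0.1}, merged with any extra kwargs, to Adam.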
testing
¶
Utility functions for evotorch-related unit testing.
TestingError (Exception)
¶
assert_allclose(actual, desired, *, rtol=None, atol=None, equal_nan=True)
¶
This function is similar to `numpy.testing.assert_allclose(...)` except that `atol` and `rtol` are keyword-only arguments (which encourages one to be more explicit when writing tests) and that the default dtype is "float32" when the provided arguments are neither numpy arrays nor torch tensors. Having "float32" as the default target dtype is a behavior that is compatible with PyTorch.
This function first casts `actual` into the dtype of `desired`, then uses numpy's `assert_allclose(...)` for testing the closeness of the values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
actual | Iterable | An iterable of numbers. | required |
desired | Iterable | An iterable of numbers. These numbers represent the values that we expect `actual` to contain. If the numbers contained by `actual` are significantly different than `desired`, the assertion will fail. | required |
rtol | Optional[float] | Relative tolerance. Can be left as None if only `atol` is to be used. See the documentation of `numpy.testing.assert_allclose(...)` for details about how `rtol` affects the tolerance. | None |
atol | Optional[float] | Absolute tolerance. Can be left as None if only `rtol` is to be used. See the documentation of `numpy.testing.assert_allclose(...)` for details about how `atol` affects the tolerance. | None |
equal_nan | bool | If True, `nan` values will be counted as equal. | True |
Exceptions:
Type | Description |
---|---|
AssertionError | if the numerical difference between `actual` and `desired` is beyond the tolerance expressed by `atol` and `rtol`. |
TestingError | if both `rtol` and `atol` are given as None. |
Source code in evotorch/testing.py
def assert_allclose(
actual: Iterable,
desired: Iterable,
*,
rtol: Optional[float] = None,
atol: Optional[float] = None,
equal_nan: bool = True,
):
"""
This function is similar to `numpy.testing.assert_allclose(...)` except
that `atol` and `rtol` are keyword-only arguments (which encourages
one to be more explicit when writing tests) and that the default dtype
is "float32" when the provided arguments are neither numpy arrays nor
torch tensors. Having "float32" as the default target dtype is a behavior
that is compatible with PyTorch.
This function first casts `actual` into the dtype of `desired`, then
uses numpy's `assert_allclose(...)` for testing the closeness of the
values.
Args:
actual: An iterable of numbers.
desired: An iterable of numbers. These numbers represent the values
that we expect the `actual` to contain. If the numbers contained
by `actual` are significantly different than `desired`, the
assertion will fail.
rtol: Relative tolerance.
Can be left as None if only `atol` is to be used.
See the documentation of `numpy.testing.assert_allclose(...)`
for details about how `rtol` affects the tolerance.
atol: Absolute tolerance.
Can be left as None if only `rtol` is to be used.
See the documentation of `numpy.testing.assert_allclose(...)`
for details about how `atol` affects the tolerance.
equal_nan: If True, `nan` values will be counted as equal.
Raises:
        AssertionError: if the numerical difference between `actual`
            and `desired` is beyond the tolerance expressed by `atol`
            and `rtol`.
TestingError: If both `rtol` and `atol` are given as None.
"""
if rtol is None and atol is None:
raise TestingError(
"Both `rtol` and `atol` were found to be None. Please either specify `rtol`, `atol`, or both."
)
elif rtol is None:
rtol = 0.0
elif atol is None:
atol = 0.0
desired = _to_numpy(desired)
actual = _to_numpy(actual, dtype=desired.dtype)
np.testing.assert_allclose(actual, desired, rtol=rtol, atol=atol, equal_nan=bool(equal_nan))
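A minimal sketch of passing and failing cases:
import torch
from evotorch.testing import assert_allclose

assert_allclose(torch.tensor([1.0, 2.0]), [1.0, 2.0 + 1e-6], atol=1e-4)  # passes
# assert_allclose([1.0], [2.0], atol=1e-4)   -> AssertionError
# assert_allclose([1.0], [1.0])              -> TestingError (neither rtol nor atol given)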
assert_almost_between(x, lb, ub, *, atol=None)
¶
Assert that the given Iterable has its values between the desired bounds.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Iterable | An Iterable containing numeric (float) values. | required |
lb | Union[float, Iterable] | Lower bound for the desired interval. Can be a scalar or an iterable of values. | required |
ub | Union[float, Iterable] | Upper bound for the desired interval. Can be a scalar or an iterable of values. | required |
atol | Optional[float] | Absolute tolerance. If given, then the effective interval will be `[lb-atol; ub+atol]` instead of `[lb; ub]`. | None |
Exceptions:
Type | Description |
---|---|
AssertionError | if any element of `x` violates the boundaries. |
Source code in evotorch/testing.py
def assert_almost_between(
x: Iterable, lb: Union[float, Iterable], ub: Union[float, Iterable], *, atol: Optional[float] = None
):
"""
Assert that the given Iterable has its values between the desired bounds.
Args:
x: An Iterable containing numeric (float) values.
lb: Lower bound for the desired interval.
Can be a scalar or an iterable of values.
ub: Upper bound for the desired interval.
Can be a scalar or an iterable of values.
atol: Absolute tolerance. If given, then the effective interval will
be `[lb-atol; ub+atol]` instead of `[lb; ub]`.
Raises:
AssertionError: if any element of `x` violates the boundaries.
"""
x = _to_numpy(x)
lb = _to_numpy(lb)
ub = _to_numpy(ub)
if lb.shape != x.shape:
lb = np.broadcast_to(lb, x.shape)
if ub.shape != x.shape:
ub = np.broadcast_to(ub, x.shape)
lb = np.asarray(lb, dtype=x.dtype)
ub = np.asarray(ub, dtype=x.dtype)
if atol is not None:
atol = float(atol)
tolerant_lb = lb - atol
tolerant_ub = ub + atol
else:
tolerant_lb = lb
tolerant_ub = ub
    assert np.all((x >= tolerant_lb) & (x <= tolerant_ub)), (
        f"The provided array is not within the desired boundaries."
        f" Provided array: {x}. Lower bound: {lb}. Upper bound: {ub}. Absolute tolerance: {atol}."
    )
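For example, with an absolute tolerance the effective interval widens to `[lb-atol; ub+atol]`:
from evotorch.testing import assert_almost_between

assert_almost_between([0.5, 1.0, 1.05], lb=0.0, ub=1.0, atol=0.1)  # passes: 1.05 <= 1.0 + 0.1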
assert_dtype_matches(x, dtype)
¶
Assert that the dtype of `x` is compatible with the given `dtype`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Iterable | An object with a `dtype` attribute (e.g. a numpy array, a torch tensor, an ObjectArray, a SolutionVector, etc.). | required |
dtype | Union[str, Type, numpy.dtype, torch.dtype] | The dtype which `x` is expected to have. Can be given as a string, as a numpy dtype, as a torch dtype, or as a native type (e.g. int, float, bool, object). | required |
Exceptions:
Type | Description |
---|---|
AssertionError | if `x` has a different dtype. |
Source code in evotorch/testing.py
def assert_dtype_matches(x: Iterable, dtype: Union[str, Type, np.dtype, torch.dtype]):
"""
Assert that the dtype of `x` is compatible with the given `dtype`.
Args:
x: An object with `dtype` attribute (e.g. can be numpy array,
a torch tensor, an ObjectArray, a SolutionVector, etc.)
dtype: The dtype which `x` is expected to have.
Can be given as a string, as a numpy dtype, as a torch dtype,
or as a native type (e.g. int, float, bool, object).
Raises:
AssertionError: if `x` has a different dtype.
"""
actual_dtype = x.dtype
if isinstance(actual_dtype, torch.dtype):
actual_dtype = torch.tensor([], dtype=actual_dtype).numpy().dtype
else:
actual_dtype = np.dtype(actual_dtype)
if dtype == "Any" or dtype is Any:
dtype = np.dtype(object)
elif isinstance(dtype, torch.dtype):
dtype = torch.tensor([], dtype=dtype).numpy().dtype
else:
dtype = np.dtype(dtype)
assert dtype == actual_dtype, f"dtype mismatch. Encountered dtype: {actual_dtype}, expected dtype: {dtype}"
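A short sketch showing that numpy, torch, and string dtypes are treated as compatible representations:
import numpy as np
import torch
from evotorch.testing import assert_dtype_matches

assert_dtype_matches(np.zeros(3, dtype="float32"), torch.float32)  # passes
assert_dtype_matches(torch.zeros(3), "float32")  # passes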
assert_eachclose(x, value, *, rtol=None, atol=None)
¶
Assert that the given tensor or array consists of a single value.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Iterable | The tensor in which each value will be compared against `value`. | required |
value | Any | A scalar. | required |
rtol | Optional[float] | Relative tolerance, forwarded to `assert_allclose(...)`. | None |
atol | Optional[float] | Absolute tolerance, forwarded to `assert_allclose(...)`. | None |
Exceptions:
Type | Description |
---|---|
AssertionError | if at least one value is different enough. |
Source code in evotorch/testing.py
def assert_eachclose(x: Iterable, value: Any, *, rtol: Optional[float] = None, atol: Optional[float] = None):
"""
Assert that the given tensor or array consists of a single value.
Args:
x: The tensor in which each value will be compared against `value`
value: A scalar
Raises:
AssertionError: if at least one value is different enough
"""
# If the given scalar is not a Real, then try to cast it to float
if not isinstance(value, Real):
value = float(value)
x = _to_numpy(x)
desired = np.empty_like(x)
desired[:] = value
assert_allclose(x, desired, rtol=rtol, atol=atol)
assert_shape_matches(x, shape)
¶
Assert that the shape of `x` matches the given `shape`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Iterable | An object which can be converted to a PyTorch tensor. | required |
shape | Union[tuple, int] | A tuple, or a torch.Size, or an integer. | required |
Exceptions:
Type | Description |
---|---|
AssertionError | if there is a shape mismatch. |
Source code in evotorch/testing.py
def assert_shape_matches(x: Iterable, shape: Union[tuple, int]):
"""
    Assert that the shape of `x` matches the given `shape`.
Args:
x: An object which can be converted to a PyTorch tensor.
shape: A tuple, or a torch.Size, or an integer.
Raises:
AssertionError: if there is a shape mismatch.
"""
if isinstance(x, torch.Tensor):
pass # nothing to do
elif isinstance(x, np.ndarray):
x = torch.from_numpy(x)
else:
x = torch.tensor(x)
if not isinstance(shape, Iterable):
shape = (int(shape),)
assert x.shape == shape, f"Encountered a shape mismatch. Shape of the tensor: {x.shape}. Expected shape: {shape}"
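A short sketch:
import torch
from evotorch.testing import assert_shape_matches

assert_shape_matches(torch.zeros(4, 3), (4, 3))  # passes
assert_shape_matches([1.0, 2.0, 3.0], 3)  # an integer is treated as a 1-element shape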
tools
special
¶
This namespace contains various utility functions, classes, and type aliases.
hook
¶
This module contains the Hook class, which is used for event handling, and for defining additional behaviors to the class instances which own the Hook.
Hook (MutableSequence)
¶
A Hook stores a list of callable objects to be called for handling certain events. A Hook itself is callable, which invokes the callables stored in its list. If the callables stored by the Hook return list-like objects or dict-like objects, their returned results are accumulated, and then those accumulated results are finally returned by the Hook.
Source code in evotorch/tools/hook.py
class Hook(MutableSequence):
"""
A Hook stores a list of callable objects to be called for handling
certain events. A Hook itself is callable, which invokes the callables
stored in its list. If the callables stored by the Hook return list-like
objects or dict-like objects, their returned results are accumulated,
and then those accumulated results are finally returned by the Hook.
"""
def __init__(
self,
callables: Optional[Iterable[Callable]] = None,
*,
args: Optional[Iterable] = None,
kwargs: Optional[Mapping] = None,
):
"""
Initialize the Hook.
Args:
callables: A sequence of callables to be stored by the Hook.
args: Positional arguments which, when the Hook is called,
are to be passed to every callable stored by the Hook.
Please note that these positional arguments will be passed
as the leftmost arguments, and, the other positional
arguments passed via the `__call__(...)` method of the
Hook will be added to the right of these arguments.
kwargs: Keyword arguments which, when the Hook is called,
are to be passed to every callable stored by the Hook.
                Please note that these keyword arguments could be overridden
by the keyword arguments passed via the `__call__(...)`
method of the Hook.
"""
self._funcs: list = [] if callables is None else list(callables)
self._args: list = [] if args is None else list(args)
self._kwargs: dict = {} if kwargs is None else dict(kwargs)
def __call__(self, *args: Any, **kwargs: Any) -> Optional[Union[dict, list]]:
"""
Call every callable object stored by the Hook.
The results of the stored callable objects (which can be dict-like
or list-like objects) are accumulated and finally returned.
Args:
args: Additional positional arguments to be passed to the stored
callables.
            kwargs: Additional keyword arguments to be passed to the stored
                callables.
"""
all_args = []
all_args.extend(self._args)
all_args.extend(args)
all_kwargs = {}
all_kwargs.update(self._kwargs)
all_kwargs.update(kwargs)
result: Optional[Union[dict, list]] = None
for f in self._funcs:
tmp = f(*all_args, **all_kwargs)
if tmp is not None:
if isinstance(tmp, Mapping):
if result is None:
result = dict(tmp)
elif isinstance(result, list):
raise TypeError(
f"The function {f} returned a dict-like object."
f" However, previous function(s) in this hook had returned list-like object(s)."
f" Such incompatible results cannot be accumulated."
)
elif isinstance(result, dict):
result.update(tmp)
else:
raise RuntimeError
elif isinstance(tmp, Iterable):
if result is None:
result = list(tmp)
elif isinstance(result, list):
result.extend(tmp)
elif isinstance(result, dict):
raise TypeError(
f"The function {f} returned a list-like object."
f" However, previous function(s) in this hook had returned dict-like object(s)."
f" Such incompatible results cannot be accumulated."
)
else:
raise RuntimeError
else:
raise TypeError(
f"Expected the function {f} to return None, or a dict-like object, or a list-like object."
f" However, the function returned an object of type {repr(type(tmp))}."
)
return result
def accumulate_dict(self, *args: Any, **kwargs: Any) -> Optional[Union[dict, list]]:
result = self(*args, **kwargs)
if result is None:
return {}
elif isinstance(result, Mapping):
return result
else:
raise TypeError(
f"Expected the functions in this hook to accumulate"
f" dictionary-like objects. Instead, accumulated"
f" an object of type {type(result)}."
f" Hint: are the functions registered in this hook"
f" returning non-dictionary iterables?"
)
def accumulate_sequence(self, *args: Any, **kwargs: Any) -> Optional[Union[dict, list]]:
result = self(*args, **kwargs)
if result is None:
return []
elif isinstance(result, Mapping):
raise TypeError(
f"Expected the functions in this hook to accumulate"
f" sequences (that are NOT dictionaries). Instead, accumulated"
f" a dict-like object of type {type(result)}."
f" Hint: are the functions registered in this hook"
f" returning objects with Mapping interface?"
)
else:
return result
def _to_string(self) -> str:
init_args = [repr(self._funcs)]
if len(self._args) > 0:
init_args.append(f"args={self._args}")
if len(self._kwargs) > 0:
init_args.append(f"kwargs={self._kwargs}")
s_init_args = ", ".join(init_args)
return f"{type(self).__name__}({s_init_args})"
def __repr__(self) -> str:
return self._to_string()
def __str__(self) -> str:
return self._to_string()
def __getitem__(self, i: Union[int, slice]) -> Union[Callable, "Hook"]:
if isinstance(i, slice):
return Hook(self._funcs[i], args=self._args, kwargs=self._kwargs)
else:
return self._funcs[i]
def __setitem__(self, i: Union[int, slice], x: Iterable[Callable]):
self._funcs[i] = x
def __delitem__(self, i: Union[int, slice]):
del self._funcs[i]
def insert(self, i: int, x: Callable):
self._funcs.insert(i, x)
def __len__(self) -> int:
return len(self._funcs)
@property
def args(self) -> list:
"""Positional arguments that will be passed to the stored callables"""
return self._args
@property
def kwargs(self) -> dict:
"""Keyword arguments that will be passed to the stored callables"""
return self._kwargs
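To illustrate the accumulation behavior described above (a minimal sketch; the import path follows the "Source code in evotorch/tools/hook.py" note):
from evotorch.tools.hook import Hook

dict_hook = Hook([lambda: {"a": 1}, lambda: {"b": 2}])
print(dict_hook())  # {'a': 1, 'b': 2}: dict-like results are merged

list_hook = Hook([lambda: [1, 2], lambda: [3]])
print(list_hook())  # [1, 2, 3]: list-like results are concatenated
Mixing dict-like and list-like results within the same Hook raises a TypeError, as the source above shows.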
args: list
property
readonly
¶
Positional arguments that will be passed to the stored callables
kwargs: dict
property
readonly
¶
Keyword arguments that will be passed to the stored callables
__call__(self, *args, **kwargs)
special
¶
Call every callable object stored by the Hook. The results of the stored callable objects (which can be dict-like or list-like objects) are accumulated and finally returned.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
args | Any | Additional positional arguments to be passed to the stored callables. | () |
kwargs | Any | Additional keyword arguments to be passed to the stored callables. | {} |
Source code in evotorch/tools/hook.py
def __call__(self, *args: Any, **kwargs: Any) -> Optional[Union[dict, list]]:
"""
Call every callable object stored by the Hook.
The results of the stored callable objects (which can be dict-like
or list-like objects) are accumulated and finally returned.
Args:
args: Additional positional arguments to be passed to the stored
callables.
            kwargs: Additional keyword arguments to be passed to the stored
                callables.
"""
all_args = []
all_args.extend(self._args)
all_args.extend(args)
all_kwargs = {}
all_kwargs.update(self._kwargs)
all_kwargs.update(kwargs)
result: Optional[Union[dict, list]] = None
for f in self._funcs:
tmp = f(*all_args, **all_kwargs)
if tmp is not None:
if isinstance(tmp, Mapping):
if result is None:
result = dict(tmp)
elif isinstance(result, list):
raise TypeError(
f"The function {f} returned a dict-like object."
f" However, previous function(s) in this hook had returned list-like object(s)."
f" Such incompatible results cannot be accumulated."
)
elif isinstance(result, dict):
result.update(tmp)
else:
raise RuntimeError
elif isinstance(tmp, Iterable):
if result is None:
result = list(tmp)
elif isinstance(result, list):
result.extend(tmp)
elif isinstance(result, dict):
raise TypeError(
f"The function {f} returned a list-like object."
f" However, previous function(s) in this hook had returned dict-like object(s)."
f" Such incompatible results cannot be accumulated."
)
else:
raise RuntimeError
else:
raise TypeError(
f"Expected the function {f} to return None, or a dict-like object, or a list-like object."
f" However, the function returned an object of type {repr(type(tmp))}."
)
return result
__init__(self, callables=None, *, args=None, kwargs=None)
special
¶
Initialize the Hook.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
callables | Optional[Iterable[Callable]] | A sequence of callables to be stored by the Hook. | None |
args | Optional[Iterable] | Positional arguments which, when the Hook is called, are to be passed to every callable stored by the Hook. Please note that these positional arguments will be passed as the leftmost arguments, and the other positional arguments passed via the `__call__(...)` method of the Hook will be added to the right of these arguments. | None |
kwargs | Optional[collections.abc.Mapping] | Keyword arguments which, when the Hook is called, are to be passed to every callable stored by the Hook. Please note that these keyword arguments could be overridden by the keyword arguments passed via the `__call__(...)` method of the Hook. | None |
Source code in evotorch/tools/hook.py
def __init__(
self,
callables: Optional[Iterable[Callable]] = None,
*,
args: Optional[Iterable] = None,
kwargs: Optional[Mapping] = None,
):
"""
Initialize the Hook.
Args:
callables: A sequence of callables to be stored by the Hook.
args: Positional arguments which, when the Hook is called,
are to be passed to every callable stored by the Hook.
Please note that these positional arguments will be passed
as the leftmost arguments, and, the other positional
arguments passed via the `__call__(...)` method of the
Hook will be added to the right of these arguments.
kwargs: Keyword arguments which, when the Hook is called,
are to be passed to every callable stored by the Hook.
                Please note that these keyword arguments could be overridden
by the keyword arguments passed via the `__call__(...)`
method of the Hook.
"""
self._funcs: list = [] if callables is None else list(callables)
self._args: list = [] if args is None else list(args)
self._kwargs: dict = {} if kwargs is None else dict(kwargs)
insert(self, i, x)
¶
misc
¶
Miscellaneous utility functions
DTypeAndDevice (tuple)
¶
ErroneousResult
¶
Representation of a caught error being returned as a result.
Source code in evotorch/tools/misc.py
class ErroneousResult:
"""
Representation of a caught error being returned as a result.
"""
def __init__(self, error: Exception):
self.error = error
def _to_string(self) -> str:
return f"<{type(self).__name__}, error: {self.error}>"
def __str__(self) -> str:
return self._to_string()
def __repr__(self) -> str:
return self._to_string()
def __bool__(self) -> bool:
return False
@staticmethod
def call(f, *args, **kwargs) -> Any:
"""
Call a function with the given arguments.
If the function raises an error, wrap the error in an ErroneousResult
object, and return that ErroneousResult object instead.
Returns:
The result of the function if there was no error,
or an ErroneousResult if there was an error.
"""
try:
result = f(*args, **kwargs)
except Exception as ex:
result = ErroneousResult(ex)
return result
call(f, *args, **kwargs)
staticmethod
¶
Call a function with the given arguments. If the function raises an error, wrap the error in an ErroneousResult object, and return that ErroneousResult object instead.
Returns:
Type | Description |
---|---|
Any | The result of the function if there was no error, or an ErroneousResult if there was an error. |
Source code in evotorch/tools/misc.py
@staticmethod
def call(f, *args, **kwargs) -> Any:
"""
Call a function with the given arguments.
If the function raises an error, wrap the error in an ErroneousResult
object, and return that ErroneousResult object instead.
Returns:
The result of the function if there was no error,
or an ErroneousResult if there was an error.
"""
try:
result = f(*args, **kwargs)
except Exception as ex:
result = ErroneousResult(ex)
return result
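Because an ErroneousResult is falsy (its `__bool__` returns False), a caller can branch on the outcome directly; a small sketch:
result = ErroneousResult.call(int, "not a number")
if not result:
    print("call failed:", result.error)
Note that a successful call returning a falsy value (e.g. 0) would take the same branch, so `isinstance(result, ErroneousResult)` is the stricter check.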
as_tensor(x, *, dtype=None, device=None)
¶
Get the tensor counterpart of the given object `x`.
This function can be used to convert native Python objects to tensors:
my_tensor = as_tensor([1.0, 2.0, 3.0], dtype="float32")
One can also use this function to convert an existing tensor to another dtype:
my_new_tensor = as_tensor(my_tensor, dtype="float16")
This function can also be used for moving a tensor from one device to another:
my_gpu_tensor = as_tensor(my_tensor, device="cuda:0")
This function can also create ObjectArray instances when dtype is given as `object` or `Any` or "object" or "O":
my_objects = as_tensor([1, {"a": 3}], dtype=object)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Any | Any object to be converted to a tensor. | required |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | Optionally a string (e.g. "float32") or a PyTorch dtype (e.g. torch.float32) or, for creating an `ObjectArray`, "object" (as string) or `object` or `Any`. If `dtype` is not specified, the default behavior of `torch.as_tensor(...)` will be used, that is, dtype will be inferred from `x`. | None |
device | Union[str, torch.device] | The device in which the resulting tensor will be stored. | None |
Returns:
Type | Description |
---|---|
Iterable | The tensor counterpart of the given object `x`. |
Source code in evotorch/tools/misc.py
def as_tensor(x: Any, *, dtype: Optional[DType] = None, device: Optional[Device] = None) -> Iterable:
"""
Get the tensor counterpart of the given object `x`.
This function can be used to convert native Python objects to tensors:
my_tensor = as_tensor([1.0, 2.0, 3.0], dtype="float32")
One can also use this function to convert an existing tensor to another
dtype:
my_new_tensor = as_tensor(my_tensor, dtype="float16")
This function can also be used for moving a tensor from one device to
another:
my_gpu_tensor = as_tensor(my_tensor, device="cuda:0")
This function can also create ObjectArray instances when dtype is
given as `object` or `Any` or "object" or "O".
my_objects = as_tensor([1, {"a": 3}], dtype=object)
Args:
x: Any object to be converted to a tensor.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32) or, for creating an `ObjectArray`,
"object" (as string) or `object` or `Any`.
If `dtype` is not specified, the default behavior of
`torch.as_tensor(...)` will be used, that is, dtype will be
inferred from `x`.
device: The device in which the resulting tensor will be stored.
Returns:
The tensor counterpart of the given object `x`.
"""
from .objectarray import ObjectArray
    if (dtype is None) and isinstance(x, (torch.Tensor, ObjectArray)):
        if (device is None) or (str(device) == "cpu"):
            return x
        elif isinstance(x, ObjectArray):
            # Only an ObjectArray is restricted to the cpu; a torch.Tensor can simply be moved
            raise ValueError(
                f"An ObjectArray cannot be moved into a device other than 'cpu'." f" The received device is: {device}."
            )
        else:
            return torch.as_tensor(x, device=device)
    elif is_dtype_object(dtype):
        # Raise only when a device other than 'cpu' was explicitly requested
        if (device is not None) and (str(device) != "cpu"):
            raise ValueError(
                f"An ObjectArray cannot be created on a device other than 'cpu'." f" The received device is: {device}."
            )
if isinstance(x, ObjectArray):
return x
else:
x = list(x)
n = len(x)
result = ObjectArray(n)
result[:] = x
return result
else:
dtype = to_torch_dtype(dtype)
return torch.as_tensor(x, dtype=dtype, device=device)
clip_tensor(x, lb=None, ub=None, ensure_copy=True)
¶
Clip the values of a tensor with respect to the given bounds.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Tensor | The PyTorch tensor whose values will be clipped. | required |
lb | Union[float, Iterable] | Lower bounds, as a PyTorch tensor. Can be None if there are no lower bounds. | None |
ub | Union[float, Iterable] | Upper bounds, as a PyTorch tensor. Can be None if there are no upper bounds. | None |
ensure_copy | bool | If `ensure_copy` is True, the result will be a clipped copy of the original tensor. If `ensure_copy` is False, and both `lb` and `ub` are None, then there is nothing to do, so the result will be the original tensor itself, not a copy of it. | True |
Returns:
Type | Description |
---|---|
Tensor | The clipped tensor. |
Source code in evotorch/tools/misc.py
@torch.no_grad()
def clip_tensor(
x: torch.Tensor,
lb: Optional[Union[float, Iterable]] = None,
ub: Optional[Union[float, Iterable]] = None,
ensure_copy: bool = True,
) -> torch.Tensor:
"""
Clip the values of a tensor with respect to the given bounds.
Args:
x: The PyTorch tensor whose values will be clipped.
lb: Lower bounds, as a PyTorch tensor.
Can be None if there are no lower bounds.
ub: Upper bounds, as a PyTorch tensor.
            Can be None if there are no upper bounds.
ensure_copy: If `ensure_copy` is True, the result will be
a clipped copy of the original tensor.
If `ensure_copy` is False, and both `lb` and `ub`
are None, then there is nothing to do, so, the result
will be the original tensor itself, not a copy of it.
Returns:
The clipped tensor.
"""
result = x
if lb is not None:
lb = torch.as_tensor(lb, dtype=x.dtype, device=x.device)
result = torch.max(result, lb)
if ub is not None:
ub = torch.as_tensor(ub, dtype=x.dtype, device=x.device)
result = torch.min(result, ub)
if ensure_copy and result is x:
result = x.clone()
return result
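A short sketch:
import torch
from evotorch.tools.misc import clip_tensor

x = torch.tensor([-1.0, 0.5, 2.0])
print(clip_tensor(x, lb=0.0, ub=1.0))  # tensor([0.0000, 0.5000, 1.0000])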
clone(x, *, memo=None)
¶
Get a deep copy of the given object.
The cloning is done in no_grad mode.
Returns:
Type | Description |
---|---|
Any | The deep copy of the given object. |
device_of(x)
¶
Get the device of the given object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Any | The object whose device is being queried. The object can be a PyTorch tensor, or a PyTorch module (in which case the device of the first parameter tensor will be returned), or an ObjectArray (in which case the returned device will be the cpu device), or any object with the attribute `device`. | required |
Returns:
Type | Description |
---|---|
Union[str, torch.device] | The device of the given object. |
Source code in evotorch/tools/misc.py
def device_of(x: Any) -> Device:
"""
Get the device of the given object.
Args:
x: The object whose device is being queried.
The object can be a PyTorch tensor, or a PyTorch module
(in which case the device of the first parameter tensor
will be returned), or an ObjectArray (in which case
the returned device will be the cpu device), or any object
with the attribute `device`.
Returns:
The device of the given object.
"""
if isinstance(x, nn.Module):
result = None
for param in x.parameters():
result = param.device
break
if result is None:
raise ValueError(f"Cannot determine the device of the module {x}")
return result
else:
return x.device
device_of_container(container)
¶
Get the device of the given container.
It is assumed that the given container stores PyTorch tensors from which the device information will be extracted. If the container contains only basic types like int, float, string, bool, or None, or if the container is empty, then the returned device will be None. If the container contains unrecognized objects, an error will be raised.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
container | Any | A sequence or a dictionary of objects from which the device information will be extracted. | required |
Returns:
Type | Description |
---|---|
Optional[torch.device] | The device if available, None otherwise. |
Source code in evotorch/tools/misc.py
def device_of_container(container: Any) -> Optional[torch.device]:
"""
Get the device of the given container.
It is assumed that the given container stores PyTorch tensors from
which the device information will be extracted.
If the container contains only basic types like int, float, string,
bool, or None, or if the container is empty, then the returned device
will be None.
If the container contains unrecognized objects, an error will be
raised.
Args:
container: A sequence or a dictionary of objects from which the
device information will be extracted.
Returns:
The device if available, None otherwise.
"""
class result:
device: Optional[torch.device] = None
@classmethod
def update(cls, new_device: Optional[torch.device]):
if new_device is not None:
if cls.device is None:
cls.device = new_device
else:
if new_device != cls.device:
raise ValueError(f"Encountered tensors whose `device`s mismatch: {new_device}, {cls.device}")
if isinstance(container, torch.Tensor):
result.update(container.device)
elif (container is None) or isinstance(container, (Number, str, bytes, bool)):
pass
elif isinstance(container, Mapping):
for _, v in container.items():
result.update(device_of_container(v))
elif isinstance(container, Iterable):
for v in container:
result.update(device_of_container(v))
else:
raise TypeError(f"Encountered an object of unrecognized type: {type(container)}")
return result.device
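A short sketch with a nested container of cpu tensors:
import torch
from evotorch.tools.misc import device_of_container

d = device_of_container({"w": torch.zeros(3), "extra": [torch.ones(2), 1.0]})
print(d)  # cpu (plain numbers like 1.0 are ignored)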
dtype_of(x)
¶
Get the dtype of the given object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Any | The object whose dtype is being queried. The object can be a PyTorch tensor, or a PyTorch module (in which case the dtype of the first parameter tensor will be returned), or an ObjectArray (in which case the returned dtype will be `object`), or any object with the attribute `dtype`. | required |
Returns:
Type | Description |
---|---|
Union[str, torch.dtype, numpy.dtype, Type] | The dtype of the given object. |
Source code in evotorch/tools/misc.py
def dtype_of(x: Any) -> DType:
"""
Get the dtype of the given object.
Args:
x: The object whose dtype is being queried.
The object can be a PyTorch tensor, or a PyTorch module
(in which case the dtype of the first parameter tensor
will be returned), or an ObjectArray (in which case
the returned dtype will be `object`), or any object with
the attribute `dtype`.
Returns:
The dtype of the given object.
"""
if isinstance(x, nn.Module):
result = None
for param in x.parameters():
result = param.dtype
break
if result is None:
raise ValueError(f"Cannot determine the dtype of the module {x}")
return result
else:
return x.dtype
dtype_of_container(container)
¶
Get the dtype of the given container.
It is assumed that the given container stores PyTorch tensors from which the dtype information will be extracted. If the container contains only basic types like int, float, string, bool, or None, or if the container is empty, then the returned dtype will be None. If the container contains unrecognized objects, an error will be raised.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
container | Any | A sequence or a dictionary of objects from which the dtype information will be extracted. | required |
Returns:
Type | Description |
---|---|
Optional[torch.dtype] | The dtype if available, None otherwise. |
Source code in evotorch/tools/misc.py
def dtype_of_container(container: Any) -> Optional[torch.dtype]:
"""
Get the dtype of the given container.
It is assumed that the given container stores PyTorch tensors from
which the dtype information will be extracted.
If the container contains only basic types like int, float, string,
bool, or None, or if the container is empty, then the returned dtype
will be None.
If the container contains unrecognized objects, an error will be
raised.
Args:
container: A sequence or a dictionary of objects from which the
dtype information will be extracted.
Returns:
The dtype if available, None otherwise.
"""
class result:
dtype: Optional[torch.dtype] = None
@classmethod
def update(cls, new_dtype: Optional[torch.dtype]):
if new_dtype is not None:
if cls.dtype is None:
cls.dtype = new_dtype
else:
if new_dtype != cls.dtype:
raise ValueError(f"Encountered tensors whose `dtype`s mismatch: {new_dtype}, {cls.dtype}")
if isinstance(container, torch.Tensor):
result.update(container.dtype)
elif (container is None) or isinstance(container, (Number, str, bytes, bool)):
pass
elif isinstance(container, Mapping):
for _, v in container.items():
result.update(dtype_of_container(v))
elif isinstance(container, Iterable):
for v in container:
result.update(dtype_of_container(v))
else:
raise TypeError(f"Encountered an object of unrecognized type: {type(container)}")
return result.dtype
empty_tensor_like(source, *, shape=None, length=None, dtype=None, device=None)
¶
Make an empty tensor with attributes taken from a source tensor.
The source tensor can be a PyTorch tensor, or an ObjectArray.
Unlike `torch.empty_like(...)`, this function allows one to redefine the shape and/or length of the new empty tensor.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | Any | The source tensor whose shape, dtype, and device will be used by default for the new empty tensor. | required |
shape | Union[tuple, int] | If given as None (which is the default), then the shape of the source tensor will be used for the new empty tensor. If given as a tuple or a `torch.Size` instance, then the new empty tensor will be in this given shape instead. This argument cannot be used together with `length`. | None |
length | Optional[int] | If given as None (which is the default), then the length of the new empty tensor will be equal to the length of the source tensor (where length here means the size of the outermost dimension, i.e., what is returned by `len(...)`). If given as an integer, the length of the empty tensor will be this given length instead. This argument cannot be used together with `shape`. | None |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | If given as None, the dtype of the new empty tensor will be the dtype of the source tensor. If given as a `torch.dtype` instance, then the dtype of the tensor will be this given dtype instead. | None |
device | Union[str, torch.device] | If given as None, the device of the new empty tensor will be the device of the source tensor. If given as a `torch.device` instance, then the device of the tensor will be this given device instead. | None |
Returns:
Type | Description |
---|---|
Any | The new empty tensor. |
Source code in evotorch/tools/misc.py
def empty_tensor_like(
source: Any,
*,
shape: Optional[Union[tuple, int]] = None,
length: Optional[int] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
) -> Any:
"""
Make an empty tensor with attributes taken from a source tensor.
The source tensor can be a PyTorch tensor, or an ObjectArray.
Unlike `torch.empty_like(...)`, this function allows one to redefine the
shape and/or length of the new empty tensor.
Args:
source: The source tensor whose shape, dtype, and device will be used
by default for the new empty tensor.
shape: If given as None (which is the default), then the shape of the
source tensor will be used for the new empty tensor.
If given as a tuple or a `torch.Size` instance, then the new empty
tensor will be in this given shape instead.
This argument cannot be used together with `length`.
length: If given as None (which is the default), then the length of
the new empty tensor will be equal to the length of the source
tensor (where length here means the size of the outermost
dimension, i.e., what is returned by `len(...)`).
If given as an integer, the length of the empty tensor will be
this given length instead.
This argument cannot be used together with `shape`.
dtype: If given as None, the dtype of the new empty tensor will be
the dtype of the source tensor.
If given as a `torch.dtype` instance, then the dtype of the
tensor will be this given dtype instead.
device: If given as None, the device of the new empty tensor will be
the device of the source tensor.
If given as a `torch.device` instance, then the device of the
tensor will be this given device instead.
Returns:
The new empty tensor.
"""
from .objectarray import ObjectArray
if isinstance(source, ObjectArray):
if length is not None and shape is None:
n = int(length)
        elif shape is not None and length is None:
            if isinstance(shape, Iterable):
                if len(shape) != 1:
                    raise ValueError(
                        f"An ObjectArray must always be 1-dimensional."
                        f" Therefore, this given shape is incompatible: {shape}"
                    )
                n = int(shape[0])
            else:
                # `shape` was given as a single integer; without this branch, `n` would be left undefined
                n = int(shape)
elif length is None and shape is None:
n = len(source)
else:
raise ValueError("`length` and `shape` cannot be used together")
if device is not None:
if str(device) != "cpu":
raise ValueError(
f"An ObjectArray can only be allocated on cpu. However, the specified `device` is: {device}."
)
if dtype is not None:
if not is_dtype_object(dtype):
raise ValueError(
f"The dtype of an ObjectArray can only be `object`. However, the specified `dtype` is: {dtype}."
)
return ObjectArray(n)
elif isinstance(source, torch.Tensor):
if length is not None:
if shape is not None:
raise ValueError("`length` and `shape` cannot be used together")
if source.ndim == 0:
raise ValueError("`length` can only be used when the source tensor is at least 1-dimensional")
newshape = [int(length)]
newshape.extend(source.shape[1:])
shape = tuple(newshape)
if not ((dtype is None) or isinstance(dtype, torch.dtype)):
dtype = to_torch_dtype(dtype)
return torch.empty(
source.shape if shape is None else shape,
dtype=(source.dtype if dtype is None else dtype),
device=(source.device if device is None else device),
)
else:
raise TypeError(f"The source tensor is of an unrecognized type: {type(source)}")
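A short sketch, redefining only the length while inheriting dtype and device from the source:
import torch
from evotorch.tools.misc import empty_tensor_like

src = torch.zeros(5, 3)
t = empty_tensor_like(src, length=10)
print(t.shape)  # torch.Size([10, 3])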
ensure_ray()
¶
Ensure that the ray parallelization engine is initialized. If ray is already initialized, this function does nothing.
ensure_tensor_length_and_dtype(t, length, dtype, about=None, *, allow_scalar=False, device=None)
¶
Return the given sequence as a tensor while also confirming its length, dtype, and device. If the given object is already a tensor conforming to the desired length, dtype, and device, the object will be returned as it is (there will be no copying).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
t | Any | The tensor, or a sequence which is convertible to a tensor. | required |
length | int | The length to which the tensor is expected to conform. | required |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | The dtype to which the tensor is expected to conform. | required |
about | Optional[str] | The prefix for the error message. Can be left as None. | None |
allow_scalar | bool | Whether or not to accept scalars in addition to vectors of the desired length. If `allow_scalar` is False, then scalars will be converted to sequences of the desired length, the sequence containing the same scalar, repeated. If `allow_scalar` is True, then the scalar itself will be converted to a PyTorch scalar, and then returned. | False |
device | Union[str, torch.device] | The device in which the sequence is to be stored. If the given sequence is on a different device than the desired device, a copy on the correct device will be made. If device is None, the default behavior of `torch.tensor(...)` will be used, that is: if `t` is already a tensor, the result will be on the same device, otherwise, the result will be on the cpu. | None |
Returns:
Type | Description |
---|---|
 | The sequence whose correctness in terms of length, dtype, and device is ensured. |
Exceptions:
Type | Description |
---|---|
ValueError | if there is a length mismatch. |
Source code in evotorch/tools/misc.py
@torch.no_grad()
def ensure_tensor_length_and_dtype(
t: Any,
length: int,
dtype: DType,
about: Optional[str] = None,
*,
allow_scalar: bool = False,
device: Optional[Device] = None,
):
"""
Return the given sequence as a tensor while also confirming its
length, dtype, and device.
If the given object is already a tensor conforming to the desired
length, dtype, and device, the object will be returned as it is
(there will be no copying).
Args:
t: The tensor, or a sequence which is convertible to a tensor.
length: The length to which the tensor is expected to conform.
dtype: The dtype to which the tensor is expected to conform.
about: The prefix for the error message. Can be left as None.
allow_scalar: Whether or not to accept scalars in addition
to vector of the desired length.
If `allow_scalar` is False, then scalars will be converted
to sequences of the desired length. The sequence will contain
the same scalar, repeated.
If `allow_scalar` is True, then the scalar itself will be
converted to a PyTorch scalar, and then will be returned.
device: The device in which the sequence is to be stored.
If the given sequence is on a different device than the
desired device, a copy on the correct device will be made.
If device is None, the default behavior of `torch.tensor(...)`
will be used, that is: if `t` is already a tensor, the result
will be on the same device, otherwise, the result will be on
the cpu.
Returns:
The sequence whose correctness in terms of length, dtype, and
device is ensured.
Raises:
ValueError: if there is a length mismatch.
"""
device_args = {}
if device is not None:
device_args["device"] = device
t = as_tensor(t, dtype=dtype, **device_args)
if t.ndim == 0:
if allow_scalar:
return t
else:
return t.repeat(length)
else:
if t.ndim != 1 or len(t) != length:
if about is not None:
err_prefix = about + ": "
else:
err_prefix = ""
raise ValueError(
f"{err_prefix}Expected a 1-dimensional tensor of length {length}, but got a tensor with shape: {t.shape}"
)
return t
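A short sketch, where a scalar is broadcast to the desired length:
from evotorch.tools.misc import ensure_tensor_length_and_dtype

t = ensure_tensor_length_and_dtype(1.0, 5, "float32")
print(t)  # tensor([1., 1., 1., 1., 1.])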
expect_none(msg_prefix, **kwargs)
¶
Expect the values associated with the given keyword arguments to be None. If not, raise error.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
msg_prefix | str | Prefix of the error message. | required |
kwargs |  | Keyword arguments whose values are expected to be None. | {} |
Exceptions:
Type | Description |
---|---|
ValueError | if at least one of the keyword arguments has a value other than None. |
Source code in evotorch/tools/misc.py
def expect_none(msg_prefix: str, **kwargs):
"""
Expect the values associated with the given keyword arguments
to be None. If not, raise error.
Args:
msg_prefix: Prefix of the error message.
kwargs: Keyword arguments whose values are expected to be None.
Raises:
ValueError: if at least one of the keyword arguments has a value
other than None.
"""
for k, v in kwargs.items():
if v is not None:
raise ValueError(f"{msg_prefix}: expected `{k}` as None, however, it was found to be {repr(v)}")
is_bool(x)
¶
Return True if `x` represents a bool.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Any | An object whose type is being queried. | required |
Returns:
Type | Description |
---|---|
bool | True if `x` is a bool; False otherwise. |
Source code in evotorch/tools/misc.py
def is_bool(x: Any) -> bool:
"""
Return True if `x` represents a bool.
Args:
x: An object whose type is being queried.
Returns:
True if `x` is a bool; False otherwise.
"""
if isinstance(x, (bool, np.bool_)):
return True
elif isinstance(x, (torch.Tensor, np.ndarray)):
if x.ndim > 0:
return False
else:
return is_dtype_bool(x.dtype)
else:
return False
is_bool_vector(x)
¶
Return True if `x` is a vector consisting of bools.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Any | An object whose elements' types are to be queried. | required |
Returns:
Type | Description |
---|---|
 | True if the elements of `x` are bools; False otherwise. |
Source code in evotorch/tools/misc.py
def is_bool_vector(x: Any):
"""
Return True if `x` is a vector consisting of bools.
Args:
x: An object whose elements' types are to be queried.
Returns:
True if the elements of `x` are bools; False otherwise.
"""
if isinstance(x, (torch.Tensor, np.ndarray)):
if x.ndim != 1:
return False
else:
return is_dtype_bool(x.dtype)
elif isinstance(x, Iterable):
for item in x:
if not is_bool(item):
return False
return True
else:
return False
is_dtype_bool(t)
¶
Return True if the given dtype is a bool type.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
t | Union[str, torch.dtype, numpy.dtype, Type] | The dtype, which can be a dtype string, a numpy dtype, or a PyTorch dtype. | required |
Returns:
Type | Description |
---|---|
bool | True if t is a bool type; False otherwise. |
is_dtype_float(t)
¶
Return True if the given dtype is a float type.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
t | Union[str, torch.dtype, numpy.dtype, Type] | The dtype, which can be a dtype string, a numpy dtype, or a PyTorch dtype. | required |
Returns:
Type | Description |
---|---|
bool | True if t is a float type; False otherwise. |
is_dtype_integer(t)
¶
Return True if the given dtype is an integer type.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
t | Union[str, torch.dtype, numpy.dtype, Type] | The dtype, which can be a dtype string, a numpy dtype, or a PyTorch dtype. | required |
Returns:
Type | Description |
---|---|
bool | True if t is an integer type; False otherwise. |
Source code in evotorch/tools/misc.py
def is_dtype_integer(t: DType) -> bool:
"""
Return True if the given dtype is an integer type.
Args:
t: The dtype, which can be a dtype string, a numpy dtype,
or a PyTorch dtype.
Returns:
True if t is an integer type; False otherwise.
"""
t: np.dtype = to_numpy_dtype(t)
return t.kind.startswith("u") or t.kind.startswith("i")
is_dtype_object(dtype)
¶
Return True if the given dtype is `object` or `Any`.
Returns:
Type | Description |
---|---|
bool | True if the given dtype is `object` or `Any`; False otherwise. |
Source code in evotorch/tools/misc.py
def is_dtype_object(dtype: DType) -> bool:
"""
Return True if the given dtype is `object` or `Any`.
Returns:
True if the given dtype is `object` or `Any`; False otherwise.
"""
if isinstance(dtype, str):
return dtype in ("object", "Any", "O")
elif dtype is object or dtype is Any:
return True
else:
return False
is_dtype_real(t)
¶
Return True if the given dtype represents real numbers (i.e. if dtype is an integer type or is a float type).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
t | Union[str, torch.dtype, numpy.dtype, Type] | The dtype, which can be a dtype string, a numpy dtype, or a PyTorch dtype. | required |
Returns:
Type | Description |
---|---|
bool | True if t represents a real numbers type; False otherwise. |
Source code in evotorch/tools/misc.py
def is_dtype_real(t: DType) -> bool:
"""
Return True if the given dtype represents real numbers
(i.e. if dtype is an integer type or is a float type).
Args:
t: The dtype, which can be a dtype string, a numpy dtype,
or a PyTorch dtype.
Returns:
True if t represents a real numbers type; False otherwise.
"""
return is_dtype_float(t) or is_dtype_integer(t)
is_integer(x)
¶
Return True if `x` is an integer.
Note that this function does NOT consider booleans as integers.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Any | An object whose type is being queried. | required |
Returns:
Type | Description |
---|---|
bool | True if `x` is an integer; False otherwise. |
Source code in evotorch/tools/misc.py
def is_integer(x: Any) -> bool:
"""
Return True if `x` is an integer.
Note that this function does NOT consider booleans as integers.
Args:
x: An object whose type is being queried.
Returns:
True if `x` is an integer; False otherwise.
"""
if is_bool(x):
return False
elif isinstance(x, Integral):
return True
elif isinstance(x, (torch.Tensor, np.ndarray)):
if x.ndim > 0:
return False
else:
return is_dtype_integer(x.dtype)
else:
return False
is_integer_vector(x)
¶
Return True if `x` is a vector consisting of integers.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Any | An object whose elements' types are to be queried. | required |
Returns:
Type | Description |
---|---|
 | True if the elements of `x` are integers; False otherwise. |
Source code in evotorch/tools/misc.py
def is_integer_vector(x: Any):
"""
Return True if `x` is a vector consisting of integers.
Args:
x: An object whose elements' types are to be queried.
Returns:
True if the elements of `x` are integers; False otherwise.
"""
if isinstance(x, (torch.Tensor, np.ndarray)):
if x.ndim != 1:
return False
else:
return is_dtype_integer(x.dtype)
elif isinstance(x, Iterable):
for item in x:
if not is_integer(item):
return False
return True
else:
return False
is_real(x)
¶
Return True if `x` is a real number.
Note that this function does NOT consider booleans as real numbers.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Any | An object whose type is being queried. | required |
Returns:
Type | Description |
---|---|
bool | True if `x` is a real number; False otherwise. |
Source code in evotorch/tools/misc.py
def is_real(x: Any) -> bool:
"""
Return True if `x` is a real number.
Note that this function does NOT consider booleans as real numbers.
Args:
x: An object whose type is being queried.
Returns:
True if `x` is a real number; False otherwise.
"""
if is_bool(x):
return False
elif isinstance(x, Real):
return True
elif isinstance(x, (torch.Tensor, np.ndarray)):
if x.ndim > 0:
return False
else:
return is_dtype_real(x.dtype)
else:
return False
is_real_vector(x)
¶
Return True if `x` is a vector consisting of real numbers.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Any | An object whose elements' types are to be queried. | required |
Returns:
Type | Description |
---|---|
 | True if the elements of `x` are real numbers; False otherwise. |
Source code in evotorch/tools/misc.py
def is_real_vector(x: Any):
"""
Return True if `x` is a vector consisting of real numbers.
Args:
x: An object whose elements' types are to be queried.
Returns:
True if the elements of `x` are real numbers; False otherwise.
"""
if isinstance(x, (torch.Tensor, np.ndarray)):
if x.ndim != 1:
return False
else:
return is_dtype_real(x.dtype)
elif isinstance(x, Iterable):
for item in x:
if not is_real(item):
return False
return True
else:
return False
is_sequence(x)
¶
Return True if `x` is a sequence. Note that this function considers `str` and `bytes` as scalars, not as sequences.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Any | The object whose sequential nature is being queried. | required |
Returns:
Type | Description |
---|---|
bool | True if `x` is a sequence; False otherwise. |
Source code in evotorch/tools/misc.py
def is_sequence(x: Any) -> bool:
"""
Return True if `x` is a sequence.
Note that this function considers `str` and `bytes` as scalars,
not as sequences.
Args:
x: The object whose sequential nature is being queried.
Returns:
True if `x` is a sequence; False otherwise.
"""
if isinstance(x, (str, bytes)):
return False
elif isinstance(x, (np.ndarray, torch.Tensor)):
return x.ndim > 0
elif isinstance(x, Iterable):
return True
else:
return False
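A few quick checks of these predicates together (note that booleans are deliberately excluded from the integer and real categories):
import torch
from evotorch.tools.misc import is_integer, is_real_vector, is_sequence

print(is_integer(3), is_integer(True))  # True False
print(is_real_vector(torch.ones(4)))  # True
print(is_sequence("abc"), is_sequence([1, 2]))  # False True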
is_tensor_on_cpu(tensor)
¶
make_I(size=None, out=None, dtype=None, device=None)
¶
Make a new identity matrix (I), or change an existing tensor into one.
The following example creates a 3x3 identity matrix:
identity_matrix = make_I(3, dtype="float32")
The following example changes an already existing square matrix such that its values will store an identity matrix:
make_I(out=existing_tensor)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | Optional[int] | A single integer specifying the length of the target square matrix. In this context, "length" means both rowwise length and columnwise length, since the target is a square matrix. Note that, if the user wishes to fill an existing tensor with identity values, then `size` is expected to be left as None. | None |
out | Optional[torch.Tensor] | Optionally, the existing tensor whose values will be changed so that they represent an identity matrix. If an `out` tensor is given, then `size` is expected as None. | None |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | Optionally a string (e.g. "float32") or a PyTorch dtype (e.g. torch.float32). If `dtype` is not specified, the default choice of `torch.empty(...)` is used, that is, `torch.float32`. If an `out` tensor is specified, then `dtype` is expected as None. | None |
device | Union[str, torch.device] | The device in which the new tensor will be stored. If not specified, "cpu" will be used. If an `out` tensor is specified, then `device` is expected as None. | None |
Returns:
Type | Description |
---|---|
Tensor | The created or modified tensor after placing the I matrix values. |
Source code in evotorch/tools/misc.py
def make_I(
size: Optional[int] = None,
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
) -> torch.Tensor:
"""
Make a new identity matrix (I), or change an existing tensor into one.
The following example creates a 3x3 identity matrix:
identity_matrix = make_I(3, dtype="float32")
The following example changes an already existing square matrix such that
its values will store an identity matrix:
make_I(out=existing_tensor)
Args:
size: A single integer specifying the length of the target square
matrix. In this context, "length" means both rowwise length
and columnwise length, since the target is a square matrix.
Note that, if the user wishes to fill an existing tensor with
identity values, then `size` is expected to be left as None.
out: Optionally, the existing tensor whose values will be changed
so that they represent an identity matrix.
If an `out` tensor is given, then `size` is expected as None.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32).
If `dtype` is not specified, the default choice of
`torch.empty(...)` is used, that is, `torch.float32`.
If an `out` tensor is specified, then `dtype` is expected
as None.
device: The device in which the new tensor will be stored.
If not specified, "cpu" will be used.
If an `out` tensor is specified, then `device` is expected
as None.
Returns:
The created or modified tensor after placing the I matrix values
"""
if size is None:
if out is None:
raise ValueError(
" When the `size` argument is missing, `make_I(...)` expects an `out` tensor."
" However, the `out` argument was received as None."
)
size = tuple()
else:
n = int(size)
size = (n, n)
out = _out_tensor(*size, out=out, dtype=dtype, device=device)
out.zero_()
out.fill_diagonal_(1)
return out
make_empty(*size, dtype=None, device=None)
¶
Make an empty tensor.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | Union[int, torch.Size] | Shape of the empty tensor to be created, expected as multiple positional arguments of integers, or as a single positional argument containing a tuple of integers. Note that when the user wishes to create an `ObjectArray` (i.e. when `dtype` is given as `object`), then the size is expected as a single integer, or as a single-element tuple containing an integer (because `ObjectArray` can only be one-dimensional). | () |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | Optionally a string (e.g. "float32") or a PyTorch dtype (e.g. torch.float32) or, for creating an `ObjectArray`, "object" (as string) or `object` or `Any`. If `dtype` is not specified, the default choice of `torch.empty(...)` is used, that is, `torch.float32`. | None |
device | Union[str, torch.device] | The device in which the new empty tensor will be stored. If not specified, "cpu" will be used. | None |
Returns:
Type | Description |
---|---|
Iterable | The new empty tensor, which can be a PyTorch tensor or an `ObjectArray`. |
Source code in evotorch/tools/misc.py
def make_empty(
*size: Size,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
) -> Iterable:
"""
Make an empty tensor.
Args:
size: Shape of the empty tensor to be created.
expected as multiple positional arguments of integers,
or as a single positional argument containing a tuple of
integers.
Note that when the user wishes to create an `ObjectArray`
(i.e. when `dtype` is given as `object`), then the size
is expected as a single integer, or as a single-element
tuple containing an integer (because `ObjectArray` can only
be one-dimensional).
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32) or, for creating an `ObjectArray`,
"object" (as string) or `object` or `Any`.
If `dtype` is not specified, the default choice of
`torch.empty(...)` is used, that is, `torch.float32`.
device: The device in which the new empty tensor will be stored.
If not specified, "cpu" will be used.
Returns:
The new empty tensor, which can be a PyTorch tensor or an
`ObjectArray`.
"""
from .objectarray import ObjectArray
if (dtype is not None) and is_dtype_object(dtype):
if (device is None) or (str(device) == "cpu"):
if len(size) == 1:
size = size[0]
return ObjectArray(size)
        else:
            # This must be raised, not returned, for the error to take effect
            raise ValueError(
                f"Invalid device for ObjectArray: {repr(device)}. Note: an ObjectArray can only be stored on 'cpu'."
            )
else:
kwargs = {}
if dtype is not None:
kwargs["dtype"] = to_torch_dtype(dtype)
if device is not None:
kwargs["device"] = device
return torch.empty(*size, **kwargs)
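As a brief usage sketch (importing from `evotorch.tools.misc`, where the source above lives; the variable names are made up), the two branches of the function behave as follows:
```python
from evotorch.tools.misc import make_empty

# Uninitialized float32 tensor of shape (3, 5), via torch.empty(...):
floats = make_empty(3, 5)

# Uninitialized int64 tensor of shape (4,):
ints = make_empty(4, dtype="int64")

# One-dimensional ObjectArray of length 4, filled with None values:
objects = make_empty(4, dtype=object)
```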
make_gaussian(*size, center=None, stdev=None, symmetric=False, out=None, dtype=None, device=None, generator=None)
¶
Make a new or existing tensor filled by Gaussian distributed values. This function can work only with float dtypes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | Union[int, torch.Size] | Size of the new tensor to be filled with Gaussian distributed values. This can be given as multiple positional arguments, each such positional argument being an integer, or as a single positional argument of a tuple, the tuple containing multiple integers. Note that, if the user wishes to fill an existing tensor instead, then no positional argument is expected. | () |
center | Union[float, Iterable[float], torch.Tensor] | Center point (i.e. mean) of the Gaussian distribution. Can be a scalar, or a tensor. If not specified, the center point will be taken as 0. Note that, if one specifies `center`, then `stdev` is also expected to be explicitly specified. | None |
stdev | Union[float, Iterable[float], torch.Tensor] | Standard deviation for the Gaussian distributed values. Can be a scalar, or a tensor. If not specified, the standard deviation will be taken as 1. Note that, if one specifies `stdev`, then `center` is also expected to be explicitly specified. | None |
symmetric | bool | Whether or not the values should be sampled in a symmetric (i.e. antithetic) manner. The default is False. | False |
out | Optional[torch.Tensor] | Optionally, the tensor to be filled by Gaussian distributed values. If an `out` tensor is given, then no `size` argument is expected. | None |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | Optionally a string (e.g. "float32") or a PyTorch dtype (e.g. torch.float32). If `dtype` is not specified, the default choice of `torch.empty(...)` is used, that is, `torch.float32`. If an `out` tensor is specified, then `dtype` is expected as None. | None |
device | Union[str, torch.device] | The device in which the new tensor will be stored. If not specified, "cpu" will be used. If an `out` tensor is specified, then `device` is expected as None. | None |
generator | Any | Pseudo-random number generator to be used when sampling the values. Can be a `torch.Generator`, or an object with a `generator` attribute (such as `Problem`). If left as None, the global generator of PyTorch will be used. | None |
Returns:
Type | Description |
---|---|
Tensor | The created or modified tensor after placing the Gaussian distributed values. |
Source code in evotorch/tools/misc.py
def make_gaussian(
*size: Size,
center: Optional[RealOrVector] = None,
stdev: Optional[RealOrVector] = None,
symmetric: bool = False,
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
generator: Any = None,
) -> torch.Tensor:
"""
Make a new or existing tensor filled by Gaussian distributed values.
This function can work only with float dtypes.
Args:
size: Size of the new tensor to be filled with Gaussian distributed
values. This can be given as multiple positional arguments, each
such positional argument being an integer, or as a single
positional argument of a tuple, the tuple containing multiple
integers. Note that, if the user wishes to fill an existing
tensor instead, then no positional argument is expected.
center: Center point (i.e. mean) of the Gaussian distribution.
Can be a scalar, or a tensor.
If not specified, the center point will be taken as 0.
Note that, if one specifies `center`, then `stdev` is also
expected to be explicitly specified.
stdev: Standard deviation for the Gaussian distributed values.
Can be a scalar, or a tensor.
If not specified, the standard deviation will be taken as 1.
Note that, if one specifies `stdev`, then `center` is also
expected to be explicitly specified.
symmetric: Whether or not the values should be sampled in a
symmetric (i.e. antithetic) manner.
The default is False.
out: Optionally, the tensor to be filled by Gaussian distributed
values. If an `out` tensor is given, then no `size` argument is
expected.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32).
If `dtype` is not specified, the default choice of
`torch.empty(...)` is used, that is, `torch.float32`.
If an `out` tensor is specified, then `dtype` is expected
as None.
device: The device in which the new tensor will be stored.
If not specified, "cpu" will be used.
If an `out` tensor is specified, then `device` is expected
as None.
generator: Pseudo-random number generator to be used when sampling
the values. Can be a `torch.Generator`, or an object with
a `generator` attribute (such as `Problem`).
If left as None, the global generator of PyTorch will be used.
Returns:
The created or modified tensor after placing the Gaussian
distributed values.
"""
scalar_requested = _scalar_requested(*size)
if scalar_requested:
size = (1,)
out = _out_tensor(*size, out=out, dtype=dtype, device=device)
gen_kwargs = _generator_kwargs(generator)
if symmetric:
leftmost_dim = out.shape[0]
if (leftmost_dim % 2) != 0:
raise ValueError(
f"Symmetric sampling cannot be done if the leftmost dimension of the target tensor is odd."
f" The shape of the target tensor is: {repr(out.shape)}."
)
out[0::2, ...].normal_(**gen_kwargs)
out[1::2, ...] = out[0::2, ...]
out[1::2, ...] *= -1
else:
out.normal_(**gen_kwargs)
if (center is None) and (stdev is None):
pass # do nothing
elif (center is not None) and (stdev is not None):
stdev = torch.as_tensor(stdev, dtype=out.dtype, device=out.device)
out *= stdev
center = torch.as_tensor(center, dtype=out.dtype, device=out.device)
out += center
else:
raise ValueError(
f"Please either specify none of `stdev` and `center`, or both of them."
f" Currently, `center` is {center}"
f" and `stdev` is {stdev}."
)
if scalar_requested:
out = out[0]
return out
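A minimal sketch of the symmetric (antithetic) mode, importing from `evotorch.tools.misc` as above:
```python
import torch

from evotorch.tools.misc import make_gaussian

# The leftmost dimension must be even for symmetric sampling:
x = make_gaussian(6, 4, symmetric=True)

# Rows come in antithetic pairs: each odd-indexed row is the exact
# negation of the even-indexed row right before it.
assert torch.equal(x[1], -x[0])

# `center` and `stdev` must be given together; this samples from
# a Gaussian with mean 10 and standard deviation 0.5:
y = make_gaussian(6, 4, center=10.0, stdev=0.5)
```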
make_nan(*size, out=None, dtype=None, device=None)
¶
Make a new tensor filled with NaN, or fill an existing tensor with NaN.
The following example creates a float32 tensor filled with NaN values, of shape (3, 5):
nan_values = make_nan(3, 5, dtype="float32")
The following example fills an existing tensor with NaNs:
make_nan(out=existing_tensor)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | Union[int, torch.Size] | Size of the new tensor to be filled with NaNs. This can be given as multiple positional arguments, each such positional argument being an integer, or as a single positional argument of a tuple, the tuple containing multiple integers. Note that, if the user wishes to fill an existing tensor with NaN values, then no positional argument is expected. | () |
out | Optional[torch.Tensor] | Optionally, the tensor to be filled by NaN values. If an `out` tensor is given, then no `size` argument is expected. | None |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | Optionally a string (e.g. "float32") or a PyTorch dtype (e.g. torch.float32). If `dtype` is not specified, the default choice of `torch.empty(...)` is used, that is, `torch.float32`. If an `out` tensor is specified, then `dtype` is expected as None. | None |
device | Union[str, torch.device] | The device in which the new tensor will be stored. If not specified, "cpu" will be used. If an `out` tensor is specified, then `device` is expected as None. | None |
Returns:
Type | Description |
---|---|
Tensor | The created or modified tensor after placing NaN values. |
Source code in evotorch/tools/misc.py
def make_nan(
*size: Size,
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
) -> torch.Tensor:
"""
Make a new tensor filled with NaN, or fill an existing tensor with NaN.
The following example creates a float32 tensor filled with NaN values,
of shape (3, 5):
nan_values = make_nan(3, 5, dtype="float32")
The following example fills an existing tensor with NaNs.
make_nan(out=existing_tensor)
Args:
size: Size of the new tensor to be filled with NaNs.
This can be given as multiple positional arguments, each such
positional argument being an integer, or as a single positional
argument of a tuple, the tuple containing multiple integers.
Note that, if the user wishes to fill an existing tensor with
NaN values, then no positional argument is expected.
out: Optionally, the tensor to be filled by NaN values.
If an `out` tensor is given, then no `size` argument is expected.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32).
If `dtype` is not specified, the default choice of
`torch.empty(...)` is used, that is, `torch.float32`.
If an `out` tensor is specified, then `dtype` is expected
as None.
device: The device in which the new tensor will be stored.
If not specified, "cpu" will be used.
If an `out` tensor is specified, then `device` is expected
as None.
Returns:
The created or modified tensor after placing NaN values.
"""
if _scalar_requested(*size):
return _scalar_tensor(float("nan"), out=out, dtype=dtype, device=device)
else:
out = _out_tensor(*size, out=out, dtype=dtype, device=device)
out[:] = float("nan")
return out
make_ones(*size, out=None, dtype=None, device=None)
¶
Make a new tensor filled with 1, or fill an existing tensor with 1.
The following example creates a float32 tensor filled with 1 values, of shape (3, 5):
one_values = make_ones(3, 5, dtype="float32")
The following example fills an existing tensor with 1s:
make_ones(out=existing_tensor)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | Union[int, torch.Size] | Size of the new tensor to be filled with 1. This can be given as multiple positional arguments, each such positional argument being an integer, or as a single positional argument of a tuple, the tuple containing multiple integers. Note that, if the user wishes to fill an existing tensor with 1 values, then no positional argument is expected. | () |
out | Optional[torch.Tensor] | Optionally, the tensor to be filled by 1 values. If an `out` tensor is given, then no `size` argument is expected. | None |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | Optionally a string (e.g. "float32") or a PyTorch dtype (e.g. torch.float32). If `dtype` is not specified, the default choice of `torch.empty(...)` is used, that is, `torch.float32`. If an `out` tensor is specified, then `dtype` is expected as None. | None |
device | Union[str, torch.device] | The device in which the new tensor will be stored. If not specified, "cpu" will be used. If an `out` tensor is specified, then `device` is expected as None. | None |
Returns:
Type | Description |
---|---|
Tensor | The created or modified tensor after placing 1 values. |
Source code in evotorch/tools/misc.py
def make_ones(
*size: Size,
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
) -> torch.Tensor:
"""
Make a new tensor filled with 1, or fill an existing tensor with 1.
The following example creates a float32 tensor filled with 1 values,
of shape (3, 5):
one_values = make_ones(3, 5, dtype="float32")
The following example fills an existing tensor with 1s:
make_ones(out=existing_tensor)
Args:
size: Size of the new tensor to be filled with 1.
This can be given as multiple positional arguments, each such
positional argument being an integer, or as a single positional
argument of a tuple, the tuple containing multiple integers.
Note that, if the user wishes to fill an existing tensor with
1 values, then no positional argument is expected.
out: Optionally, the tensor to be filled by 1 values.
If an `out` tensor is given, then no `size` argument is expected.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32).
If `dtype` is not specified, the default choice of
`torch.empty(...)` is used, that is, `torch.float32`.
If an `out` tensor is specified, then `dtype` is expected
as None.
device: The device in which the new tensor will be stored.
If not specified, "cpu" will be used.
If an `out` tensor is specified, then `device` is expected
as None.
Returns:
The created or modified tensor after placing 1 values.
"""
if _scalar_requested(*size):
return _scalar_tensor(1, out=out, dtype=dtype, device=device)
else:
out = _out_tensor(*size, out=out, dtype=dtype, device=device)
out[:] = 1
return out
make_randint(*size, n, out=None, dtype=None, device=None, generator=None)
¶
Make a new or existing tensor filled by random integers.
The integers are uniformly distributed within `[0 ... n-1]`.
This function can be used with integer or float dtypes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | Union[int, torch.Size] | Size of the new tensor to be filled with uniformly distributed values. This can be given as multiple positional arguments, each such positional argument being an integer, or as a single positional argument of a tuple, the tuple containing multiple integers. Note that, if the user wishes to fill an existing tensor instead, then no positional argument is expected. | () |
n | Union[int, float, torch.Tensor] | Number of choice(s) for integer sampling. The lowest possible value will be 0, and the highest possible value will be n - 1. `n` can be a scalar, or a tensor. | required |
out | Optional[torch.Tensor] | Optionally, the tensor to be filled by the random integers. If an `out` tensor is given, then no `size` argument is expected. | None |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | Optionally a string (e.g. "int64") or a PyTorch dtype (e.g. torch.int64). If `dtype` is not specified, torch.int64 will be used. | None |
device | Union[str, torch.device] | The device in which the new tensor will be stored. If not specified, "cpu" will be used. If an `out` tensor is specified, then `device` is expected as None. | None |
generator | Any | Pseudo-random number generator to be used when sampling the values. Can be a `torch.Generator`, or an object with a `generator` attribute (such as `Problem`). If left as None, the global generator of PyTorch will be used. | None |
Returns:
Type | Description |
---|---|
Tensor | The created or modified tensor after placing the uniformly distributed values. |
Source code in evotorch/tools/misc.py
def make_randint(
*size: Size,
n: Union[int, float, torch.Tensor],
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
generator: Any = None,
) -> torch.Tensor:
"""
Make a new or existing tensor filled by random integers.
The integers are uniformly distributed within `[0 ... n-1]`.
This function can be used with integer or float dtypes.
Args:
size: Size of the new tensor to be filled with uniformly distributed
values. This can be given as multiple positional arguments, each
such positional argument being an integer, or as a single
positional argument of a tuple, the tuple containing multiple
integers. Note that, if the user wishes to fill an existing
tensor instead, then no positional argument is expected.
n: Number of choice(s) for integer sampling.
The lowest possible value will be 0, and the highest possible
value will be n - 1.
`n` can be a scalar, or a tensor.
out: Optionally, the tensor to be filled by the random integers.
If an `out` tensor is given, then no `size` argument is
expected.
dtype: Optionally a string (e.g. "int64") or a PyTorch dtype
(e.g. torch.int64).
If `dtype` is not specified, torch.int64 will be used.
device: The device in which the new tensor will be stored.
If not specified, "cpu" will be used.
If an `out` tensor is specified, then `device` is expected
as None.
generator: Pseudo-random number generator to be used when sampling
the values. Can be a `torch.Generator`, or an object with
a `generator` attribute (such as `Problem`).
If left as None, the global generator of PyTorch will be used.
Returns:
The created or modified tensor after placing the uniformly
distributed values.
"""
scalar_requested = _scalar_requested(*size)
if scalar_requested:
size = (1,)
if (dtype is None) and (out is None):
dtype = torch.int64
out = _out_tensor(*size, out=out, dtype=dtype, device=device)
gen_kwargs = _generator_kwargs(generator)
out.random_(**gen_kwargs)
out %= n
if scalar_requested:
out = out[0]
return out
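A brief usage sketch (importing from `evotorch.tools.misc`; the seed and the variable names are made up):
```python
import torch

from evotorch.tools.misc import make_randint

# Reproducible sampling via an explicit generator:
g = torch.Generator().manual_seed(42)

# int64 tensor of shape (10,) with values in {0, ..., 5}:
dice = make_randint(10, n=6, generator=g)

# Shift to {1, ..., 6} if one-based values are desired:
faces = dice + 1
```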
make_tensor(data, *, dtype=None, device=None, read_only=False)
¶
Make a new tensor.
This function can be used to create PyTorch tensors, or ObjectArray instances with or without read-only behavior.
The following example creates a 2-dimensional PyTorch tensor:
my_tensor = make_tensor(
[[1, 2], [3, 4]],
dtype="float32", # alternatively, torch.float32
device="cpu",
)
The following example creates an ObjectArray from a list that contains arbitrary data:
my_obj_tensor = make_tensor(["a_string", (1, 2)], dtype=object)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | Any | The data to be converted to a tensor. If one wishes to create a PyTorch tensor, this can be anything that can be stored by a PyTorch tensor. If one wishes to create an `ObjectArray` and therefore passes `dtype=object`, then the provided `data` is expected as an `Iterable`. | required |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | Optionally a string (e.g. "float32"), or a PyTorch dtype (e.g. torch.float32), or `object` or "object" (as a string) or `Any` if one wishes to create an `ObjectArray`. If `dtype` is not specified, it will be assumed that the user wishes to create a PyTorch tensor (not an `ObjectArray`) and then `dtype` will be inferred from the provided `data` (according to the default behavior of PyTorch). | None |
device | Union[str, torch.device] | The device in which the tensor will be stored. If `device` is not specified, it will be understood from the given `data` (according to the default behavior of PyTorch). | None |
read_only | bool | Whether or not the created tensor will be read-only. By default, this is False. | False |
Returns:
Type | Description |
---|---|
Iterable | A PyTorch tensor or an ObjectArray. |
Source code in evotorch/tools/misc.py
def make_tensor(
data: Any,
*,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
read_only: bool = False,
) -> Iterable:
"""
Make a new tensor.
This function can be used to create PyTorch tensors, or ObjectArray
instances with or without read-only behavior.
The following example creates a 2-dimensional PyTorch tensor:
my_tensor = make_tensor(
[[1, 2], [3, 4]],
dtype="float32", # alternatively, torch.float32
device="cpu",
)
The following example creates an ObjectArray from a list that contains
arbitrary data:
my_obj_tensor = make_tensor(["a_string", (1, 2)], dtype=object)
Args:
data: The data to be converted to a tensor.
If one wishes to create a PyTorch tensor, this can be anything
that can be stored by a PyTorch tensor.
If one wishes to create an `ObjectArray` and therefore passes
`dtype=object`, then the provided `data` is expected as an
`Iterable`.
dtype: Optionally a string (e.g. "float32"), or a PyTorch dtype
(e.g. torch.float32), or `object` or "object" (as a string)
or `Any` if one wishes to create an `ObjectArray`.
If `dtype` is not specified, it will be assumed that the user
wishes to create a PyTorch tensor (not an `ObjectArray`) and
then `dtype` will be inferred from the provided `data`
(according to the default behavior of PyTorch).
device: The device in which the tensor will be stored.
If `device` is not specified, it will be understood from the
given `data` (according to the default behavior of PyTorch).
read_only: Whether or not the created tensor will be read-only.
By default, this is False.
Returns:
A PyTorch tensor or an ObjectArray.
"""
from .objectarray import ObjectArray
from .readonlytensor import as_read_only_tensor
if (dtype is not None) and is_dtype_object(dtype):
data = list(data)
n = len(data)
result = ObjectArray(n)
result[:] = data
else:
kwargs = {}
if dtype is not None:
kwargs["dtype"] = to_torch_dtype(dtype)
if device is not None:
kwargs["device"] = device
result = torch.tensor(data, **kwargs)
if read_only:
result = as_read_only_tensor(result)
return result
make_uniform(*size, lb=None, ub=None, out=None, dtype=None, device=None, generator=None)
¶
Make a new or existing tensor filled by uniformly distributed values. Both lower and upper bounds are inclusive. This function can work with both float and int dtypes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | Union[int, torch.Size] | Size of the new tensor to be filled with uniformly distributed values. This can be given as multiple positional arguments, each such positional argument being an integer, or as a single positional argument of a tuple, the tuple containing multiple integers. Note that, if the user wishes to fill an existing tensor instead, then no positional argument is expected. | () |
lb | Union[float, Iterable[float], torch.Tensor] | Lower bound for the uniformly distributed values. Can be a scalar, or a tensor. If not specified, the lower bound will be taken as 0. Note that, if one specifies `lb`, then `ub` is also expected to be explicitly specified. | None |
ub | Union[float, Iterable[float], torch.Tensor] | Upper bound for the uniformly distributed values. Can be a scalar, or a tensor. If not specified, the upper bound will be taken as 1. Note that, if one specifies `ub`, then `lb` is also expected to be explicitly specified. | None |
out | Optional[torch.Tensor] | Optionally, the tensor to be filled by uniformly distributed values. If an `out` tensor is given, then no `size` argument is expected. | None |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | Optionally a string (e.g. "float32") or a PyTorch dtype (e.g. torch.float32). If `dtype` is not specified, the default choice of `torch.empty(...)` is used, that is, `torch.float32`. If an `out` tensor is specified, then `dtype` is expected as None. | None |
device | Union[str, torch.device] | The device in which the new tensor will be stored. If not specified, "cpu" will be used. If an `out` tensor is specified, then `device` is expected as None. | None |
generator | Any | Pseudo-random number generator to be used when sampling the values. Can be a `torch.Generator`, or an object with a `generator` attribute (such as `Problem`). If left as None, the global generator of PyTorch will be used. | None |
Returns:
Type | Description |
---|---|
Tensor | The created or modified tensor after placing the uniformly distributed values. |
Source code in evotorch/tools/misc.py
def make_uniform(
*size: Size,
lb: Optional[RealOrVector] = None,
ub: Optional[RealOrVector] = None,
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
generator: Any = None,
) -> torch.Tensor:
"""
Make a new or existing tensor filled by uniformly distributed values.
Both lower and upper bounds are inclusive.
This function can work with both float and int dtypes.
Args:
size: Size of the new tensor to be filled with uniformly distributed
values. This can be given as multiple positional arguments, each
such positional argument being an integer, or as a single
positional argument of a tuple, the tuple containing multiple
integers. Note that, if the user wishes to fill an existing
tensor instead, then no positional argument is expected.
lb: Lower bound for the uniformly distributed values.
Can be a scalar, or a tensor.
If not specified, the lower bound will be taken as 0.
Note that, if one specifies `lb`, then `ub` is also expected to
be explicitly specified.
ub: Upper bound for the uniformly distributed values.
Can be a scalar, or a tensor.
If not specified, the upper bound will be taken as 1.
Note that, if one specifies `ub`, then `lb` is also expected to
be explicitly specified.
out: Optionally, the tensor to be filled by uniformly distributed
values. If an `out` tensor is given, then no `size` argument is
expected.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32).
If `dtype` is not specified, the default choice of
`torch.empty(...)` is used, that is, `torch.float32`.
If an `out` tensor is specified, then `dtype` is expected
as None.
device: The device in which the new tensor will be stored.
If not specified, "cpu" will be used.
If an `out` tensor is specified, then `device` is expected
as None.
generator: Pseudo-random number generator to be used when sampling
the values. Can be a `torch.Generator`, or an object with
a `generator` attribute (such as `Problem`).
If left as None, the global generator of PyTorch will be used.
Returns:
The created or modified tensor after placing the uniformly
distributed values.
"""
scalar_requested = _scalar_requested(*size)
if scalar_requested:
size = (1,)
def _invalid_bound_args():
raise ValueError(
f"Expected both `lb` and `ub` as None, or both `lb` and `ub` as not None."
f" It appears that one of them is None, while the other is not."
f" lb: {repr(lb)}."
f" ub: {repr(ub)}."
)
out = _out_tensor(*size, out=out, dtype=dtype, device=device)
gen_kwargs = _generator_kwargs(generator)
def _cast_bounds():
nonlocal lb, ub
lb = torch.as_tensor(lb, dtype=out.dtype, device=out.device)
ub = torch.as_tensor(ub, dtype=out.dtype, device=out.device)
if out.dtype in (torch.uint8, torch.int8, torch.int16, torch.int32, torch.int64):
out.random_(**gen_kwargs)
if (lb is None) and (ub is None):
out %= 2
elif (lb is not None) and (ub is not None):
_cast_bounds()
diff = (ub - lb) + 1
out -= lb
out %= diff
out += lb
else:
_invalid_bound_args()
else:
out.uniform_(**gen_kwargs)
if (lb is None) and (ub is None):
pass # nothing to do
elif (lb is not None) and (ub is not None):
_cast_bounds()
diff = ub - lb
out *= diff
out += lb
else:
_invalid_bound_args()
if scalar_requested:
out = out[0]
return out
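A small sketch of the two dtype regimes, importing from `evotorch.tools.misc` (the names are made up):
```python
from evotorch.tools.misc import make_uniform

# Float dtype: values drawn uniformly from [-1, 1]:
u = make_uniform(1000, lb=-1.0, ub=1.0)

# Integer dtype: both bounds are inclusive, so these values
# are drawn from the set {3, 4, 5}:
k = make_uniform(1000, lb=3, ub=5, dtype="int64")
assert int(k.min()) >= 3 and int(k.max()) <= 5
```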
make_zeros(*size, out=None, dtype=None, device=None)
¶
Make a new tensor filled with 0, or fill an existing tensor with 0.
The following example creates a float32 tensor filled with 0 values, of shape (3, 5):
zero_values = make_zeros(3, 5, dtype="float32")
The following example fills an existing tensor with 0s:
make_zeros(out=existing_tensor)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | Union[int, torch.Size] | Size of the new tensor to be filled with 0. This can be given as multiple positional arguments, each such positional argument being an integer, or as a single positional argument of a tuple, the tuple containing multiple integers. Note that, if the user wishes to fill an existing tensor with 0 values, then no positional argument is expected. | () |
out | Optional[torch.Tensor] | Optionally, the tensor to be filled by 0 values. If an `out` tensor is given, then no `size` argument is expected. | None |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | Optionally a string (e.g. "float32") or a PyTorch dtype (e.g. torch.float32). If `dtype` is not specified, the default choice of `torch.empty(...)` is used, that is, `torch.float32`. If an `out` tensor is specified, then `dtype` is expected as None. | None |
device | Union[str, torch.device] | The device in which the new tensor will be stored. If not specified, "cpu" will be used. If an `out` tensor is specified, then `device` is expected as None. | None |
Returns:
Type | Description |
---|---|
Tensor | The created or modified tensor after placing 0 values. |
Source code in evotorch/tools/misc.py
def make_zeros(
*size: Size,
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
) -> torch.Tensor:
"""
Make a new tensor filled with 0, or fill an existing tensor with 0.
The following example creates a float32 tensor filled with 0 values,
of shape (3, 5):
zero_values = make_zeros(3, 5, dtype="float32")
The following example fills an existing tensor with 0s:
make_zeros(out=existing_tensor)
Args:
size: Size of the new tensor to be filled with 0.
This can be given as multiple positional arguments, each such
positional argument being an integer, or as a single positional
argument of a tuple, the tuple containing multiple integers.
Note that, if the user wishes to fill an existing tensor with
0 values, then no positional argument is expected.
out: Optionally, the tensor to be filled by 0 values.
If an `out` tensor is given, then no `size` argument is expected.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32).
If `dtype` is not specified, the default choice of
`torch.empty(...)` is used, that is, `torch.float32`.
If an `out` tensor is specified, then `dtype` is expected
as None.
device: The device in which the new tensor will be stored.
If not specified, "cpu" will be used.
If an `out` tensor is specified, then `device` is expected
as None.
Returns:
The created or modified tensor after placing 0 values.
"""
if _scalar_requested(*size):
return _scalar_tensor(0, out=out, dtype=dtype, device=device)
else:
out = _out_tensor(*size, out=out, dtype=dtype, device=device)
out.zero_()
return out
modify_tensor(original, target, lb=None, ub=None, max_change=None, in_place=False)
¶
Return the modified version of the original tensor, with bounds checking.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
original | Tensor | The original tensor. | required |
target | Tensor | The target tensor which contains the values to replace the old ones in the original tensor. | required |
lb | Union[float, torch.Tensor] | The lower bound(s), as a scalar or as a tensor. Values below these bounds are clipped in the resulting tensor. None means -inf. | None |
ub | Union[float, torch.Tensor] | The upper bound(s), as a scalar or as a tensor. Values above these bounds are clipped in the resulting tensor. None means +inf. | None |
max_change | Union[float, torch.Tensor] | The ratio of allowed change. In more detail, when given as a real number r, modifications are allowed only within `[original-(r*abs(original)) ... original+(r*abs(original))]`. Modifications beyond this interval are clipped. This argument can also be left as None if no such limitation is needed. | None |
in_place | bool | Provide this as True if you wish the modification to be done within the original tensor. The default value of this argument is False, which means, the original tensor is not changed, and its modified version is returned as an independent copy. | False |
Returns:
Type | Description |
---|---|
Tensor | The modified tensor. |
Source code in evotorch/tools/misc.py
@torch.no_grad()
def modify_tensor(
original: torch.Tensor,
target: torch.Tensor,
lb: Optional[Union[float, torch.Tensor]] = None,
ub: Optional[Union[float, torch.Tensor]] = None,
max_change: Optional[Union[float, torch.Tensor]] = None,
in_place: bool = False,
) -> torch.Tensor:
"""Return the modified version of the original tensor, with bounds checking.
Args:
original: The original tensor.
target: The target tensor which contains the values to replace the
old ones in the original tensor.
lb: The lower bound(s), as a scalar or as a tensor.
Values below these bounds are clipped in the resulting tensor.
None means -inf.
ub: The upper bound(s), as a scalar or as a tensor.
Values above these bounds are clipped in the resulting tensor.
None means +inf.
max_change: The ratio of allowed change.
In more detail, when given as a real number r,
modifications are allowed only within
``[original-(r*abs(original)) ... original+(r*abs(original))]``.
Modifications beyond this interval are clipped.
This argument can also be left as None if no such limitation
is needed.
in_place: Provide this as True if you wish the modification to be
done within the original tensor. The default value of this
argument is False, which means, the original tensor is not
changed, and its modified version is returned as an independent
copy.
Returns:
The modified tensor.
"""
if (lb is None) and (ub is None) and (max_change is None):
# If there is no restriction regarding how the tensor
# should be modified (no lb, no ub, no max_change),
# then we simply use the target values
# themselves for modifying the tensor.
if in_place:
original[:] = target
return original
else:
return target
else:
# If there are some restriction regarding how the tensor
# should be modified, then we turn to the following
# operations
def convert_to_tensor(x, tensorname: str):
if isinstance(x, torch.Tensor):
converted = x
else:
converted = torch.as_tensor(x, dtype=original.dtype, device=original.device)
if converted.ndim == 0 or converted.shape == original.shape:
return converted
else:
raise IndexError(
f"Argument {tensorname}: shape mismatch."
f" Shape of the original tensor: {original.shape}."
f" Shape of {tensorname}: {converted.shape}."
)
if lb is None:
# If lb is None, then it should be taken as -inf
lb = convert_to_tensor(float("-inf"), "lb")
else:
lb = convert_to_tensor(lb, "lb")
if ub is None:
# If ub is None, then it should be taken as +inf
ub = convert_to_tensor(float("inf"), "ub")
else:
ub = convert_to_tensor(ub, "ub")
if max_change is not None:
# If max_change is provided as something other than None,
# then we update the lb and ub so that they are tight
# enough to satisfy the max_change restriction.
max_change = convert_to_tensor(max_change, "max_change")
allowed_amounts = torch.abs(original) * max_change
allowed_lb = original - allowed_amounts
allowed_ub = original + allowed_amounts
lb = torch.max(lb, allowed_lb)
ub = torch.min(ub, allowed_ub)
## If in_place is given as True, the clipping (that we are about
## to perform), should be in-place.
# more_config = {}
# if in_place:
# more_config['out'] = original
#
## Return the clipped version of the target values
# return torch.clamp(target, lb, ub, **more_config)
result = torch.max(target, lb)
result = torch.min(result, ub)
if in_place:
original[:] = result
return original
else:
return result
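A worked sketch of the max_change clipping (the numbers are made up; imports from `evotorch.tools.misc` as above):
```python
import torch

from evotorch.tools.misc import modify_tensor

original = torch.tensor([1.0, -2.0, 4.0])
target = torch.tensor([10.0, -2.1, 3.0])

# With max_change=0.5, each element may move by at most 50% of its
# original magnitude, so the allowed intervals are, elementwise:
# [0.5, 1.5], [-3.0, -1.0], and [2.0, 6.0].
result = modify_tensor(original, target, max_change=0.5)

# 10.0 is clipped down to 1.5; the other targets were already in range:
assert torch.allclose(result, torch.tensor([1.5, -2.1, 3.0]))
```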
numpy_copy(x, dtype)
¶
Return a numpy copy of the given iterable.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Iterable | Any Iterable whose numpy copy will be returned. | required |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | The desired dtype. Can be given as a numpy dtype, as a torch dtype, or a native dtype (e.g. int, float), or as a string (e.g. "float32"). | required |
Returns:
Type | Description |
---|---|
ndarray | The numpy copy of the original iterable object. |
Source code in evotorch/tools/misc.py
def numpy_copy(x: Iterable, dtype: DType) -> np.ndarray:
"""
Return a numpy copy of the given iterable.
Args:
x: Any Iterable whose numpy copy will be returned.
dtype: The desired dtype. Can be given as a numpy dtype,
as a torch dtype, or a native dtype (e.g. int, float),
or as a string (e.g. "float32").
Returns:
The numpy copy of the original iterable object.
"""
dtype = to_numpy_dtype(dtype)
if isinstance(x, torch.Tensor):
x = x.cpu()
return np.array(x, dtype=dtype)
split_workload(workload, num_actors)
¶
Split a workload among actors.
By "workload" what is meant is the total amount of work, expressed as an integer. For example, if the "work" is the evaluation of a population, the "workload" would usually be the population size.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
workload | int | Total amount of work, as an integer. | required |
num_actors | int | Number of actors (i.e. remote workers) among which the workload will be distributed. | required |
Returns:
Type | Description |
---|---|
list | A list of integers. The i-th item of the returned list expresses the suggested workload for the i-th actor. |
Source code in evotorch/tools/misc.py
def split_workload(workload: int, num_actors: int) -> list:
"""
Split a workload among actors.
By "workload" what is meant is the total amount of a work,
this amount being expressed by an integer.
For example, if the "work" is the evaluation of a population,
the "workload" would usually be the population size.
Args:
workload: Total amount of work, as an integer.
num_actors: Number of actors (i.e. remote workers) among
which the workload will be distributed.
Returns:
A list of integers. The i-th item of the returned list
expresses the suggested workload for the i-th actor.
"""
base_workload = workload // num_actors
extra_workload = workload % num_actors
result = [base_workload] * num_actors
for i in range(extra_workload):
result[i] += 1
return result
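For instance, splitting 10 units of work among 3 actors gives each actor 10 // 3 == 3 units, with the leftover unit going to the first actor (a quick check, importing from `evotorch.tools.misc`):
```python
from evotorch.tools.misc import split_workload

# The remainder of 10 % 3 == 1 is assigned to actor 0:
assert split_workload(10, 3) == [4, 3, 3]
```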
stdev_from_radius(radius, solution_length)
¶
Get elementwise standard deviation from a given radius.
Sometimes, for a distribution-based search algorithm, the user might
choose to configure the initial coverage area of the search distribution
not via standard deviation, but via a radius value, as was done in the
study of Toklu et al. (2020).
This function takes the desired radius value and the solution length of
the problem at hand, and returns the elementwise standard deviation value.
Let us name this returned standard deviation value as `s`.
When a new Gaussian distribution is constructed such that its initial
standard deviation is `[s, s, s, ...]` (the length of this vector being
equal to the solution length), this constructed distribution's radius
corresponds with the desired radius.
Here, the "radius" of a Gaussian distribution is defined as the norm
of the standard deviation vector. In the case of a standard normal
distribution, this radius formulation serves as a simplified approximation
to `E[||Normal(0, I)||]` (for which a closer approximation is used in
the study of Hansen & Ostermeier (2001)).
Reference:
Toklu, N.E., Liskowski, P., Srivastava, R.K. (2020).
ClipUp: A Simple and Powerful Optimizer
for Distribution-based Policy Evolution.
Parallel Problem Solving from Nature (PPSN 2020).
Nikolaus Hansen, Andreas Ostermeier (2001).
Completely Derandomized Self-Adaptation in Evolution Strategies.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
radius | float | The radius whose elementwise standard deviation counterpart will be returned. | required |
solution_length | int | Length of a solution for the problem at hand. | required |
Returns:
Type | Description |
---|---|
float | An elementwise standard deviation value `s`, such that a Gaussian distribution constructed with the standard deviation `[s, s, s, ...]` has the desired radius. |
Source code in evotorch/tools/misc.py
def stdev_from_radius(radius: float, solution_length: int) -> float:
"""
Get elementwise standard deviation from a given radius.
Sometimes, for a distribution-based search algorithm, the user might
choose to configure the initial coverage area of the search distribution
not via standard deviation, but via a radius value, as was done in the
study of Toklu et al. (2020).
This function takes the desired radius value and the solution length of
the problem at hand, and returns the elementwise standard deviation value.
Let us name this returned standard deviation value as `s`.
When a new Gaussian distribution is constructed such that its initial
standard deviation is `[s, s, s, ...]` (the length of this vector being
equal to the solution length), this constructed distribution's radius
corresponds with the desired radius.
Here, the "radius" of a Gaussian distribution is defined as the norm
of the standard deviation vector. In the case of a standard normal
distribution, this radius formulation serves as a simplified approximation
to `E[||Normal(0, I)||]` (for which a closer approximation is used in
the study of Hansen & Ostermeier (2001)).
Reference:
Toklu, N.E., Liskowski, P., Srivastava, R.K. (2020).
ClipUp: A Simple and Powerful Optimizer
for Distribution-based Policy Evolution.
Parallel Problem Solving from Nature (PPSN 2020).
Nikolaus Hansen, Andreas Ostermeier (2001).
Completely Derandomized Self-Adaptation in Evolution Strategies.
Args:
radius: The radius whose elementwise standard deviation counterpart
will be returned.
solution_length: Length of a solution for the problem at hand.
Returns:
An elementwise standard deviation value `s`, such that a Gaussian
distribution constructed with the standard deviation `[s, s, s, ...]`
has the desired radius.
"""
radius = float(radius)
solution_length = int(solution_length)
return math.sqrt((radius**2) / solution_length)
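As a concrete check of the formula s = sqrt(radius**2 / solution_length): a radius of 6.0 on a 9-dimensional problem yields s = 2.0, and the vector `[2.0] * 9` indeed has Euclidean norm 6.0:
```python
import math

from evotorch.tools.misc import stdev_from_radius

s = stdev_from_radius(6.0, 9)
assert s == 2.0

# The norm of the stdev vector [s, s, ..., s] recovers the radius:
assert math.isclose(math.sqrt(9 * (s**2)), 6.0)
```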
to_numpy_dtype(dtype)
¶
Convert the given string or the given PyTorch dtype to a numpy dtype. If the argument is already a numpy dtype, then the argument is returned as it is.
Returns:
Type | Description |
---|---|
dtype | The dtype, converted to a numpy dtype. |
Source code in evotorch/tools/misc.py
def to_numpy_dtype(dtype: DType) -> np.dtype:
"""
Convert the given string or the given PyTorch dtype to a numpy dtype.
If the argument is already a numpy dtype, then the argument is returned
as it is.
Returns:
The dtype, converted to a numpy dtype.
"""
if isinstance(dtype, torch.dtype):
return torch.tensor([], dtype=dtype).numpy().dtype
elif is_dtype_object(dtype):
return np.dtype(object)
elif isinstance(dtype, np.dtype):
return dtype
else:
return np.dtype(dtype)
to_stdev_init(*, solution_length, stdev_init=None, radius_init=None)
¶
Ask for both standard deviation and radius, return the standard deviation.
It is very common among the distribution-based search algorithms to ask for both standard deviation and for radius for initializing the coverage area of the search distribution. During their initialization phases, these algorithms must check which one the user provided (radius or standard deviation), and return the result as the standard deviation so that a Gaussian distribution can easily be constructed.
This function serves as a helper function for such search algorithms by performing these actions:
- If the user provided a standard deviation and not a radius, then this provided standard deviation is simply returned.
- If the user provided a radius and not a standard deviation, then this provided radius is converted to its standard deviation counterpart, and then returned.
- If both standard deviation and radius are missing, or they are both given at the same time, then an error is raised.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
solution_length | int | Length of a solution for the problem at hand. | required |
stdev_init | Union[float, Iterable[float], torch.Tensor] | Standard deviation. If one wishes to provide a radius instead, then `stdev_init` is expected as None. | None |
radius_init | Union[float, Iterable[float], torch.Tensor] | Radius. If one wishes to provide a standard deviation instead, then `radius_init` is expected as None. | None |
Returns:
Type | Description |
---|---|
Union[float, Iterable[float], torch.Tensor] | The standard deviation for the search distribution to be constructed. |
Source code in evotorch/tools/misc.py
def to_stdev_init(
*,
solution_length: int,
stdev_init: Optional[RealOrVector] = None,
radius_init: Optional[RealOrVector] = None,
) -> RealOrVector:
"""
Ask for both standard deviation and radius, return the standard deviation.
It is very common among the distribution-based search algorithms to ask
for both standard deviation and for radius for initializing the coverage
area of the search distribution. During their initialization phases,
these algorithms must check which one the user provided (radius or
standard deviation), and return the result as the standard deviation
so that a Gaussian distribution can easily be constructed.
This function serves as a helper function for such search algorithms
by performing these actions:
- If the user provided a standard deviation and not a radius, then this
provided standard deviation is simply returned.
- If the user provided a radius and not a standard deviation, then this
provided radius is converted to its standard deviation counterpart,
and then returned.
- If both standard deviation and radius are missing, or they are both
given at the same time, then an error is raised.
Args:
solution_length: Length of a solution for the problem at hand.
stdev_init: Standard deviation. If one wishes to provide a radius
instead, then `stdev_init` is expected as None.
radius_init: Radius. If one wishes to provide a standard deviation
instead, then `radius_init` is expected as None.
Returns:
The standard deviation for the search distribution to be constructed.
"""
if (stdev_init is not None) and (radius_init is None):
return stdev_init
elif (stdev_init is None) and (radius_init is not None):
return stdev_from_radius(radius_init, solution_length)
elif (stdev_init is None) and (radius_init is None):
raise ValueError(
"Received both `stdev_init` and `radius_init` as None."
" Please provide a value either for `stdev_init` or for `radius_init`."
)
else:
raise ValueError(
"Found both `stdev_init` and `radius_init` with values other than None."
" Please provide a value either for `stdev_init` or for `radius_init`, but not for both."
)
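A short sketch of the three branches, importing from `evotorch.tools.misc`:
```python
from evotorch.tools.misc import to_stdev_init

# A given standard deviation is returned as-is:
assert to_stdev_init(solution_length=9, stdev_init=0.5) == 0.5

# A given radius is converted via stdev_from_radius(...):
assert to_stdev_init(solution_length=9, radius_init=6.0) == 2.0

# Providing neither (or both) raises a ValueError:
# to_stdev_init(solution_length=9)  # <- would raise
```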
to_torch_dtype(dtype)
¶
Convert the given string or the given numpy dtype to a PyTorch dtype. If the argument is already a PyTorch dtype, then the argument is returned as it is.
Returns:
Type | Description |
---|---|
dtype | The dtype, converted to a PyTorch dtype. |
Source code in evotorch/tools/misc.py
def to_torch_dtype(dtype: DType) -> torch.dtype:
"""
Convert the given string or the given numpy dtype to a PyTorch dtype.
If the argument is already a PyTorch dtype, then the argument is returned
as it is.
Returns:
The dtype, converted to a PyTorch dtype.
"""
if isinstance(dtype, str) and hasattr(torch, dtype):
attrib_within_torch = getattr(torch, dtype)
else:
attrib_within_torch = None
if isinstance(attrib_within_torch, torch.dtype):
return attrib_within_torch
elif isinstance(dtype, torch.dtype):
return dtype
elif dtype is Any or dtype is object:
raise TypeError(f"Cannot make a numeric tensor with dtype {repr(dtype)}")
else:
return torch.from_numpy(np.array([], dtype=dtype)).dtype
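A small sketch of these conversions taken together, importing from `evotorch.tools.misc`:
```python
import numpy as np
import torch

from evotorch.tools.misc import to_numpy_dtype, to_torch_dtype

# Strings and numpy dtypes resolve to the matching PyTorch dtype:
assert to_torch_dtype("float32") is torch.float32
assert to_torch_dtype(np.float32) is torch.float32

# PyTorch dtypes convert back to the matching numpy dtype:
assert to_numpy_dtype(torch.float32) == np.dtype("float32")
```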
objectarray
¶
This module contains the ObjectArray class, which is an array-like data structure with an interface similar to PyTorch tensors, but with an ability to store arbitrary type of data (not just numbers).
ObjectArray (Sequence)
¶
An object container with an interface similar to PyTorch tensors.
It is strictly one-dimensional, and supports advanced indexing and slicing operations supported by PyTorch tensors.
An ObjectArray can store `None` values, strings, numbers, booleans, lists, sets, dictionaries, PyTorch tensors, and numpy arrays.
When a container (such as a list, dictionary, or set) is placed into an ObjectArray, an immutable clone of this container is first created, and then this newly created immutable clone gets stored within the ObjectArray. This behavior is to prevent accidental modification of the stored data.
When a numeric array (such as a PyTorch tensor or a numpy array with a
numeric dtype) is placed into an ObjectArray, the target ObjectArray
first checks if the numeric array is read-only. If the numeric array
is indeed read-only, then the array is put into the ObjectArray as it
is. If the array is not read-only, then a read-only clone of the
original numeric array is first created, and then this clone gets
stored by the ObjectArray. This behavior has the following implications:
(i) even when an ObjectArray is shared by multiple components of the
program, the risk of accidental modification of the stored data through
this shared ObjectArray is significantly reduced as the stored numeric
arrays are read-only;
(ii) although not recommended, one could still forcefully modify the
numeric arrays stored by an ObjectArray by explicitly casting them as
mutable arrays
(in the case of a numpy array, one could forcefully set the WRITEABLE
flag, and, in the case of a ReadOnlyTensor, one could forcefully cast it
as a regular PyTorch tensor);
(iii) if an already read-only array `x` is placed into an ObjectArray,
but `x` shares its memory with a mutable array `y`, then the contents
of the ObjectArray can be affected by modifying `y`.
The implication (ii) is demonstrated as follows:
objs = ObjectArray(1) # a single-element ObjectArray
# Place a numpy array into objs:
objs[0] = np.array([1, 2, 3], dtype=float)
# At this point, objs[0] is a read-only numpy array.
# objs[0] *= 2 # <- Not allowed
# Possible but NOT recommended:
objs.flags["WRITEABLE"] = True
objs[0] *= 2
The implication (iii) is demonstrated as follows:
objs = ObjectArray(1) # a single-element ObjectArray
# Make a new mutable numpy array
y = np.array([1, 2, 3], dtype=float)
# Make a read-only view to y:
x = y[:]
x.flags["WRITEABLE"] = False
# Place x into objs.
objs[0] = x
# At this point, objs[0] is a read-only numpy array.
# objs[0] *= 2 # <- Not allowed
# During the operation of setting its 0-th item, the ObjectArray
# `objs` did not clone `x` because `x` was already read-only.
# However, the contents of `x` could actually be modified because
# `x` shares its memory with the mutable array `y`.
# Possible but NOT recommended:
y *= 2 # This affects both x and objs!
When a numpy array of dtype object is placed into an ObjectArray, a read-only ObjectArray copy of the original array will first be created, and then, this newly created ObjectArray will be stored by the outer ObjectArray.
An ObjectArray itself has a read-only mode, so that, in addition to its stored data, the ObjectArray itself can be protected against undesired modifications.
An interesting feature of PyTorch: if one slices a tensor A and the result is a new tensor B, and if B is sharing storage memory with A, then A.storage().data_ptr() and B.storage().data_ptr() will return the same pointer. This means one can compare the storage pointers of A and B and see whether or not the two are sharing memory. ObjectArray was designed to have this exact behavior, so that one can understand if two ObjectArray instances are sharing memory. Note that NumPy does NOT have such a behavior. In more detail, a NumPy array C and a NumPy array D could report different pointers even when D was created via a basic slicing operation on C.
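A brief sketch of this pointer-comparison behavior (hypothetical values; it assumes the sharing semantics described above):
```python
from evotorch.tools.objectarray import ObjectArray

objs = ObjectArray(4)
objs[:] = ["a", "b", "c", "d"]

# Basic slicing produces a view, so the storage pointers match:
view = objs[1:3]
assert view.storage().data_ptr() == objs.storage().data_ptr()

# Advanced indexing copies, so the storage pointers differ:
picked = objs[[0, 2]]
assert picked.storage().data_ptr() != objs.storage().data_ptr()
```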
Source code in evotorch/tools/objectarray.py
class ObjectArray(Sequence):
"""
An object container with an interface similar to PyTorch tensors.
It is strictly one-dimensional, and supports advanced indexing and
slicing operations supported by PyTorch tensors.
An ObjectArray can store `None` values, strings, numbers, booleans,
lists, sets, dictionaries, PyTorch tensors, and numpy arrays.
When a container (such as a list, dictionary, or set) is placed into an
ObjectArray, an immutable clone of this container is first created, and
then this newly created immutable clone gets stored within the
ObjectArray. This behavior is to prevent accidental modification of the
stored data.
When a numeric array (such as a PyTorch tensor or a numpy array with a
numeric dtype) is placed into an ObjectArray, the target ObjectArray
first checks if the numeric array is read-only. If the numeric array
is indeed read-only, then the array is put into the ObjectArray as it
is. If the array is not read-only, then a read-only clone of the
original numeric array is first created, and then this clone gets
stored by the ObjectArray. This behavior has the following implications:
(i) even when an ObjectArray is shared by multiple components of the
program, the risk of accidental modification of the stored data through
this shared ObjectArray is significantly reduced as the stored numeric
arrays are read-only;
(ii) although not recommended, one could still forcefully modify the
numeric arrays stored by an ObjectArray by explicitly casting them as
mutable arrays
(in the case of a numpy array, one could forcefully set the WRITEABLE
flag, and, in the case of a ReadOnlyTensor, one could forcefully cast it
as a regular PyTorch tensor);
(iii) if an already read-only array `x` is placed into an ObjectArray,
but `x` shares its memory with a mutable array `y`, then the contents
of the ObjectArray can be affected by modifying `y`.
The implication (ii) is demonstrated as follows:
```python
objs = ObjectArray(1) # a single-element ObjectArray
# Place a numpy array into objs:
objs[0] = np.array([1, 2, 3], dtype=float)
# At this point, objs[0] is a read-only numpy array.
# objs[0] *= 2 # <- Not allowed
# Possible but NOT recommended:
objs.flags["WRITEABLE"] = True
objs[0] *= 2
```
The implication (iii) is demonstrated as follows:
```python
objs = ObjectArray(1) # a single-element ObjectArray
# Make a new mutable numpy array
y = np.array([1, 2, 3], dtype=float)
# Make a read-only view to y:
x = y[:]
x.flags["WRITEABLE"] = False
# Place x into objs.
objs[0] = x
# At this point, objs[0] is a read-only numpy array.
# objs[0] *= 2 # <- Not allowed
# During the operation of setting its 0-th item, the ObjectArray
# `objs` did not clone `x` because `x` was already read-only.
# However, the contents of `x` could actually be modified because
# `x` shares its memory with the mutable array `y`.
# Possible but NOT recommended:
y *= 2 # This affects both x and objs!
```
When a numpy array of dtype object is placed into an ObjectArray,
a read-only ObjectArray copy of the original array will first be
created, and then, this newly created ObjectArray will be stored
by the outer ObjectArray.
An ObjectArray itself has a read-only mode, so that, in addition to its
stored data, the ObjectArray itself can be protected against undesired
modifications.
An interesting feature of PyTorch: if one slices a tensor A and the
result is a new tensor B, and if B is sharing storage memory with A,
then A.storage().data_ptr() and B.storage().data_ptr() will return
the same pointer. This means, one can compare the storage pointers of
A and B and see whether or not the two are sharing memory.
ObjectArray was designed to have this exact behavior, so that one
can understand if two ObjectArray instances are sharing memory.
Note that NumPy does NOT have such a behavior. In more detail,
a NumPy array C and a NumPy array D could report different pointers
even when D was created via a basic slicing operation on C.
"""
def __init__(
self,
size: Optional[Size] = None,
*,
slice_of: Optional[tuple] = None,
):
"""
`__init__(...)`: Instantiate a new ObjectArray.
Args:
size: Length of the ObjectArray. If this argument is present and
is an integer `n`, then the resulting ObjectArray will be
of length `n`, and will be filled with `None` values.
This argument cannot be used together with the keyword
argument `slice_of`.
slice_of: Optionally a tuple in the form
`(original_object_tensor, slice_info)`.
When this argument is present, then the resulting ObjectArray
will be a slice of the given `original_object_tensor` (which
is expected as an ObjectArray instance). `slice_info` is
either a `slice` instance, or a sequence of integers.
The resulting ObjectArray might be a view of
`original_object_tensor` (i.e. it might share its memory with
`original_object_tensor`).
This keyword argument cannot be used together with the
argument `size`.
"""
if size is not None and slice_of is not None:
raise ValueError("Expected either `size` argument or `slice_of` argument, but got both.")
elif size is None and slice_of is None:
raise ValueError("Expected either `size` argument or `slice_of` argument, but got none.")
elif size is not None:
if not is_sequence(size):
length = size
elif isinstance(size, (np.ndarray, torch.Tensor)) and (size.ndim > 1):
raise ValueError(f"Invalid size: {size}")
else:
[length] = size
length = int(length)
self._indices = torch.arange(length, dtype=torch.int64)
self._objects = [None] * length
elif slice_of is not None:
source: ObjectArray
source, slicing = slice_of
if not isinstance(source, ObjectArray):
raise TypeError(
f"`slice_of`: The first element was expected as an ObjectArray."
f" But it is of type {repr(type(source))}"
)
if isinstance(slicing, tuple) or is_integer(slicing):
raise TypeError(f"Invalid slice: {slicing}")
self._indices = source._indices[slicing]
self._objects = source._objects
if self._indices.storage().data_ptr() != source._indices.storage().data_ptr():
self._objects = clone(self._objects)
self._device = torch.device("cpu")
self._read_only = False
@property
def shape(self) -> Size:
"""Shape of the ObjectArray, as a PyTorch Size tuple."""
return self._indices.shape
def size(self) -> Size:
"""
Get the size of the ObjectArray, as a PyTorch Size tuple.
Returns:
The size (i.e. the shape) of the ObjectArray.
"""
return self._indices.size()
@property
def ndim(self) -> int:
"""
Number of dimensions handled by the ObjectArray.
This is equivalent to getting the length of the size tuple.
"""
return self._indices.ndim
def dim(self) -> int:
"""
Get the number of dimensions handled by the ObjectArray.
This is equivalent to getting the length of the size tuple.
Returns:
The number of dimensions, as an integer.
"""
return self._indices.dim()
def numel(self) -> int:
"""
Number of elements stored by the ObjectArray.
Returns:
The number of elements, as an integer.
"""
return self._indices.numel()
def repeat(self, *sizes) -> "ObjectArray":
"""
Repeat the contents of this ObjectArray.
For example, if we have an ObjectArray `objs` which stores
`["hello", "world"]`, the following line:
objs.repeat(3)
will result in an ObjectArray which stores:
`["hello", "world", "hello", "world", "hello", "world"]`
Args:
sizes: Although this argument is named `sizes` to be compatible
with PyTorch, what is expected here is a single positional
argument, as a single integer, or as a single-element
tuple.
The given integer (which can be the argument itself, or
the integer within the given single-element tuple),
specifies how many times the stored sequence will be
repeated.
Returns:
A new ObjectArray which repeats the original one's values
"""
if len(sizes) != 1:
type_name = type(self).__name__
raise ValueError(
f"The `repeat(...)` method of {type_name} expects exactly one positional argument."
f" This is because {type_name} supports only 1-dimensional storage."
f" The received positional arguments are: {sizes}."
)
# The single positional argument can itself be an integer,
# or a single-element tuple containing an integer:
size_arg = sizes[0]
if isinstance(size_arg, tuple):
if len(size_arg) == 1:
size_arg = size_arg[0]
else:
type_name = type(self).__name__
raise ValueError(
f"The `repeat(...)` method of {type_name} can accept a size tuple with only one element."
f" This is because {type_name} supports only 1-dimensional storage."
f" The received size tuple is: {size_arg}."
)
num_repetitions = int(size_arg)
self_length = len(self)
result = ObjectArray(num_repetitions * self_length)
source_index = 0
for result_index in range(len(result)):
result[result_index] = self[source_index]
source_index = (source_index + 1) % self_length
return result
@property
def device(self) -> Device:
"""
The device which stores the elements of the ObjectArray.
In the case of ObjectArray, this property always returns
the CPU device.
Returns:
The CPU device, as a torch.device object.
"""
return self._device
@property
def dtype(self) -> DType:
"""
The dtype of the elements stored by the ObjectArray.
In the case of ObjectArray, the dtype is always `object`.
"""
return object
def __getitem__(self, i: Any) -> Any:
if is_integer(i):
index = int(self._indices[i])
return self._objects[index]
else:
indices = self._indices[i]
same_ptr = indices.storage().data_ptr() == self._indices.storage().data_ptr()
result = ObjectArray(len(indices))
if same_ptr:
result._indices[:] = indices
result._objects = self._objects
else:
result._objects = []
for index in indices:
result._objects.append(self._objects[int(index)])
result._read_only = self._read_only
return result
def __setitem__(self, i: Any, x: Any):
from .immutable import as_immutable
if self._read_only:
raise ValueError("This ObjectArray is read-only, therefore, modification is not allowed.")
if is_integer(i):
index = int(self._indices[i])
self._objects[index] = as_immutable(x)
else:
indices = self._indices[i]
if not isinstance(x, Iterable):
raise TypeError(f"Expected an iterable, but got {repr(x)}")
if not hasattr(x, "__len__"):
x = list(x)
if len(x) != len(indices):
raise TypeError(
f"The slicing operation refers to {len(indices)} elements."
f" However, the given objects sequence has {len(x)} elements."
)
for q, obj in enumerate(x):
index = int(indices[q])
self._objects[index] = as_immutable(obj)
def __len__(self) -> int:
return len(self._indices)
def __iter__(self):
for i in range(len(self)):
yield self[i]
def clone(self, *, memo: Optional[dict] = None) -> "ObjectArray":
"""
Get a deep copy of the ObjectArray.
Note that the newly made deep copy will NOT be read-only,
even if the original is.
Returns:
A non-read-only deep copy of the original ObjectArray.
"""
if memo is None:
memo = {}
result = ObjectArray(len(self))
for i in range(len(self)):
result[i] = deepcopy(self[i], memo=memo)
return result
def get_read_only_view(self) -> "ObjectArray":
"""
Get a read-only view of this ObjectArray.
"""
result = self[:]
result._read_only = True
return result
@property
def is_read_only(self) -> bool:
"""
True if this ObjectArray is read-only; False otherwise.
"""
return self._read_only
def __copy__(self) -> "ObjectArray":
return self.clone()
def __deepcopy__(self, memo: Optional[dict]) -> "ObjectArray":
return self.clone(memo=memo)
def __getstate__(self):
return self.clone().__dict__
def storage(self) -> ObjectArrayStorage:
return ObjectArrayStorage(self)
def _to_string(self) -> str:
inside = []
for ind in self._indices:
i = int(ind)
inside.append(self._objects[i])
type_name = type(self).__name__
details = [
"elements: " + repr(inside),
"ptr: " + repr(self.storage().data_ptr()),
]
if self.is_read_only:
details.append("is_read_only: " + repr(self.is_read_only))
details = ", ".join(details)
return f"<{type_name}, {details}>"
def __repr__(self) -> str:
return self._to_string()
def __str__(self) -> str:
return self._to_string()
def numpy(self) -> np.ndarray:
"""
Convert this ObjectArray to a numpy array.
The resulting numpy array will have its dtype set as `object`.
This new array itself and its contents will be mutable (those
mutable objects being the copies of their immutable sources).
Returns:
The numpy counterpart of this ObjectArray.
"""
from .immutable import mutable_copy
n = len(self)
result = np.empty(n, dtype=object)
for i, item in enumerate(self):
if isinstance(item, ObjectArray):
result[i] = item.numpy()
else:
result[i] = mutable_copy(item)
return result
@staticmethod
def from_numpy(ndarray: np.ndarray) -> "ObjectArray":
"""
Convert a numpy array of dtype `object` to an `ObjectArray`.
Args:
ndarray: The numpy array that will be converted to `ObjectArray`.
Returns:
The ObjectArray counterpart of the given numpy array.
"""
if isinstance(ndarray, np.ndarray):
if ndarray.dtype == np.dtype(object):
n = len(ndarray)
result = ObjectArray(n)
for i, element in enumerate(ndarray):
result[i] = element
return result
else:
raise ValueError(
f"The dtype of the given array was expected as `object`."
f" However, the dtype was encountered as {ndarray.dtype}."
)
else:
raise TypeError(f"Expected a `numpy.ndarray` instance, but received an object of type {type(ndarray)}.")
device: Union[str, torch.device]
property
readonly
¶
The device which stores the elements of the ObjectArray. In the case of ObjectArray, this property always returns the CPU device.
Returns:
Type | Description
---|---
Union[str, torch.device] | The CPU device, as a torch.device object.
dtype: Union[str, torch.dtype, numpy.dtype, Type]
property
readonly
¶
The dtype of the elements stored by the ObjectArray.
In the case of ObjectArray, the dtype is always `object`.
is_read_only: bool
property
readonly
¶
True if this ObjectArray is read-only; False otherwise.
ndim: int
property
readonly
¶
Number of dimensions handled by the ObjectArray. This is equivalent to getting the length of the size tuple.
shape: Union[int, torch.Size]
property
readonly
¶
Shape of the ObjectArray, as a PyTorch Size tuple.
__init__(self, size=None, *, slice_of=None)
special
¶
__init__(...)
: Instantiate a new ObjectArray.
Parameters:
Name | Type | Description | Default
---|---|---|---
size | Union[int, torch.Size] | Length of the ObjectArray. If this argument is present and is an integer `n`, then the resulting ObjectArray will be of length `n`, and will be filled with `None` values. This argument cannot be used together with the keyword argument `slice_of`. | None
slice_of | Optional[tuple] | Optionally a tuple in the form `(original_object_tensor, slice_info)`, making the resulting ObjectArray a slice (possibly a view) of the given `original_object_tensor`. This keyword argument cannot be used together with the argument `size`. | None
Source code in evotorch/tools/objectarray.py
def __init__(
self,
size: Optional[Size] = None,
*,
slice_of: Optional[tuple] = None,
):
"""
`__init__(...)`: Instantiate a new ObjectArray.
Args:
size: Length of the ObjectArray. If this argument is present and
is an integer `n`, then the resulting ObjectArray will be
of length `n`, and will be filled with `None` values.
This argument cannot be used together with the keyword
argument `slice_of`.
slice_of: Optionally a tuple in the form
`(original_object_tensor, slice_info)`.
When this argument is present, then the resulting ObjectArray
will be a slice of the given `original_object_tensor` (which
is expected as an ObjectArray instance). `slice_info` is
either a `slice` instance, or a sequence of integers.
The resulting ObjectArray might be a view of
`original_object_tensor` (i.e. it might share its memory with
`original_object_tensor`).
This keyword argument cannot be used together with the
argument `size`.
"""
if size is not None and slice_of is not None:
raise ValueError("Expected either `size` argument or `slice_of` argument, but got both.")
elif size is None and slice_of is None:
raise ValueError("Expected either `size` argument or `slice_of` argument, but got none.")
elif size is not None:
if not is_sequence(size):
length = size
elif isinstance(size, (np.ndarray, torch.Tensor)) and (size.ndim > 1):
raise ValueError(f"Invalid size: {size}")
else:
[length] = size
length = int(length)
self._indices = torch.arange(length, dtype=torch.int64)
self._objects = [None] * length
elif slice_of is not None:
source: ObjectArray
source, slicing = slice_of
if not isinstance(source, ObjectArray):
raise TypeError(
f"`slice_of`: The first element was expected as an ObjectArray."
f" But it is of type {repr(type(source))}"
)
if isinstance(slicing, tuple) or is_integer(slicing):
raise TypeError(f"Invalid slice: {slicing}")
self._indices = source._indices[slicing]
self._objects = source._objects
if self._indices.storage().data_ptr() != source._indices.storage().data_ptr():
self._objects = clone(self._objects)
self._device = torch.device("cpu")
self._read_only = False
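For illustration, here is a minimal usage sketch (not part of the library source; it relies only on the behavior documented above, and imports from the source location shown):
from evotorch.tools.objectarray import ObjectArray

objs = ObjectArray(3)   # a new ObjectArray of length 3, filled with None values
objs[0] = "hello"       # stored items are converted to their immutable counterparts
objs[1] = [1, 2, 3]
objs[2] = {"a": 1}
view = objs[0:2]        # slicing returns a new ObjectArray, possibly a view
assert len(view) == 2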
clone(self, *, memo=None)
¶
Get a deep copy of the ObjectArray.
Note that the newly made deep copy will NOT be read-only, even if the original is.
Returns:
Type | Description
---|---
ObjectArray | A non-read-only deep copy of the original ObjectArray.
Source code in evotorch/tools/objectarray.py
def clone(self, *, memo: Optional[dict] = None) -> "ObjectArray":
"""
Get a deep copy of the ObjectArray.
Note that the newly made deep copy will NOT be read-only,
even if the original is.
Returns:
A non-read-only deep copy of the original ObjectArray.
"""
if memo is None:
memo = {}
result = ObjectArray(len(self))
for i in range(len(self)):
result[i] = deepcopy(self[i], memo=memo)
return result
dim(self)
¶
Get the number of dimensions handled by the ObjectArray. This is equivalent to getting the length of the size tuple.
Returns:
Type | Description
---|---
int | The number of dimensions, as an integer.
from_numpy(ndarray)
staticmethod
¶
Convert a numpy array of dtype `object` to an `ObjectArray`.
Returns:
Type | Description
---|---
ObjectArray | The ObjectArray counterpart of the given numpy array.
Source code in evotorch/tools/objectarray.py
@staticmethod
def from_numpy(ndarray: np.ndarray) -> "ObjectArray":
"""
Convert a numpy array of dtype `object` to an `ObjectArray`.
Args:
ndarray: The numpy array that will be converted to `ObjectArray`.
Returns:
The ObjectArray counterpart of the given numpy array.
"""
if isinstance(ndarray, np.ndarray):
if ndarray.dtype == np.dtype(object):
n = len(ndarray)
result = ObjectArray(n)
for i, element in enumerate(ndarray):
result[i] = element
return result
else:
raise ValueError(
f"The dtype of the given array was expected as `object`."
f" However, the dtype was encountered as {ndarray.dtype}."
)
else:
raise TypeError(f"Expected a `numpy.ndarray` instance, but received an object of type {type(ndarray)}.")
get_read_only_view(self)
¶
Get a read-only view of this ObjectArray.
numel(self)
¶
Number of elements stored by the ObjectArray.
Returns:
Type | Description
---|---
int | The number of elements, as an integer.
numpy(self)
¶
Convert this ObjectArray to a numpy array.
The resulting numpy array will have its dtype set as `object`.
This new array itself and its contents will be mutable (those
mutable objects being the copies of their immutable sources).
Returns:
Type | Description
---|---
ndarray | The numpy counterpart of this ObjectArray.
Source code in evotorch/tools/objectarray.py
def numpy(self) -> np.ndarray:
"""
Convert this ObjectArray to a numpy array.
The resulting numpy array will have its dtype set as `object`.
This new array itself and its contents will be mutable (those
mutable objects being the copies of their immutable sources).
Returns:
The numpy counterpart of this ObjectArray.
"""
from .immutable import mutable_copy
n = len(self)
result = np.empty(n, dtype=object)
for i, item in enumerate(self):
if isinstance(item, ObjectArray):
result[i] = item.numpy()
else:
result[i] = mutable_copy(item)
return result
repeat(self, *sizes)
¶
Repeat the contents of this ObjectArray.
For example, if we have an ObjectArray `objs` which stores
`["hello", "world"]`, the following line:
objs.repeat(3)
will result in an ObjectArray which stores:
`["hello", "world", "hello", "world", "hello", "world"]`
Parameters:
Name | Type | Description | Default
---|---|---|---
sizes | | Although this argument is named `sizes` to be compatible with PyTorch, what is expected here is a single positional argument: either a single integer, or a single-element tuple containing an integer, specifying how many times the stored sequence will be repeated. | ()
Returns:
Type | Description
---|---
ObjectArray | A new ObjectArray which repeats the original one's values.
Source code in evotorch/tools/objectarray.py
def repeat(self, *sizes) -> "ObjectArray":
"""
Repeat the contents of this ObjectArray.
For example, if we have an ObjectArray `objs` which stores
`["hello", "world"]`, the following line:
objs.repeat(3)
will result in an ObjectArray which stores:
`["hello", "world", "hello", "world", "hello", "world"]`
Args:
sizes: Although this argument is named `sizes` to be compatible
with PyTorch, what is expected here is a single positional
argument: either a single integer, or a single-element
tuple containing an integer.
The given integer (which can be the argument itself, or
the integer within the given single-element tuple)
specifies how many times the stored sequence will be
repeated.
Returns:
A new ObjectArray which repeats the original one's values.
"""
if len(sizes) != 1:
type_name = type(self).__name__
raise ValueError(
f"The `repeat(...)` method of {type_name} expects exactly one positional argument."
f" This is because {type_name} supports only 1-dimensional storage."
f" The received positional arguments are: {sizes}."
)
if isinstance(sizes, tuple):
if len(sizes) == 1:
sizes = sizes[0]
else:
type_name = type(self).__name__
raise ValueError(
f"The `repeat(...)` method of {type_name} can accept a size tuple with only one element."
f" This is because {type_name} supports only 1-dimensional storage."
f" The received size tuple is: {sizes}."
)
# at this point, `sizes` is either a bare integer or a single-element tuple
num_repetitions = int(sizes[0]) if isinstance(sizes, tuple) else int(sizes)
self_length = len(self)
result = ObjectArray(num_repetitions * self_length)
source_index = 0
for result_index in range(len(result)):
result[result_index] = self[source_index]
source_index = (source_index + 1) % self_length
return result
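A small usage sketch (illustrative only), matching the docstring example above:
from evotorch.tools.objectarray import ObjectArray

objs = ObjectArray(2)
objs[0] = "hello"
objs[1] = "world"
repeated = objs.repeat(3)  # equivalently: objs.repeat((3,))
assert list(repeated) == ["hello", "world", "hello", "world", "hello", "world"]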
size(self)
¶
Get the size of the ObjectArray, as a PyTorch Size tuple.
Returns:
Type | Description
---|---
Union[int, torch.Size] | The size (i.e. the shape) of the ObjectArray.
ranking
¶
This module contains ranking functions which work with PyTorch tensors.
centered(fitnesses, *, higher_is_better=True)
¶
Apply linearly spaced 0-centered ranking on a PyTorch tensor. The lowest weight is -0.5, and the highest weight is 0.5. This is the same ranking method that was used in:
Tim Salimans, Jonathan Ho, Xi Chen, Szymon Sidor, Ilya Sutskever (2017).
Evolution Strategies as a Scalable Alternative to Reinforcement Learning
Parameters:
Name | Type | Description | Default
---|---|---|---
fitnesses | Tensor | A PyTorch tensor which contains real numbers which we want to rank. | required
higher_is_better | bool | Whether or not the higher values will be assigned higher ranks. Changing this to False means that lower values are interpreted as better, and therefore lower values will have higher ranks. | True
Returns:
Type | Description
---|---
Tensor | The ranks, on the same device and with the same dtype as the original tensor.
Source code in evotorch/tools/ranking.py
def centered(fitnesses: torch.Tensor, *, higher_is_better: bool = True) -> torch.Tensor:
"""
Apply linearly spaced 0-centered ranking on a PyTorch tensor.
The lowest weight is -0.5, and the highest weight is 0.5.
This is the same ranking method that was used in:
Tim Salimans, Jonathan Ho, Xi Chen, Szymon Sidor, Ilya Sutskever (2017).
Evolution Strategies as a Scalable Alternative to Reinforcement Learning
Args:
fitnesses: A PyTorch tensor which contains real numbers which we want
to rank.
higher_is_better: Whether or not the higher values will be assigned
higher ranks. Changing this to False means that lower values
are interpreted as better, and therefore lower values will have
higher ranks.
Returns:
The ranks, on the same device and with the same dtype as the original
tensor.
"""
device = fitnesses.device
dtype = fitnesses.dtype
with torch.no_grad():
x = fitnesses.reshape(-1)
n = len(x)
indices = x.argsort(descending=(not higher_is_better))
weights = (torch.arange(n, dtype=dtype, device=device) / (n - 1)) - 0.5
ranks = torch.empty_like(x)
ranks[indices] = weights
return ranks.reshape(*(fitnesses.shape))
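To make the weighting concrete, here is a small worked example (illustrative, not from the original documentation):
import torch
from evotorch.tools.ranking import centered

fitnesses = torch.tensor([3.0, 1.0, 2.0])
# The best value (3.0) gets weight 0.5, the worst (1.0) gets -0.5:
print(centered(fitnesses))                          # tensor([ 0.5000, -0.5000,  0.0000])
# With higher_is_better=False, the lowest value (1.0) is treated as best:
print(centered(fitnesses, higher_is_better=False))  # tensor([-0.5000,  0.5000,  0.0000])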
linear(fitnesses, *, higher_is_better=True)
¶
Apply linearly spaced ranking on a PyTorch tensor. The lowest weight is 0, and the highest weight is 1.
Parameters:
Name | Type | Description | Default
---|---|---|---
fitnesses | Tensor | A PyTorch tensor which contains real numbers which we want to rank. | required
higher_is_better | bool | Whether or not the higher values will be assigned higher ranks. Changing this to False means that lower values are interpreted as better, and therefore lower values will have higher ranks. | True
Returns:
Type | Description
---|---
Tensor | The ranks, on the same device and with the same dtype as the original tensor.
Source code in evotorch/tools/ranking.py
def linear(fitnesses: torch.Tensor, *, higher_is_better: bool = True) -> torch.Tensor:
"""
Apply linearly spaced ranking on a PyTorch tensor.
The lowest weight is 0, and the highest weight is 1.
Args:
fitnesses: A PyTorch tensor which contains real numbers which we want
to rank.
higher_is_better: Whether or not the higher values will be assigned
higher ranks. Changing this to False means that lower values
are interpreted as better, and therefore lower values will have
higher ranks.
Returns:
The ranks, on the same device and with the same dtype as the original
tensor.
"""
device = fitnesses.device
dtype = fitnesses.dtype
with torch.no_grad():
x = fitnesses.reshape(-1)
n = len(x)
indices = x.argsort(descending=(not higher_is_better))
weights = torch.arange(n, dtype=dtype, device=device) / (n - 1)
ranks = torch.empty_like(x)
ranks[indices] = weights
return ranks.reshape(*(fitnesses.shape))
nes(fitnesses, *, higher_is_better=True)
¶
Apply the ranking mechanism proposed in:
Wierstra, D., Schaul, T., Glasmachers, T., Sun, Y., Peters, J., & Schmidhuber, J. (2014).
Natural evolution strategies. The Journal of Machine Learning Research, 15(1), 949-980.
Parameters:
Name | Type | Description | Default
---|---|---|---
fitnesses | Tensor | A PyTorch tensor which contains real numbers which we want to rank. | required
higher_is_better | bool | Whether or not the higher values will be assigned higher ranks. Changing this to False means that lower values are interpreted as better, and therefore lower values will have higher ranks. | True
Returns:
Type | Description
---|---
Tensor | The ranks, on the same device and with the same dtype as the original tensor.
Source code in evotorch/tools/ranking.py
def nes(fitnesses: torch.Tensor, *, higher_is_better: bool = True) -> torch.Tensor:
"""
Apply the ranking mechanism proposed in:
Wierstra, D., Schaul, T., Glasmachers, T., Sun, Y., Peters, J., & Schmidhuber, J. (2014).
Natural evolution strategies. The Journal of Machine Learning Research, 15(1), 949-980.
Args:
fitnesses: A PyTorch tensor which contains real numbers which we want
to rank.
higher_is_better: Whether or not the higher values will be assigned
higher ranks. Changing this to False means that lower values
are interpreted as better, and therefore lower values will have
higher ranks.
Returns:
The ranks, on the same device and with the same dtype as the original
tensor.
"""
device = fitnesses.device
dtype = fitnesses.dtype
with torch.no_grad():
x = fitnesses.reshape(-1)
n = len(x)
incr_indices = torch.arange(n, dtype=dtype, device=device)
N = torch.tensor(n, dtype=dtype, device=device)
weights = torch.max(
torch.tensor(0, dtype=dtype, device=device), torch.log((N / 2.0) + 1.0) - torch.log(N - incr_indices)
)
indices = torch.argsort(x, descending=(not higher_is_better))
ranks = torch.empty(n, dtype=indices.dtype, device=device)
ranks[indices] = torch.arange(n, dtype=indices.dtype, device=device)
utils = weights[ranks]
utils /= torch.sum(utils)
utils -= 1 / N
return utils.reshape(*(fitnesses.shape))
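As a quick illustrative check of this utility scheme (not from the original documentation): the best solution receives the largest utility, and after the final shift by 1/n the utilities sum to approximately zero.
import torch
from evotorch.tools.ranking import nes

utils = nes(torch.tensor([3.0, 1.0, 2.0]))
print(utils)                         # the solution with fitness 3.0 gets the largest utility
print(round(float(utils.sum()), 6))  # approximately 0.0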
normalized(fitnesses, *, higher_is_better=True)
¶
Normalize the fitnesses and return the result as ranks.
The normalization is done in such a way that the mean becomes 0.0 and the standard deviation becomes 1.0.
According to the value of `higher_is_better`, it will be ensured that
better solutions will have numerically higher rank.
In more detail, if `higher_is_better` is set as False, then the
fitnesses will be multiplied by -1.0 in addition to being subject
to normalization.
Parameters:
Name | Type | Description | Default
---|---|---|---
fitnesses | Tensor | A PyTorch tensor which contains real numbers which we want to rank. | required
higher_is_better | bool | Whether or not the higher values will be assigned higher ranks. Changing this to False means that lower values are interpreted as better, and therefore lower values will have higher ranks. | True
Returns:
Type | Description
---|---
Tensor | The ranks, on the same device and with the same dtype as the original tensor.
Source code in evotorch/tools/ranking.py
def normalized(fitnesses: torch.Tensor, *, higher_is_better: bool = True) -> torch.Tensor:
"""
Normalize the fitnesses and return the result as ranks.
The normalization is done in such a way that the mean becomes 0.0 and
the standard deviation becomes 1.0.
According to the value of `higher_is_better`, it will be ensured that
better solutions will have numerically higher rank.
In more detail, if `higher_is_better` is set as False, then the
fitnesses will be multiplied by -1.0 in addition to being subject
to normalization.
Args:
fitnesses: A PyTorch tensor which contains real numbers which we want
to rank.
higher_is_better: Whether or not the higher values will be assigned
higher ranks. Changing this to False means that lower values
are interpreted as better, and therefore lower values will have
higher ranks.
Returns:
The ranks, on the same device and with the same dtype as the original
tensor.
"""
if not higher_is_better:
fitnesses = -fitnesses
fitness_mean = torch.mean(fitnesses)
fitness_stdev = torch.std(fitnesses)
fitnesses = fitnesses - fitness_mean
fitnesses = fitnesses / fitness_stdev
return fitnesses
rank(fitnesses, ranking_method, *, higher_is_better)
¶
Get the ranks of the given sequence of numbers.
Better solutions will have numerically higher ranks.
Parameters:
Name | Type | Description | Default
---|---|---|---
fitnesses | Iterable[float] | A sequence of numbers to be ranked. | required
ranking_method | str | The ranking method to be used. Can be "centered" (0-centered linear ranking from -0.5 to 0.5), "linear" (linear ranking from 0 to 1), "nes" (the ranking method used by Natural Evolution Strategies), "normalized" (the ranks will be the normalized counterparts of the fitnesses), or "raw" (the fitnesses themselves, or, if `higher_is_better` is False, their counterparts multiplied by -1, will be the ranks). | required
higher_is_better | bool | Whether or not the higher values will be assigned higher ranks. Changing this to False means that lower values are interpreted as better, and therefore lower values will have higher ranks. | required
Source code in evotorch/tools/ranking.py
def rank(fitnesses: Iterable[float], ranking_method: str, *, higher_is_better: bool):
"""
Get the ranks of the given sequence of numbers.
Better solutions will have numerically higher ranks.
Args:
fitnesses: A sequence of numbers to be ranked.
ranking_method: The ranking method to be used.
Can be "centered", which means 0-centered linear ranking
from -0.5 to 0.5.
Can be "linear", which means a linear ranking from 0 to 1.
Can be "nes", which means the ranking method used by
Natural Evolution Strategies.
Can be "normalized", which means that the ranks will be
the normalized counterparts of the fitnesses.
Can be "raw", which means that the fitnesses themselves
(or, if `higher_is_better` is False, their inverted
counterparts, inversion meaning the operation of
multiplying by -1 in this context) will be the ranks.
higher_is_better: Whether or not the higher values will be assigned
higher ranks. Changing this to False means that lower values
are interpreted as better, and therefore lower values will have
higher ranks.
"""
fitnesses = torch.as_tensor(fitnesses)
rank_func = rankers[ranking_method]
return rank_func(fitnesses, higher_is_better=higher_is_better)
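A brief sketch of the dispatcher in use (illustrative only):
from evotorch.tools.ranking import rank

fitnesses = [3.0, 1.0, 2.0]
print(rank(fitnesses, "centered", higher_is_better=True))  # tensor([ 0.5000, -0.5000,  0.0000])
print(rank(fitnesses, "raw", higher_is_better=False))      # tensor([-3., -1., -2.])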
raw(fitnesses, *, higher_is_better=True)
¶
Return the fitnesses themselves as ranks.
If `higher_is_better` is given as False, then the fitnesses will first
be multiplied by -1 and then the result will be returned as ranks.
Parameters:
Name | Type | Description | Default
---|---|---|---
fitnesses | Tensor | A PyTorch tensor which contains real numbers which we want to rank. | required
higher_is_better | bool | Whether or not the higher values will be assigned higher ranks. Changing this to False means that lower values are interpreted as better, and therefore lower values will have higher ranks. | True
Returns:
Type | Description
---|---
Tensor | The ranks, on the same device and with the same dtype as the original tensor.
Source code in evotorch/tools/ranking.py
def raw(fitnesses: torch.Tensor, *, higher_is_better: bool = True) -> torch.Tensor:
"""
Return the fitnesses themselves as ranks.
If `higher_is_better` is given as False, then the fitnesses will first
be multiplied by -1 and then the result will be returned as ranks.
Args:
fitnesses: A PyTorch tensor which contains real numbers which we want
to rank.
higher_is_better: Whether or not the higher values will be assigned
higher ranks. Changing this to False means that lower values
are interpreted as better, and therefore lower values will have
higher ranks.
Returns:
The ranks, on the same device and with the same dtype as the original
tensor.
"""
if not higher_is_better:
fitnesses = -fitnesses
return fitnesses
readonlytensor
¶
ReadOnlyTensor (Tensor)
¶
A special type of tensor which is read-only.
This is a subclass of `torch.Tensor` which explicitly disallows
operations that would cause in-place modifications.
Since ReadOnlyTensor is a subclass of `torch.Tensor`, most
non-destructive PyTorch operations on this tensor are supported.
Cloning a ReadOnlyTensor using the `clone()` method or Python's
`deepcopy(...)` function results in a regular PyTorch tensor.
Reshaping or slicing operations might return a ReadOnlyTensor if the
result ends up being a view of the original ReadOnlyTensor; otherwise,
the returned tensor is a regular `torch.Tensor`.
Source code in evotorch/tools/readonlytensor.py
class ReadOnlyTensor(torch.Tensor):
"""
A special type of tensor which is read-only.
This is a subclass of `torch.Tensor` which explicitly disallows
operations that would cause in-place modifications.
Since ReadOnlyTensor is a subclass of `torch.Tensor`, most
non-destructive PyTorch operations on this tensor are supported.
Cloning a ReadOnlyTensor using the `clone()` method or Python's
`deepcopy(...)` function results in a regular PyTorch tensor.
Reshaping or slicing operations might return a ReadOnlyTensor if the
result ends up being a view of the original ReadOnlyTensor; otherwise,
the returned tensor is a regular `torch.Tensor`.
"""
def __getattribute__(self, attribute_name: str) -> Any:
if (
isinstance(attribute_name, str)
and attribute_name.endswith("_")
and (not ((attribute_name.startswith("__")) and (attribute_name.endswith("__"))))
):
raise AttributeError(
f"A ReadOnlyTensor explicitly disables all members whose names end with '_'."
f" Cannot access member {repr(attribute_name)}."
)
else:
return super().__getattribute__(attribute_name)
def __cannot_modify(self, *ignore, **ignore_too):
raise TypeError("The contents of a ReadOnlyTensor cannot be modified")
__setitem__ = __cannot_modify
__iadd__ = __cannot_modify
__iand__ = __cannot_modify
__idiv__ = __cannot_modify
__ifloordiv__ = __cannot_modify
__ilshift__ = __cannot_modify
__imatmul__ = __cannot_modify
__imod__ = __cannot_modify
__imul__ = __cannot_modify
__ior__ = __cannot_modify
__ipow__ = __cannot_modify
__irshift__ = __cannot_modify
__isub__ = __cannot_modify
__itruediv__ = __cannot_modify
__ixor__ = __cannot_modify
if _torch_older_than_1_12:
# Define __str__ and __repr__ for when using PyTorch 1.11 or older.
# With PyTorch 1.12, overriding __str__ and __repr__ are not necessary.
def __to_string(self) -> str:
s = super().__repr__()
if "\n" not in s:
return f"ReadOnlyTensor({super().__repr__()})"
else:
indenter = " " * 4
s = (indenter + s.replace("\n", "\n" + indenter)).rstrip()
return f"ReadOnlyTensor(\n{s}\n)"
__str__ = __to_string
__repr__ = __to_string
def clone(self) -> torch.Tensor:
return super().clone().as_subclass(torch.Tensor)
def __mutable_if_independent(self, other: torch.Tensor) -> torch.Tensor:
self_ptr = self.storage().data_ptr()
other_ptr = other.storage().data_ptr()
if self_ptr != other_ptr:
other = other.as_subclass(torch.Tensor)
return other
def __getitem__(self, index_or_slice) -> torch.Tensor:
result = super().__getitem__(index_or_slice)
return self.__mutable_if_independent(result)
def reshape(self, *args, **kwargs) -> torch.Tensor:
result = super().reshape(*args, **kwargs)
return self.__mutable_if_independent(result)
def numpy(self) -> np.ndarray:
arr: np.ndarray = torch.Tensor.numpy(self)
arr.flags["WRITEABLE"] = False
return arr
def __array__(self, *args, **kwargs) -> np.ndarray:
arr: np.ndarray = super().__array__(*args, **kwargs)
arr.flags["WRITEABLE"] = False
return arr
# def __copy__(self):
# return ReadOnlyTensor(copy(self.as_subclass(torch.Tensor)))
# def __deepcopy__(self, memo):
# return deepcopy(self.as_subclass(torch.Tensor), memo)
@classmethod
def __torch_function__(cls, func: Callable, types: Iterable, args: tuple = (), kwargs: Optional[Mapping] = None):
if (kwargs is not None) and ("out" in kwargs):
if isinstance(kwargs["out"], ReadOnlyTensor):
raise TypeError(
f"The `out` keyword argument passed to {func} is a ReadOnlyTensor."
f" A ReadOnlyTensor explicitly fails when referenced via the `out` keyword argument of any torch"
f" function."
f" This restriction is for making sure that the torch operations which could normally do in-place"
f" modifications do not operate on ReadOnlyTensor instances."
)
return super().__torch_function__(func, types, args, kwargs)
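The protections above can be summarized with a short sketch (illustrative; the commented lines would raise):
import torch
from evotorch.tools.readonlytensor import read_only_tensor

t = read_only_tensor([1.0, 2.0, 3.0])
# t[0] = 10.0            # TypeError: the contents of a ReadOnlyTensor cannot be modified
# torch.sin(t, out=t)    # TypeError: `out` cannot be a ReadOnlyTensor
doubled = t * 2.0        # non-destructive operations are supported
writable = t.clone()     # clone() returns a regular, writable torch.Tensor
writable[0] = 10.0       # allowed on the clone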
__torch_function__(func, types, args=(), kwargs=None)
classmethod
special
¶
This `__torch_function__` implementation wraps subclasses such that
methods called on subclasses return a subclass instance instead of
a `torch.Tensor` instance.
One corollary to this is that you need coverage for `torch.Tensor`
methods if implementing `__torch_function__` for subclasses.
We recommend always calling `super().__torch_function__` as the base
case when doing the above.
While not mandatory, we recommend making `__torch_function__` a classmethod.
Source code in evotorch/tools/readonlytensor.py
@classmethod
def __torch_function__(cls, func: Callable, types: Iterable, args: tuple = (), kwargs: Optional[Mapping] = None):
if (kwargs is not None) and ("out" in kwargs):
if isinstance(kwargs["out"], ReadOnlyTensor):
raise TypeError(
f"The `out` keyword argument passed to {func} is a ReadOnlyTensor."
f" A ReadOnlyTensor explicitly fails when referenced via the `out` keyword argument of any torch"
f" function."
f" This restriction is for making sure that the torch operations which could normally do in-place"
f" modifications do not operate on ReadOnlyTensor instances."
)
return super().__torch_function__(func, types, args, kwargs)
clone(self)
¶
Get a copy of this ReadOnlyTensor, as a regular (writable) `torch.Tensor`.
numpy(self)
¶
numpy() -> numpy.ndarray
Returns `self` tensor as a NumPy `ndarray`. This tensor and the
returned `ndarray` share the same underlying storage. Changes to
`self` tensor will be reflected in the `ndarray` and vice versa.
reshape(self, *args, **kwargs)
¶
reshape(*shape) -> Tensor
Returns a tensor with the same data and number of elements as `self`
but with the specified shape. This method returns a view if `shape` is
compatible with the current shape. See `torch.Tensor.view` on when it
is possible to return a view.
See `torch.reshape`.
Parameters:
Name | Type | Description | Default
---|---|---|---
shape | tuple of ints or int... | the desired shape | required
as_read_only_tensor(x, *, dtype=None, device=None)
¶
Convert the given object to a ReadOnlyTensor.
The provided object can be a scalar, or an Iterable of numeric data, or an ObjectArray.
This function can be thought of as the read-only counterpart of PyTorch's
`torch.as_tensor(...)` function.
Parameters:
Name | Type | Description | Default
---|---|---|---
x | Any | The object to be converted to a ReadOnlyTensor. | required
dtype | Optional[torch.dtype] | The dtype of the new ReadOnlyTensor (e.g. torch.float32). If this argument is not specified, dtype will be inferred from `x`. For example, if `x` is a PyTorch tensor or a numpy array, its existing dtype will be kept. | None
device | Union[str, torch.device] | The device in which the ReadOnlyTensor will be stored (e.g. "cpu"). If this argument is not specified, the device which is storing the original `x` will be re-used. | None
Returns:
Type | Description
---|---
Iterable | The read-only counterpart of the provided object.
Source code in evotorch/tools/readonlytensor.py
def as_read_only_tensor(
x: Any, *, dtype: Optional[torch.dtype] = None, device: Optional[Union[str, torch.device]] = None
) -> Iterable:
"""
Convert the given object to a ReadOnlyTensor.
The provided object can be a scalar, or an Iterable of numeric data,
or an ObjectArray.
This function can be thought of as the read-only counterpart of PyTorch's
`torch.as_tensor(...)` function.
Args:
x: The object to be converted to a ReadOnlyTensor.
dtype: The dtype of the new ReadOnlyTensor (e.g. torch.float32).
If this argument is not specified, dtype will be inferred from `x`.
For example, if `x` is a PyTorch tensor or a numpy array, its
existing dtype will be kept.
device: The device in which the ReadOnlyTensor will be stored
(e.g. "cpu").
If this argument is not specified, the device which is storing
the original `x` will be re-used.
Returns:
The read-only counterpart of the provided object.
"""
from .objectarray import ObjectArray
kwargs = _device_and_dtype_kwargs(dtype=dtype, device=device)
if isinstance(x, ObjectArray):
if len(kwargs) != 0:
raise ValueError(
f"read_only_tensor(...): when making a read-only tensor from an ObjectArray,"
f" the arguments `dtype` and `device` were not expected."
f" However, the received keyword arguments are: {kwargs}."
)
return x.get_read_only_view()
else:
return torch.as_tensor(x, **kwargs).as_subclass(ReadOnlyTensor)
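A minimal sketch of typical usage (illustrative): like `torch.as_tensor(...)`, the result may share memory with the source, and the numpy view it exposes is marked non-writeable, per `ReadOnlyTensor.numpy()` above.
import torch
from evotorch.tools.readonlytensor import as_read_only_tensor

x = torch.tensor([1.0, 2.0, 3.0])
t = as_read_only_tensor(x)       # may share memory with x, like torch.as_tensor(...)
arr = t.numpy()                  # the exposed numpy array is non-writeable
assert not arr.flags["WRITEABLE"]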
read_only_tensor(x, *, dtype=None, device=None)
¶
Make a ReadOnlyTensor from the given object.
The provided object can be a scalar, or an Iterable of numeric data, or an ObjectArray.
This function can be thought of as the read-only counterpart of PyTorch's
`torch.tensor(...)` function.
Parameters:
Name | Type | Description | Default
---|---|---|---
x | Any | The object from which the new ReadOnlyTensor will be made. | required
dtype | Optional[torch.dtype] | The dtype of the new ReadOnlyTensor (e.g. torch.float32). | None
device | Union[str, torch.device] | The device in which the ReadOnlyTensor will be stored (e.g. "cpu"). | None
Returns:
Type | Description
---|---
Iterable | The new read-only tensor.
Source code in evotorch/tools/readonlytensor.py
def read_only_tensor(
x: Any, *, dtype: Optional[torch.dtype] = None, device: Optional[Union[str, torch.device]] = None
) -> Iterable:
"""
Make a ReadOnlyTensor from the given object.
The provided object can be a scalar, or an Iterable of numeric data,
or an ObjectArray.
This function can be thought of as the read-only counterpart of PyTorch's
`torch.tensor(...)` function.
Args:
x: The object from which the new ReadOnlyTensor will be made.
dtype: The dtype of the new ReadOnlyTensor (e.g. torch.float32).
device: The device in which the ReadOnlyTensor will be stored
(e.g. "cpu").
Returns:
The new read-only tensor.
"""
from .objectarray import ObjectArray
kwargs = _device_and_dtype_kwargs(dtype=dtype, device=device)
if isinstance(x, ObjectArray):
if len(kwargs) != 0:
raise ValueError(
f"read_only_tensor(...): when making a read-only tensor from an ObjectArray,"
f" the arguments `dtype` and `device` were not expected."
f" However, the received keyword arguments are: {kwargs}."
)
return x.get_read_only_view()
else:
return torch.as_tensor(x, **kwargs).as_subclass(ReadOnlyTensor)
tensormaker
¶
Base classes with various utilities for creating tensors.
TensorMakerMixin
¶
Source code in evotorch/tools/tensormaker.py
class TensorMakerMixin:
def __get_dtype_and_device_kwargs(
self,
*,
dtype: Optional[DType],
device: Optional[Device],
use_eval_dtype: bool,
out: Optional[Iterable],
) -> dict:
result = {}
if out is None:
if dtype is None:
if use_eval_dtype:
if hasattr(self, "eval_dtype"):
result["dtype"] = self.eval_dtype
else:
raise AttributeError(
f"Received `use_eval_dtype` as {repr(use_eval_dtype)}, which represents boolean truth."
f" However, evaluation dtype cannot be determined, because this object does not have"
f" an attribute named `eval_dtype`."
)
else:
result["dtype"] = self.dtype
else:
if use_eval_dtype:
raise ValueError(
f"Received both a `dtype` argument ({repr(dtype)}) and `use_eval_dtype` as True."
f" These arguments are conflicting."
f" Please either provide a `dtype`, or leave `dtype` as None and pass `use_eval_dtype=True`."
)
else:
result["dtype"] = dtype
if device is None:
result["device"] = self.device
else:
result["device"] = device
return result
def __get_size_args(self, *size: Size, num_solutions: Optional[int], out: Optional[Iterable]) -> tuple:
if out is None:
nsize = len(size)
if (nsize == 0) and (num_solutions is None):
return tuple()
elif (nsize >= 1) and (num_solutions is None):
return size
elif (nsize == 0) and (num_solutions is not None):
if hasattr(self, "solution_length"):
num_solutions = int(num_solutions)
if self.solution_length is None:
return (num_solutions,)
else:
return (num_solutions, self.solution_length)
else:
raise AttributeError(
f"Received `num_solutions` as {repr(num_solutions)}."
f" However, to determine the target tensor's size via `num_solutions`, this object"
f" needs to have an attribute named `solution_length`, which seems to be missing."
)
else:
raise ValueError(
f"Encountered both `size` arguments ({repr(size)})"
f" and `num_solutions` keyword argument (num_solutions={repr(num_solutions)})."
f" Specifying both `size` and `num_solutions` is not valid."
)
else:
return tuple()
def __get_generator_kwargs(self, *, generator: Any) -> dict:
result = {}
if generator is None:
if hasattr(self, "generator"):
result["generator"] = self.generator
else:
result["generator"] = generator
return result
def __get_all_args_for_maker(
self,
*size: Size,
num_solutions: Optional[int],
out: Optional[Iterable],
dtype: Optional[DType],
device: Optional[Device],
use_eval_dtype: bool,
) -> tuple:
args = self.__get_size_args(*size, num_solutions=num_solutions, out=out)
kwargs = self.__get_dtype_and_device_kwargs(dtype=dtype, device=device, use_eval_dtype=use_eval_dtype, out=out)
if out is not None:
kwargs["out"] = out
return args, kwargs
def __get_all_args_for_random_maker(
self,
*size: Size,
num_solutions: Optional[int],
out: Optional[Iterable],
dtype: Optional[DType],
device: Optional[Device],
use_eval_dtype: bool,
generator: Any,
):
args = self.__get_size_args(*size, num_solutions=num_solutions, out=out)
kwargs = {}
kwargs.update(
self.__get_dtype_and_device_kwargs(dtype=dtype, device=device, use_eval_dtype=use_eval_dtype, out=out)
)
kwargs.update(self.__get_generator_kwargs(generator=generator))
if out is not None:
kwargs["out"] = out
return args, kwargs
def make_tensor(
self,
data: Any,
*,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
read_only: bool = False,
) -> Iterable:
"""
Make a new tensor.
When not explicitly specified via arguments, the dtype and the device
of the resulting tensor are determined by this method's parent object.
Args:
data: The data to be converted to a tensor.
If one wishes to create a PyTorch tensor, this can be anything
that can be stored by a PyTorch tensor.
If one wishes to create an `ObjectArray` and therefore passes
`dtype=object`, then the provided `data` is expected as an
`Iterable`.
dtype: Optionally a string (e.g. "float32"), or a PyTorch dtype
(e.g. torch.float32), or `object` or "object" (as a string)
or `Any` if one wishes to create an `ObjectArray`.
If `dtype` is not specified it will be assumed that the user
wishes to create a tensor using the dtype of this method's
parent object.
device: The device in which the tensor will be stored.
If `device` is not specified, it will be assumed that the user
wishes to create a tensor on the device of this method's
parent object.
use_eval_dtype: If this is given as True and a `dtype` is not
specified, then the `dtype` of the result will be taken
from the `eval_dtype` attribute of this method's parent
object.
read_only: Whether or not the created tensor will be read-only.
By default, this is False.
Returns:
A PyTorch tensor or an ObjectArray.
"""
kwargs = self.__get_dtype_and_device_kwargs(dtype=dtype, device=device, use_eval_dtype=use_eval_dtype, out=None)
return misc.make_tensor(data, read_only=read_only, **kwargs)
def make_empty(
self,
*size: Size,
num_solutions: Optional[int] = None,
out: Optional[Iterable] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
) -> Iterable:
"""
Make an empty tensor.
When not explicitly specified via arguments, the dtype and the device
of the resulting tensor are determined by this method's parent object.
Args:
size: Shape of the empty tensor to be created.
This is expected as multiple positional arguments of integers,
or as a single positional argument containing a tuple of
integers.
Note that when the user wishes to create an `ObjectArray`
(i.e. when `dtype` is given as `object`), then the size
is expected as a single integer, or as a single-element
tuple containing an integer (because `ObjectArray` can only
be one-dimensional).
num_solutions: This can be used instead of the `size` arguments
for specifying the shape of the target tensor.
Expected as an integer, when `num_solutions` is specified
as `n`, the shape of the resulting tensor will be
`(n, m)` where `m` is the solution length reported by this
method's parent object's `solution_length` attribute.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32) or, for creating an `ObjectArray`,
"object" (as string) or `object` or `Any`.
If `dtype` is not specified (and also `out` is None),
it will be assumed that the user wishes to create a tensor
using the dtype of this method's parent object.
device: The device in which the new empty tensor will be stored.
If not specified (and also `out` is None), it will be
assumed that the user wishes to create a tensor on the
same device with this method's parent object.
use_eval_dtype: If this is given as True and a `dtype` is not
specified, then the `dtype` of the result will be taken
from the `eval_dtype` attribute of this method's parent
object.
Returns:
The new empty tensor, which can be a PyTorch tensor or an
`ObjectArray`.
"""
args, kwargs = self.__get_all_args_for_maker(
*size,
num_solutions=num_solutions,
out=out,
dtype=dtype,
device=device,
use_eval_dtype=use_eval_dtype,
)
return misc.make_empty(*args, **kwargs)
def make_zeros(
self,
*size: Size,
num_solutions: Optional[int] = None,
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
) -> torch.Tensor:
"""
Make a new tensor filled with 0, or fill an existing tensor with 0.
When not explicitly specified via arguments, the dtype and the device
of the resulting tensor are determined by this method's parent object.
Args:
size: Size of the new tensor to be filled with 0.
This can be given as multiple positional arguments, each such
positional argument being an integer, or as a single positional
argument of a tuple, the tuple containing multiple integers.
Note that, if the user wishes to fill an existing tensor with
0 values, then no positional argument is expected.
num_solutions: This can be used instead of the `size` arguments
for specifying the shape of the target tensor.
Expected as an integer, when `num_solutions` is specified
as `n`, the shape of the resulting tensor will be
`(n, m)` where `m` is the solution length reported by this
method's parent object's `solution_length` attribute.
out: Optionally, the tensor to be filled by 0 values.
If an `out` tensor is given, then no `size` argument is expected.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32).
If `dtype` is not specified (and also `out` is None),
it will be assumed that the user wishes to create a tensor
using the dtype of this method's parent object.
If an `out` tensor is specified, then `dtype` is expected
as None.
device: The device in which the new empty tensor will be stored.
If not specified (and also `out` is None), it will be
assumed that the user wishes to create a tensor on the
same device with this method's parent object.
If an `out` tensor is specified, then `device` is expected
as None.
use_eval_dtype: If this is given as True and a `dtype` is not
specified, then the `dtype` of the result will be taken
from the `eval_dtype` attribute of this method's parent
object.
Returns:
The created or modified tensor after placing 0 values.
"""
args, kwargs = self.__get_all_args_for_maker(
*size,
num_solutions=num_solutions,
out=out,
dtype=dtype,
device=device,
use_eval_dtype=use_eval_dtype,
)
return misc.make_zeros(*args, **kwargs)
def make_ones(
self,
*size: Size,
num_solutions: Optional[int] = None,
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
) -> torch.Tensor:
"""
Make a new tensor filled with 1, or fill an existing tensor with 1.
When not explicitly specified via arguments, the dtype and the device
of the resulting tensor are determined by this method's parent object.
Args:
size: Size of the new tensor to be filled with 1.
This can be given as multiple positional arguments, each such
positional argument being an integer, or as a single positional
argument of a tuple, the tuple containing multiple integers.
Note that, if the user wishes to fill an existing tensor with
1 values, then no positional argument is expected.
num_solutions: This can be used instead of the `size` arguments
for specifying the shape of the target tensor.
Expected as an integer, when `num_solutions` is specified
as `n`, the shape of the resulting tensor will be
`(n, m)` where `m` is the solution length reported by this
method's parent object's `solution_length` attribute.
out: Optionally, the tensor to be filled by 1 values.
If an `out` tensor is given, then no `size` argument is expected.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32).
If `dtype` is not specified (and also `out` is None),
it will be assumed that the user wishes to create a tensor
using the dtype of this method's parent object.
If an `out` tensor is specified, then `dtype` is expected
as None.
device: The device in which the new empty tensor will be stored.
If not specified (and also `out` is None), it will be
assumed that the user wishes to create a tensor on the
same device with this method's parent object.
If an `out` tensor is specified, then `device` is expected
as None.
use_eval_dtype: If this is given as True and a `dtype` is not
specified, then the `dtype` of the result will be taken
from the `eval_dtype` attribute of this method's parent
object.
Returns:
The created or modified tensor after placing 1 values.
"""
args, kwargs = self.__get_all_args_for_maker(
*size,
num_solutions=num_solutions,
out=out,
dtype=dtype,
device=device,
use_eval_dtype=use_eval_dtype,
)
return misc.make_ones(*args, **kwargs)
def make_nan(
self,
*size: Size,
num_solutions: Optional[int] = None,
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
) -> torch.Tensor:
"""
Make a new tensor filled with NaN values, or fill an existing tensor
with NaN values.
When not explicitly specified via arguments, the dtype and the device
of the resulting tensor are determined by this method's parent object.
Args:
size: Size of the new tensor to be filled with NaN.
This can be given as multiple positional arguments, each such
positional argument being an integer, or as a single positional
argument of a tuple, the tuple containing multiple integers.
Note that, if the user wishes to fill an existing tensor with
NaN values, then no positional argument is expected.
num_solutions: This can be used instead of the `size` arguments
for specifying the shape of the target tensor.
Expected as an integer, when `num_solutions` is specified
as `n`, the shape of the resulting tensor will be
`(n, m)` where `m` is the solution length reported by this
method's parent object's `solution_length` attribute.
out: Optionally, the tensor to be filled by NaN values.
If an `out` tensor is given, then no `size` argument is expected.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32).
If `dtype` is not specified (and also `out` is None),
it will be assumed that the user wishes to create a tensor
using the dtype of this method's parent object.
If an `out` tensor is specified, then `dtype` is expected
as None.
device: The device in which the new empty tensor will be stored.
If not specified (and also `out` is None), it will be
assumed that the user wishes to create a tensor on the
same device with this method's parent object.
If an `out` tensor is specified, then `device` is expected
as None.
use_eval_dtype: If this is given as True and a `dtype` is not
specified, then the `dtype` of the result will be taken
from the `eval_dtype` attribute of this method's parent
object.
Returns:
The created or modified tensor after placing NaN values.
"""
args, kwargs = self.__get_all_args_for_maker(
*size,
num_solutions=num_solutions,
out=out,
dtype=dtype,
device=device,
use_eval_dtype=use_eval_dtype,
)
return misc.make_nan(*args, **kwargs)
def make_I(
self,
size: Optional[int] = None,
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
) -> torch.Tensor:
"""
Make a new identity matrix (I), or change an existing tensor so that
it expresses the identity matrix.
When not explicitly specified via arguments, the dtype and the device
of the resulting tensor are determined by this method's parent object.
Args:
size: A single integer specifying the length of the target square
matrix. In this context, "length" means both rowwise length
and columnwise length, since the target is a square matrix.
Note that, if the user wishes to fill an existing tensor with
identity values, then `size` is expected to be left as None.
out: Optionally, the existing tensor whose values will be changed
so that they represent an identity matrix.
If an `out` tensor is given, then `size` is expected as None.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32).
If `dtype` is not specified (and also `out` is None),
it will be assumed that the user wishes to create a tensor
using the dtype of this method's parent object.
If an `out` tensor is specified, then `dtype` is expected
as None.
device: The device in which the new empty tensor will be stored.
If not specified (and also `out` is None), it will be
assumed that the user wishes to create a tensor on the
same device with this method's parent object.
If an `out` tensor is specified, then `device` is expected
as None.
use_eval_dtype: If this is given as True and a `dtype` is not
specified, then the `dtype` of the result will be taken
from the `eval_dtype` attribute of this method's parent
object.
Returns:
The created or modified tensor after placing the I matrix values
"""
# `size` is declared as Optional[int]; normalize it to a size tuple so that
# it can be forwarded via `*size` below.
if (size is None) and (out is None):
if hasattr(self, "solution_length"):
size = (self.solution_length,)
else:
raise AttributeError(
"The method `.make_I(...)` was used without any `size`"
" arguments."
" When the `size` argument is missing, the default"
" behavior of this method is to create an identity matrix"
" of size (n, n), n being the length of a solution."
" However, the parent object of this method does not have"
" an attribute named `solution_length`."
)
elif size is None:
size = tuple()
else:
size = (int(size),)
args, kwargs = self.__get_all_args_for_maker(
*size,
num_solutions=None,
out=out,
dtype=dtype,
device=device,
use_eval_dtype=use_eval_dtype,
)
return misc.make_I(*args, **kwargs)
def make_uniform(
self,
*size: Size,
num_solutions: Optional[int] = None,
lb: Optional[RealOrVector] = None,
ub: Optional[RealOrVector] = None,
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
generator: Any = None,
) -> torch.Tensor:
"""
Make a new or existing tensor filled by uniformly distributed values.
Both lower and upper bounds are inclusive.
This function can work with both float and int dtypes.
When not explicitly specified via arguments, the dtype and the device
of the resulting tensor are determined by this method's parent object.
Args:
size: Size of the new tensor to be filled with uniformly distributed
values. This can be given as multiple positional arguments, each
such positional argument being an integer, or as a single
positional argument of a tuple, the tuple containing multiple
integers. Note that, if the user wishes to fill an existing
tensor instead, then no positional argument is expected.
num_solutions: This can be used instead of the `size` arguments
for specifying the shape of the target tensor.
Expected as an integer, when `num_solutions` is specified
as `n`, the shape of the resulting tensor will be
`(n, m)` where `m` is the solution length reported by this
method's parent object's `solution_length` attribute.
lb: Lower bound for the uniformly distributed values.
Can be a scalar, or a tensor.
If not specified, the lower bound will be taken as 0.
Note that, if one specifies `lb`, then `ub` is also expected to
be explicitly specified.
ub: Upper bound for the uniformly distributed values.
Can be a scalar, or a tensor.
If not specified, the upper bound will be taken as 1.
Note that, if one specifies `ub`, then `lb` is also expected to
be explicitly specified.
out: Optionally, the tensor to be filled by uniformly distributed
values. If an `out` tensor is given, then no `size` argument is
expected.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32).
If `dtype` is not specified (and also `out` is None),
it will be assumed that the user wishes to create a tensor
using the dtype of this method's parent object.
If an `out` tensor is specified, then `dtype` is expected
as None.
device: The device in which the new empty tensor will be stored.
If not specified (and also `out` is None), it will be
assumed that the user wishes to create a tensor on the
same device with this method's parent object.
If an `out` tensor is specified, then `device` is expected
as None.
use_eval_dtype: If this is given as True and a `dtype` is not
specified, then the `dtype` of the result will be taken
from the `eval_dtype` attribute of this method's parent
object.
generator: Pseudo-random generator to be used when sampling
the values. Can be a `torch.Generator` or any object with
a `generator` attribute (e.g. a Problem object).
If not given, then this method's parent object will be
analyzed whether or not it has its own generator.
If it does, that generator will be used.
If not, the global generator of PyTorch will be used.
Returns:
The created or modified tensor after placing the uniformly
distributed values.
"""
args, kwargs = self.__get_all_args_for_random_maker(
*size,
num_solutions=num_solutions,
out=out,
dtype=dtype,
device=device,
use_eval_dtype=use_eval_dtype,
generator=generator,
)
return misc.make_uniform(*args, lb=lb, ub=ub, **kwargs)
def make_gaussian(
self,
*size: Size,
num_solutions: Optional[int] = None,
center: Optional[RealOrVector] = None,
stdev: Optional[RealOrVector] = None,
symmetric: bool = False,
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
generator: Any = None,
) -> torch.Tensor:
"""
Make a new or existing tensor filled by Gaussian distributed values.
This function can work only with float dtypes.
Args:
size: Size of the new tensor to be filled with Gaussian distributed
values. This can be given as multiple positional arguments, each
such positional argument being an integer, or as a single
positional argument of a tuple, the tuple containing multiple
integers. Note that, if the user wishes to fill an existing
tensor instead, then no positional argument is expected.
num_solutions: This can be used instead of the `size` arguments
for specifying the shape of the target tensor.
Expected as an integer, when `num_solutions` is specified
as `n`, the shape of the resulting tensor will be
`(n, m)` where `m` is the solution length reported by this
method's parent object's `solution_length` attribute.
center: Center point (i.e. mean) of the Gaussian distribution.
Can be a scalar, or a tensor.
If not specified, the center point will be taken as 0.
Note that, if one specifies `center`, then `stdev` is also
expected to be explicitly specified.
stdev: Standard deviation for the Gaussian distributed values.
Can be a scalar, or a tensor.
If not specified, the standard deviation will be taken as 1.
Note that, if one specifies `stdev`, then `center` is also
expected to be explicitly specified.
symmetric: Whether or not the values should be sampled in a
symmetric (i.e. antithetic) manner.
The default is False.
out: Optionally, the tensor to be filled by Gaussian distributed
values. If an `out` tensor is given, then no `size` argument is
expected.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32).
If `dtype` is not specified (and also `out` is None),
it will be assumed that the user wishes to create a tensor
using the dtype of this method's parent object.
If an `out` tensor is specified, then `dtype` is expected
as None.
device: The device in which the new empty tensor will be stored.
If not specified (and also `out` is None), it will be
assumed that the user wishes to create a tensor on the
same device with this method's parent object.
If an `out` tensor is specified, then `device` is expected
as None.
use_eval_dtype: If this is given as True and a `dtype` is not
specified, then the `dtype` of the result will be taken
from the `eval_dtype` attribute of this method's parent
object.
generator: Pseudo-random generator to be used when sampling
the values. Can be a `torch.Generator` or any object with
a `generator` attribute (e.g. a Problem object).
If not given, then this method's parent object will be
analyzed whether or not it has its own generator.
If it does, that generator will be used.
If not, the global generator of PyTorch will be used.
Returns:
The created or modified tensor after placing the Gaussian
distributed values.
"""
args, kwargs = self.__get_all_args_for_random_maker(
*size,
num_solutions=num_solutions,
out=out,
dtype=dtype,
device=device,
use_eval_dtype=use_eval_dtype,
generator=generator,
)
return misc.make_gaussian(*args, center=center, stdev=stdev, symmetric=symmetric, **kwargs)
def make_randint(
self,
*size: Size,
n: Union[int, float, torch.Tensor],
num_solutions: Optional[int] = None,
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
generator: Any = None,
) -> torch.Tensor:
"""
Make a new or existing tensor filled by random integers.
The integers are uniformly distributed within `[0 ... n-1]`.
This function can be used with integer or float dtypes.
Args:
size: Size of the new tensor to be filled with uniformly distributed
values. This can be given as multiple positional arguments, each
such positional argument being an integer, or as a single
positional argument of a tuple, the tuple containing multiple
integers. Note that, if the user wishes to fill an existing
tensor instead, then no positional argument is expected.
n: Number of choice(s) for integer sampling.
The lowest possible value will be 0, and the highest possible
value will be n - 1.
`n` can be a scalar, or a tensor.
out: Optionally, the tensor to be filled by the random integers.
If an `out` tensor is given, then no `size` argument is
expected.
dtype: Optionally a string (e.g. "int64") or a PyTorch dtype
(e.g. torch.int64).
If `dtype` is not specified (and also `out` is None),
`torch.int64` will be used.
If an `out` tensor is specified, then `dtype` is expected
as None.
device: The device in which the new empty tensor will be stored.
If not specified (and also `out` is None), it will be
assumed that the user wishes to create a tensor on the
same device with this method's parent object.
If an `out` tensor is specified, then `device` is expected
as None.
use_eval_dtype: If this is given as True and a `dtype` is not
specified, then the `dtype` of the result will be taken
from the `eval_dtype` attribute of this method's parent
object.
generator: Pseudo-random generator to be used when sampling
the values. Can be a `torch.Generator` or any object with
a `generator` attribute (e.g. a Problem object).
If not given, then this method's parent object will be
analyzed whether or not it has its own generator.
If it does, that generator will be used.
If not, the global generator of PyTorch will be used.
Returns:
The created or modified tensor after placing the uniformly
distributed values.
"""
if (dtype is None) and (out is None):
dtype = torch.int64
args, kwargs = self.__get_all_args_for_random_maker(
*size,
num_solutions=num_solutions,
out=out,
dtype=dtype,
device=device,
use_eval_dtype=use_eval_dtype,
generator=generator,
)
return misc.make_randint(*args, n=n, **kwargs)
def as_tensor(
self,
x: Any,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
) -> torch.Tensor:
"""
Get the tensor counterpart of the given object `x`.
Args:
x: Any object to be converted to a tensor.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32) or, for creating an `ObjectArray`,
"object" (as string) or `object` or `Any`.
If `dtype` is not specified, the dtype of this method's
parent object will be used.
device: The device in which the resulting tensor will be stored.
If `device` is not specified, the device of this method's
parent object will be used.
use_eval_dtype: If this is given as True and a `dtype` is not
specified, then the `dtype` of the result will be taken
from the `eval_dtype` attribute of this method's parent
object.
Returns:
The tensor counterpart of the given object `x`.
"""
kwargs = self.__get_dtype_and_device_kwargs(dtype=dtype, device=device, use_eval_dtype=use_eval_dtype, out=None)
return misc.as_tensor(x, **kwargs)
def ensure_tensor_length_and_dtype(
self,
t: Any,
length: Optional[int] = None,
dtype: Optional[DType] = None,
about: Optional[str] = None,
*,
allow_scalar: bool = False,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
) -> Iterable:
"""
Return the given sequence as a tensor while also confirming its
length, dtype, and device.
Default length, dtype, device are taken from this method's
parent object.
In more detail, these attributes belonging to this method's parent
object will be used for determining the defaults:
`solution_length`, `dtype`, and `device`.
Args:
t: The tensor, or a sequence which is convertible to a tensor.
length: The length to which the tensor is expected to conform.
If missing, the `solution_length` attribute of this method's
parent object will be used as the default value.
dtype: The dtype to which the tensor is expected to conform.
If `dtype` argument is missing and `use_eval_dtype` is False,
then the default dtype will be determined by the `dtype`
attribute of this method's parent object.
If `dtype` argument is missing and `use_eval_dtype` is True,
then the default dtype will be determined by the `eval_dtype`
attribute of this method's parent object.
about: The prefix for the error message. Can be left as None.
allow_scalar: Whether or not to accept scalars in addition
to vectors of the desired length.
If `allow_scalar` is False, then scalars will be converted
to sequences of the desired length. The sequence will contain
the same scalar, repeated.
If `allow_scalar` is True, then the scalar itself will be
converted to a PyTorch scalar, and then will be returned.
device: The device in which the sequence is to be stored.
If the given sequence is on a different device than the
desired device, a copy on the correct device will be made.
If device is None, the default behavior of `torch.tensor(...)`
will be used, that is: if `t` is already a tensor, the result
will be on the same device, otherwise, the result will be on
the cpu.
use_eval_dtype: Whether or not to use the evaluation dtype
(instead of the dtype of decision values).
If this is given as True, the `dtype` argument is expected
as None.
If `dtype` argument is missing and `use_eval_dtype` is False,
then the default dtype will be determined by the `dtype`
attribute of this method's parent object.
If `dtype` argument is missing and `use_eval_dtype` is True,
then the default dtype will be determined by the `eval_dtype`
attribute of this method's parent object.
Returns:
The sequence whose correctness in terms of length, dtype, and
device is ensured.
Raises:
ValueError: if there is a length mismatch.
"""
if length is None:
if hasattr(self, "solution_length"):
length = self.solution_length
else:
raise AttributeError(
f"{about}: The argument `length` was found to be None."
f" When the `length` argument is None, the default behavior is to use the `solution_length`"
f" attribute of this method's parent object."
f" However, this method's parent object does NOT have a `solution_length` attribute."
)
dtype_and_device = self.__get_dtype_and_device_kwargs(
dtype=dtype, device=device, use_eval_dtype=use_eval_dtype, out=None
)
return misc.ensure_tensor_length_and_dtype(
t, length=length, about=about, allow_scalar=allow_scalar, **dtype_and_device
)
def make_uniform_shaped_like(
self,
t: torch.Tensor,
*,
lb: Optional[RealOrVector] = None,
ub: Optional[RealOrVector] = None,
) -> torch.Tensor:
"""
Make a new uniformly-filled tensor, shaped like the given tensor.
The `dtype` and `device` will be determined by the parent of this
method (not by the given tensor).
If the parent of this method has its own random generator, then that
generator will be used.
Args:
t: The tensor according to which the result will be shaped.
lb: The inclusive lower bounds for the uniform distribution.
Can be a scalar or a tensor.
If left as None, 0.0 will be used as the lower bound.
ub: The inclusive upper bounds for the uniform distribution.
Can be a scalar or a tensor.
If left as None, 1.0 will be used as the upper bound.
Returns:
A new tensor whose shape is the same with the given tensor.
"""
return self.make_uniform(t.shape, lb=lb, ub=ub)
def make_gaussian_shaped_like(
self,
t: torch.Tensor,
*,
center: Optional[RealOrVector] = None,
stdev: Optional[RealOrVector] = None,
) -> torch.Tensor:
"""
Make a new tensor, shaped like the given tensor, with its values
filled by the Gaussian distribution.
The `dtype` and `device` will be determined by the parent of this
method (not by the given tensor).
If the parent of this method has its own random generator, then that
generator will be used.
Args:
t: The tensor according to which the result will be shaped.
center: Center point for the Gaussian distribution.
Can be a scalar or a tensor.
If left as None, 0.0 will be used as the center point.
stdev: The standard deviation for the Gaussian distribution.
Can be a scalar or a tensor.
If left as None, 1.0 will be used as the standard deviation.
Returns:
A new tensor whose shape is the same with the given tensor.
"""
return self.make_gaussian(t.shape, center=center, stdev=stdev)
as_tensor(self, x, dtype=None, device=None, use_eval_dtype=False)
¶
Get the tensor counterpart of the given object `x`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Any | Any object to be converted to a tensor. | required |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | Optionally a string (e.g. "float32") or a PyTorch dtype (e.g. torch.float32) or, for creating an `ObjectArray`, "object" (as string) or `object` or `Any`. If `dtype` is not specified, the dtype of this method's parent object will be used. | None |
device | Union[str, torch.device] | The device on which the resulting tensor will be stored. If `device` is not specified, the device of this method's parent object will be used. | None |
use_eval_dtype | bool | If this is given as True and a `dtype` is not specified, then the dtype of the result will be taken from the `eval_dtype` attribute of this method's parent object. | False |
Returns:
Type | Description |
---|---|
Tensor | The tensor counterpart of the given object `x`. |
Source code in evotorch/tools/tensormaker.py
def as_tensor(
self,
x: Any,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
) -> torch.Tensor:
"""
Get the tensor counterpart of the given object `x`.
Args:
x: Any object to be converted to a tensor.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32) or, for creating an `ObjectArray`,
"object" (as string) or `object` or `Any`.
If `dtype` is not specified, the dtype of this method's
parent object will be used.
device: The device in which the resulting tensor will be stored.
If `device` is not specified, the device of this method's
parent object will be used.
use_eval_dtype: If this is given as True and a `dtype` is not
specified, then the `dtype` of the result will be taken
from the `eval_dtype` attribute of this method's parent
object.
Returns:
The tensor counterpart of the given object `x`.
"""
kwargs = self.__get_dtype_and_device_kwargs(dtype=dtype, device=device, use_eval_dtype=use_eval_dtype, out=None)
return misc.as_tensor(x, **kwargs)
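Usage sketch. The problem construction below is illustrative only; any object that uses this mixin (such as a Problem) works the same way:
import torch
from evotorch import Problem

# Illustrative 5-dimensional minimization problem (not part of the API)
problem = Problem("min", lambda x: torch.sum(x**2), solution_length=5, initial_bounds=(-1.0, 1.0))
x = problem.as_tensor([1, 2, 3])  # converted using the problem's dtype and device
y = problem.as_tensor([1, 2, 3], use_eval_dtype=True)  # use the evaluation dtype instead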
ensure_tensor_length_and_dtype(self, t, length=None, dtype=None, about=None, *, allow_scalar=False, device=None, use_eval_dtype=False)
¶
Return the given sequence as a tensor while also confirming its length, dtype, and device.
Default length, dtype, device are taken from this method's
parent object.
In more detail, these attributes belonging to this method's parent
object will be used for determining the defaults:
`solution_length`, `dtype`, and `device`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
t | Any | The tensor, or a sequence which is convertible to a tensor. | required |
length | Optional[int] | The length to which the tensor is expected to conform. If missing, the `solution_length` attribute of this method's parent object will be used as the default value. | None |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | The dtype to which the tensor is expected to conform. If `dtype` is missing and `use_eval_dtype` is False, the default dtype is determined by the `dtype` attribute of this method's parent object; if missing and `use_eval_dtype` is True, by its `eval_dtype` attribute. | None |
about | Optional[str] | The prefix for the error message. Can be left as None. | None |
allow_scalar | bool | Whether or not to accept scalars in addition to vectors of the desired length. If False, a scalar will be converted to a sequence of the desired length (the same scalar, repeated). If True, the scalar itself will be converted to a PyTorch scalar and returned. | False |
device | Union[str, torch.device] | The device on which the sequence is to be stored. If the given sequence is on a different device than the desired device, a copy on the correct device will be made. If None, the default behavior of `torch.tensor(...)` will be used: if `t` is already a tensor, the result stays on the same device; otherwise, the result will be on the CPU. | None |
use_eval_dtype | bool | Whether or not to use the evaluation dtype (instead of the dtype of decision values). If this is given as True, the `dtype` argument is expected as None. | False |
Returns:
Type | Description |
---|---|
Iterable | The sequence whose correctness in terms of length, dtype, and device is ensured. |
Exceptions:
Type | Description |
---|---|
ValueError | if there is a length mismatch. |
Source code in evotorch/tools/tensormaker.py
def ensure_tensor_length_and_dtype(
self,
t: Any,
length: Optional[int] = None,
dtype: Optional[DType] = None,
about: Optional[str] = None,
*,
allow_scalar: bool = False,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
) -> Iterable:
"""
Return the given sequence as a tensor while also confirming its
length, dtype, and device.
Default length, dtype, device are taken from this method's
parent object.
In more detail, these attributes belonging to this method's parent
object will be used for determining the defaults:
`solution_length`, `dtype`, and `device`.
Args:
t: The tensor, or a sequence which is convertible to a tensor.
length: The length to which the tensor is expected to conform.
If missing, the `solution_length` attribute of this method's
parent object will be used as the default value.
dtype: The dtype to which the tensor is expected to conform.
If `dtype` argument is missing and `use_eval_dtype` is False,
then the default dtype will be determined by the `dtype`
attribute of this method's parent object.
If `dtype` argument is missing and `use_eval_dtype` is True,
then the default dtype will be determined by the `eval_dtype`
attribute of this method's parent object.
about: The prefix for the error message. Can be left as None.
allow_scalar: Whether or not to accept scalars in addition
to vectors of the desired length.
If `allow_scalar` is False, then scalars will be converted
to sequences of the desired length. The sequence will contain
the same scalar, repeated.
If `allow_scalar` is True, then the scalar itself will be
converted to a PyTorch scalar, and then will be returned.
device: The device in which the sequence is to be stored.
If the given sequence is on a different device than the
desired device, a copy on the correct device will be made.
If device is None, the default behavior of `torch.tensor(...)`
will be used, that is: if `t` is already a tensor, the result
will be on the same device, otherwise, the result will be on
the cpu.
use_eval_dtype: Whether or not to use the evaluation dtype
(instead of the dtype of decision values).
If this is given as True, the `dtype` argument is expected
as None.
If `dtype` argument is missing and `use_eval_dtype` is False,
then the default dtype will be determined by the `dtype`
attribute of this method's parent object.
If `dtype` argument is missing and `use_eval_dtype` is True,
then the default dtype will be determined by the `eval_dtype`
attribute of this method's parent object.
Returns:
The sequence whose correctness in terms of length, dtype, and
device is ensured.
Raises:
ValueError: if there is a length mismatch.
"""
if length is None:
if hasattr(self, "solution_length"):
length = self.solution_length
else:
raise AttributeError(
f"{about}: The argument `length` was found to be None."
f" When the `length` argument is None, the default behavior is to use the `solution_length`"
f" attribute of this method's parent object."
f" However, this method's parent object does NOT have a `solution_length` attribute."
)
dtype_and_device = self.__get_dtype_and_device_kwargs(
dtype=dtype, device=device, use_eval_dtype=use_eval_dtype, out=None
)
return misc.ensure_tensor_length_and_dtype(
t, length=length, about=about, allow_scalar=allow_scalar, **dtype_and_device
)
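Usage sketch, reusing the illustrative 5-dimensional problem from the as_tensor example above:
import torch
from evotorch import Problem

problem = Problem("min", lambda x: torch.sum(x**2), solution_length=5, initial_bounds=(-1.0, 1.0))
v = problem.ensure_tensor_length_and_dtype(0.5)  # scalar broadcast to a length-5 vector
w = problem.ensure_tensor_length_and_dtype([1, 2, 3, 4, 5])  # length and dtype verified
# problem.ensure_tensor_length_and_dtype([1, 2]) would raise ValueError (length mismatch)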
make_I(self, size=None, out=None, dtype=None, device=None, use_eval_dtype=False)
¶
Make a new identity matrix (I), or change an existing tensor so that it expresses the identity matrix.
When not explicitly specified via arguments, the dtype and the device of the resulting tensor is determined by this method's parent object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | Optional[int] | A single integer specifying the length of the target square matrix. In this context, "length" means both rowwise length and columnwise length, since the target is a square matrix. Note that, if the user wishes to fill an existing tensor with identity values, then `size` is expected to be left as None. | None |
out | Optional[torch.Tensor] | Optionally, the existing tensor whose values will be changed so that they represent an identity matrix. If an `out` tensor is given, then `size` is expected as None. | None |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | Optionally a string (e.g. "float32") or a PyTorch dtype (e.g. torch.float32). If `dtype` is not specified (and also `out` is None), the dtype of this method's parent object will be used. If an `out` tensor is specified, then `dtype` is expected as None. | None |
device | Union[str, torch.device] | The device on which the new tensor will be stored. If not specified (and also `out` is None), the device of this method's parent object will be used. If an `out` tensor is specified, then `device` is expected as None. | None |
use_eval_dtype | bool | If this is given as True and a `dtype` is not specified, then the dtype of the result will be taken from the `eval_dtype` attribute of this method's parent object. | False |
Returns:
Type | Description |
---|---|
Tensor | The created or modified tensor after placing the I matrix values. |
Source code in evotorch/tools/tensormaker.py
def make_I(
self,
size: Optional[int] = None,
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
) -> torch.Tensor:
"""
Make a new identity matrix (I), or change an existing tensor so that
it expresses the identity matrix.
When not explicitly specified via arguments, the dtype and the device
of the resulting tensor is determined by this method's parent object.
Args:
size: A single integer specifying the length of the target square
matrix. In this context, "length" means both rowwise length
and columnwise length, since the target is a square matrix.
Note that, if the user wishes to fill an existing tensor with
identity values, then `size` is expected to be left as None.
out: Optionally, the existing tensor whose values will be changed
so that they represent an identity matrix.
If an `out` tensor is given, then `size` is expected as None.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32).
If `dtype` is not specified (and also `out` is None),
it will be assumed that the user wishes to create a tensor
using the dtype of this method's parent object.
If an `out` tensor is specified, then `dtype` is expected
as None.
device: The device in which the new empty tensor will be stored.
If not specified (and also `out` is None), it will be
assumed that the user wishes to create a tensor on the
same device with this method's parent object.
If an `out` tensor is specified, then `device` is expected
as None.
use_eval_dtype: If this is given as True and a `dtype` is not
specified, then the `dtype` of the result will be taken
from the `eval_dtype` attribute of this method's parent
object.
Returns:
The created or modified tensor after placing the I matrix values
"""
if (size is None) and (out is None):
if hasattr(self, "solution_length"):
size = self.solution_length
else:
raise AttributeError(
"The method `.make_I(...)` was used without any `size`"
" arguments."
" When the `size` argument is missing, the default"
" behavior of this method is to create an identity matrix"
" of size (n, n), n being the length of a solution."
" However, the parent object of this method does not have"
" an attribute named `solution_length`."
)
args, kwargs = self.__get_all_args_for_maker(
*(() if size is None else (size,)),
num_solutions=None,
out=out,
dtype=dtype,
device=device,
use_eval_dtype=use_eval_dtype,
)
return misc.make_I(*args, **kwargs)
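Usage sketch (illustrative problem setup as in the examples above):
import torch
from evotorch import Problem

problem = Problem("min", lambda x: torch.sum(x**2), solution_length=5, initial_bounds=(-1.0, 1.0))
I3 = problem.make_I(3)  # 3x3 identity matrix, using the problem's dtype and device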
make_empty(self, *size, num_solutions=None, out=None, dtype=None, device=None, use_eval_dtype=False)
¶
Make an empty tensor.
When not explicitly specified via arguments, the dtype and the device of the resulting tensor is determined by this method's parent object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | Union[int, torch.Size] | Shape of the empty tensor to be created, expected as multiple positional arguments of integers, or as a single positional argument containing a tuple of integers. Note that when the user wishes to create an `ObjectArray` (i.e. when `dtype` is given as `object`), then the size is expected as a single integer, or as a single-element tuple containing an integer (because `ObjectArray` can only be one-dimensional). | () |
num_solutions | Optional[int] | This can be used instead of the `size` arguments for specifying the shape of the target tensor. Expected as an integer; when `num_solutions` is specified as `n`, the shape of the resulting tensor will be `(n, m)` where `m` is the solution length reported by this method's parent object's `solution_length` attribute. | None |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | Optionally a string (e.g. "float32") or a PyTorch dtype (e.g. torch.float32) or, for creating an `ObjectArray`, "object" (as string) or `object` or `Any`. If `dtype` is not specified (and also `out` is None), the dtype of this method's parent object will be used. | None |
device | Union[str, torch.device] | The device on which the new empty tensor will be stored. If not specified (and also `out` is None), the device of this method's parent object will be used. | None |
use_eval_dtype | bool | If this is given as True and a `dtype` is not specified, then the dtype of the result will be taken from the `eval_dtype` attribute of this method's parent object. | False |
Returns:
Type | Description |
---|---|
Iterable | The new empty tensor, which can be a PyTorch tensor or an `ObjectArray`. |
Source code in evotorch/tools/tensormaker.py
def make_empty(
self,
*size: Size,
num_solutions: Optional[int] = None,
out: Optional[Iterable] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
) -> Iterable:
"""
Make an empty tensor.
When not explicitly specified via arguments, the dtype and the device
of the resulting tensor is determined by this method's parent object.
Args:
size: Shape of the empty tensor to be created,
expected as multiple positional arguments of integers,
or as a single positional argument containing a tuple of
integers.
Note that when the user wishes to create an `ObjectArray`
(i.e. when `dtype` is given as `object`), then the size
is expected as a single integer, or as a single-element
tuple containing an integer (because `ObjectArray` can only
be one-dimensional).
num_solutions: This can be used instead of the `size` arguments
for specifying the shape of the target tensor.
Expected as an integer, when `num_solutions` is specified
as `n`, the shape of the resulting tensor will be
`(n, m)` where `m` is the solution length reported by this
method's parent object's `solution_length` attribute.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32) or, for creating an `ObjectArray`,
"object" (as string) or `object` or `Any`.
If `dtype` is not specified (and also `out` is None),
it will be assumed that the user wishes to create a tensor
using the dtype of this method's parent object.
device: The device in which the new empty tensor will be stored.
If not specified (and also `out` is None), it will be
assumed that the user wishes to create a tensor on the
same device with this method's parent object.
use_eval_dtype: If this is given as True and a `dtype` is not
specified, then the `dtype` of the result will be taken
from the `eval_dtype` attribute of this method's parent
object.
Returns:
The new empty tensor, which can be a PyTorch tensor or an
`ObjectArray`.
"""
args, kwargs = self.__get_all_args_for_maker(
*size,
num_solutions=num_solutions,
out=out,
dtype=dtype,
device=device,
use_eval_dtype=use_eval_dtype,
)
return misc.make_empty(*args, **kwargs)
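Usage sketch (illustrative setup):
import torch
from evotorch import Problem

problem = Problem("min", lambda x: torch.sum(x**2), solution_length=5, initial_bounds=(-1.0, 1.0))
batch = problem.make_empty(num_solutions=10)  # uninitialized tensor of shape (10, 5)
buf = problem.make_empty(3, 4)                # arbitrary shape (3, 4)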
make_gaussian(self, *size, num_solutions=None, center=None, stdev=None, symmetric=False, out=None, dtype=None, device=None, use_eval_dtype=False, generator=None)
¶
Make a new or existing tensor filled by Gaussian distributed values. This function can work only with float dtypes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | Union[int, torch.Size] | Size of the new tensor to be filled with Gaussian distributed values. This can be given as multiple positional arguments, each such positional argument being an integer, or as a single positional argument of a tuple, the tuple containing multiple integers. Note that, if the user wishes to fill an existing tensor instead, then no positional argument is expected. | () |
num_solutions | Optional[int] | This can be used instead of the `size` arguments for specifying the shape of the target tensor. Expected as an integer; when `num_solutions` is specified as `n`, the shape of the resulting tensor will be `(n, m)` where `m` is the solution length reported by this method's parent object's `solution_length` attribute. | None |
center | Union[float, Iterable[float], torch.Tensor] | Center point (i.e. mean) of the Gaussian distribution. Can be a scalar, or a tensor. If not specified, the center point will be taken as 0. Note that, if one specifies `center`, then `stdev` is also expected to be explicitly specified. | None |
stdev | Union[float, Iterable[float], torch.Tensor] | Standard deviation for the Gaussian distributed values. Can be a scalar, or a tensor. If not specified, the standard deviation will be taken as 1. Note that, if one specifies `stdev`, then `center` is also expected to be explicitly specified. | None |
symmetric | bool | Whether or not the values should be sampled in a symmetric (i.e. antithetic) manner. The default is False. | False |
out | Optional[torch.Tensor] | Optionally, the tensor to be filled by Gaussian distributed values. If an `out` tensor is given, then no `size` argument is expected. | None |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | Optionally a string (e.g. "float32") or a PyTorch dtype (e.g. torch.float32). If `dtype` is not specified (and also `out` is None), the dtype of this method's parent object will be used. If an `out` tensor is specified, then `dtype` is expected as None. | None |
device | Union[str, torch.device] | The device on which the new tensor will be stored. If not specified (and also `out` is None), the device of this method's parent object will be used. If an `out` tensor is specified, then `device` is expected as None. | None |
use_eval_dtype | bool | If this is given as True and a `dtype` is not specified, then the dtype of the result will be taken from the `eval_dtype` attribute of this method's parent object. | False |
generator | Any | Pseudo-random generator to be used when sampling the values. Can be a `torch.Generator` or any object with a `generator` attribute (e.g. a Problem object). If not given, the parent object's own generator is used if it has one; otherwise the global generator of PyTorch is used. | None |
Returns:
Type | Description |
---|---|
Tensor | The created or modified tensor after placing the Gaussian distributed values. |
Source code in evotorch/tools/tensormaker.py
def make_gaussian(
self,
*size: Size,
num_solutions: Optional[int] = None,
center: Optional[RealOrVector] = None,
stdev: Optional[RealOrVector] = None,
symmetric: bool = False,
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
generator: Any = None,
) -> torch.Tensor:
"""
Make a new or existing tensor filled by Gaussian distributed values.
This function can work only with float dtypes.
Args:
size: Size of the new tensor to be filled with Gaussian distributed
values. This can be given as multiple positional arguments, each
such positional argument being an integer, or as a single
positional argument of a tuple, the tuple containing multiple
integers. Note that, if the user wishes to fill an existing
tensor instead, then no positional argument is expected.
num_solutions: This can be used instead of the `size` arguments
for specifying the shape of the target tensor.
Expected as an integer, when `num_solutions` is specified
as `n`, the shape of the resulting tensor will be
`(n, m)` where `m` is the solution length reported by this
method's parent object's `solution_length` attribute.
center: Center point (i.e. mean) of the Gaussian distribution.
Can be a scalar, or a tensor.
If not specified, the center point will be taken as 0.
Note that, if one specifies `center`, then `stdev` is also
expected to be explicitly specified.
stdev: Standard deviation for the Gaussian distributed values.
Can be a scalar, or a tensor.
If not specified, the standard deviation will be taken as 1.
Note that, if one specifies `stdev`, then `center` is also
expected to be explicitly specified.
symmetric: Whether or not the values should be sampled in a
symmetric (i.e. antithetic) manner.
The default is False.
out: Optionally, the tensor to be filled by Gaussian distributed
values. If an `out` tensor is given, then no `size` argument is
expected.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32).
If `dtype` is not specified (and also `out` is None),
it will be assumed that the user wishes to create a tensor
using the dtype of this method's parent object.
If an `out` tensor is specified, then `dtype` is expected
as None.
device: The device in which the new empty tensor will be stored.
If not specified (and also `out` is None), it will be
assumed that the user wishes to create a tensor on the
same device with this method's parent object.
If an `out` tensor is specified, then `device` is expected
as None.
use_eval_dtype: If this is given as True and a `dtype` is not
specified, then the `dtype` of the result will be taken
from the `eval_dtype` attribute of this method's parent
object.
generator: Pseudo-random generator to be used when sampling
the values. Can be a `torch.Generator` or any object with
a `generator` attribute (e.g. a Problem object).
If not given, then this method's parent object will be
analyzed whether or not it has its own generator.
If it does, that generator will be used.
If not, the global generator of PyTorch will be used.
Returns:
The created or modified tensor after placing the Gaussian
distributed values.
"""
args, kwargs = self.__get_all_args_for_random_maker(
*size,
num_solutions=num_solutions,
out=out,
dtype=dtype,
device=device,
use_eval_dtype=use_eval_dtype,
generator=generator,
)
return misc.make_gaussian(*args, center=center, stdev=stdev, symmetric=symmetric, **kwargs)
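Usage sketch (illustrative setup; the antithetic line assumes an even leftmost dimension, so that the samples can be mirrored in pairs):
import torch
from evotorch import Problem

problem = Problem("min", lambda x: torch.sum(x**2), solution_length=5, initial_bounds=(-1.0, 1.0))
noise = problem.make_gaussian(num_solutions=10)            # standard normal, shape (10, 5)
wide = problem.make_gaussian(6, 5, center=0.0, stdev=2.0)  # explicit mean and stdev
anti = problem.make_gaussian(6, 5, symmetric=True)         # antithetic (mirrored) samples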
make_gaussian_shaped_like(self, t, *, center=None, stdev=None)
¶
Make a new tensor, shaped like the given tensor, with its values filled by the Gaussian distribution.
The `dtype` and `device` will be determined by the parent of this
method (not by the given tensor).
If the parent of this method has its own random generator, then that
generator will be used.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
t | Tensor | The tensor according to which the result will be shaped. | required |
center | Union[float, Iterable[float], torch.Tensor] | Center point for the Gaussian distribution. Can be a scalar or a tensor. If left as None, 0.0 will be used as the center point. | None |
stdev | Union[float, Iterable[float], torch.Tensor] | The standard deviation for the Gaussian distribution. Can be a scalar or a tensor. If left as None, 1.0 will be used as the standard deviation. | None |
Returns:
Type | Description |
---|---|
Tensor | A new tensor whose shape is the same as the given tensor's. |
Source code in evotorch/tools/tensormaker.py
def make_gaussian_shaped_like(
self,
t: torch.Tensor,
*,
center: Optional[RealOrVector] = None,
stdev: Optional[RealOrVector] = None,
) -> torch.Tensor:
"""
Make a new tensor, shaped like the given tensor, with its values
filled by the Gaussian distribution.
The `dtype` and `device` will be determined by the parent of this
method (not by the given tensor).
If the parent of this method has its own random generator, then that
generator will be used.
Args:
t: The tensor according to which the result will be shaped.
center: Center point for the Gaussian distribution.
Can be a scalar or a tensor.
If left as None, 0.0 will be used as the center point.
stdev: The standard deviation for the Gaussian distribution.
Can be a scalar or a tensor.
If left as None, 1.0 will be used as the standard deviation.
Returns:
A new tensor whose shape is the same with the given tensor.
"""
return self.make_gaussian(t.shape, center=center, stdev=stdev)
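Usage sketch (illustrative setup):
import torch
from evotorch import Problem

problem = Problem("min", lambda x: torch.sum(x**2), solution_length=5, initial_bounds=(-1.0, 1.0))
t = torch.ones(4, 5)
g = problem.make_gaussian_shaped_like(t, center=0.0, stdev=0.1)  # shape (4, 5); dtype/device from `problem`, not from `t`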
make_nan(self, *size, num_solutions=None, out=None, dtype=None, device=None, use_eval_dtype=False)
¶
Make a new tensor filled with NaN values, or fill an existing tensor with NaN values.
When not explicitly specified via arguments, the dtype and the device of the resulting tensor is determined by this method's parent object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | Union[int, torch.Size] | Size of the new tensor to be filled with NaN. This can be given as multiple positional arguments, each such positional argument being an integer, or as a single positional argument of a tuple, the tuple containing multiple integers. Note that, if the user wishes to fill an existing tensor with NaN values, then no positional argument is expected. | () |
num_solutions | Optional[int] | This can be used instead of the `size` arguments for specifying the shape of the target tensor. Expected as an integer; when `num_solutions` is specified as `n`, the shape of the resulting tensor will be `(n, m)` where `m` is the solution length reported by this method's parent object's `solution_length` attribute. | None |
out | Optional[torch.Tensor] | Optionally, the tensor to be filled by NaN values. If an `out` tensor is given, then no `size` argument is expected. | None |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | Optionally a string (e.g. "float32") or a PyTorch dtype (e.g. torch.float32). If `dtype` is not specified (and also `out` is None), the dtype of this method's parent object will be used. If an `out` tensor is specified, then `dtype` is expected as None. | None |
device | Union[str, torch.device] | The device on which the new tensor will be stored. If not specified (and also `out` is None), the device of this method's parent object will be used. If an `out` tensor is specified, then `device` is expected as None. | None |
use_eval_dtype | bool | If this is given as True and a `dtype` is not specified, then the dtype of the result will be taken from the `eval_dtype` attribute of this method's parent object. | False |
Returns:
Type | Description |
---|---|
Tensor | The created or modified tensor after placing NaN values. |
Source code in evotorch/tools/tensormaker.py
def make_nan(
self,
*size: Size,
num_solutions: Optional[int] = None,
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
) -> torch.Tensor:
"""
Make a new tensor filled with NaN values, or fill an existing tensor
with NaN values.
When not explicitly specified via arguments, the dtype and the device
of the resulting tensor is determined by this method's parent object.
Args:
size: Size of the new tensor to be filled with NaN.
This can be given as multiple positional arguments, each such
positional argument being an integer, or as a single positional
argument of a tuple, the tuple containing multiple integers.
Note that, if the user wishes to fill an existing tensor with
NaN values, then no positional argument is expected.
num_solutions: This can be used instead of the `size` arguments
for specifying the shape of the target tensor.
Expected as an integer, when `num_solutions` is specified
as `n`, the shape of the resulting tensor will be
`(n, m)` where `m` is the solution length reported by this
method's parent object's `solution_length` attribute.
out: Optionally, the tensor to be filled by NaN values.
If an `out` tensor is given, then no `size` argument is expected.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32).
If `dtype` is not specified (and also `out` is None),
it will be assumed that the user wishes to create a tensor
using the dtype of this method's parent object.
If an `out` tensor is specified, then `dtype` is expected
as None.
device: The device in which the new empty tensor will be stored.
If not specified (and also `out` is None), it will be
assumed that the user wishes to create a tensor on the
same device with this method's parent object.
If an `out` tensor is specified, then `device` is expected
as None.
use_eval_dtype: If this is given as True and a `dtype` is not
specified, then the `dtype` of the result will be taken
from the `eval_dtype` attribute of this method's parent
object.
Returns:
The created or modified tensor after placing NaN values.
"""
args, kwargs = self.__get_all_args_for_maker(
*size,
num_solutions=num_solutions,
out=out,
dtype=dtype,
device=device,
use_eval_dtype=use_eval_dtype,
)
return misc.make_nan(*args, **kwargs)
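Usage sketch (illustrative setup); NaN-filled tensors are handy as placeholders for not-yet-computed values:
import torch
from evotorch import Problem

problem = Problem("min", lambda x: torch.sum(x**2), solution_length=5, initial_bounds=(-1.0, 1.0))
buf = problem.make_nan(num_solutions=3)  # fresh (3, 5) tensor filled with NaN
problem.make_nan(out=buf)                # re-fill an existing tensor in place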
make_ones(self, *size, num_solutions=None, out=None, dtype=None, device=None, use_eval_dtype=False)
¶
Make a new tensor filled with 1, or fill an existing tensor with 1.
When not explicitly specified via arguments, the dtype and the device of the resulting tensor is determined by this method's parent object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | Union[int, torch.Size] | Size of the new tensor to be filled with 1. This can be given as multiple positional arguments, each such positional argument being an integer, or as a single positional argument of a tuple, the tuple containing multiple integers. Note that, if the user wishes to fill an existing tensor with 1 values, then no positional argument is expected. | () |
num_solutions | Optional[int] | This can be used instead of the `size` arguments for specifying the shape of the target tensor. Expected as an integer; when `num_solutions` is specified as `n`, the shape of the resulting tensor will be `(n, m)` where `m` is the solution length reported by this method's parent object's `solution_length` attribute. | None |
out | Optional[torch.Tensor] | Optionally, the tensor to be filled by 1 values. If an `out` tensor is given, then no `size` argument is expected. | None |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | Optionally a string (e.g. "float32") or a PyTorch dtype (e.g. torch.float32). If `dtype` is not specified (and also `out` is None), the dtype of this method's parent object will be used. If an `out` tensor is specified, then `dtype` is expected as None. | None |
device | Union[str, torch.device] | The device on which the new tensor will be stored. If not specified (and also `out` is None), the device of this method's parent object will be used. If an `out` tensor is specified, then `device` is expected as None. | None |
use_eval_dtype | bool | If this is given as True and a `dtype` is not specified, then the dtype of the result will be taken from the `eval_dtype` attribute of this method's parent object. | False |
Returns:
Type | Description |
---|---|
Tensor | The created or modified tensor after placing 1 values. |
Source code in evotorch/tools/tensormaker.py
def make_ones(
self,
*size: Size,
num_solutions: Optional[int] = None,
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
) -> torch.Tensor:
"""
Make a new tensor filled with 1, or fill an existing tensor with 1.
When not explicitly specified via arguments, the dtype and the device
of the resulting tensor is determined by this method's parent object.
Args:
size: Size of the new tensor to be filled with 1.
This can be given as multiple positional arguments, each such
positional argument being an integer, or as a single positional
argument of a tuple, the tuple containing multiple integers.
Note that, if the user wishes to fill an existing tensor with
1 values, then no positional argument is expected.
num_solutions: This can be used instead of the `size` arguments
for specifying the shape of the target tensor.
Expected as an integer, when `num_solutions` is specified
as `n`, the shape of the resulting tensor will be
`(n, m)` where `m` is the solution length reported by this
method's parent object's `solution_length` attribute.
out: Optionally, the tensor to be filled by 1 values.
If an `out` tensor is given, then no `size` argument is expected.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32).
If `dtype` is not specified (and also `out` is None),
it will be assumed that the user wishes to create a tensor
using the dtype of this method's parent object.
If an `out` tensor is specified, then `dtype` is expected
as None.
device: The device in which the new empty tensor will be stored.
If not specified (and also `out` is None), it will be
assumed that the user wishes to create a tensor on the
same device with this method's parent object.
If an `out` tensor is specified, then `device` is expected
as None.
use_eval_dtype: If this is given as True and a `dtype` is not
specified, then the `dtype` of the result will be taken
from the `eval_dtype` attribute of this method's parent
object.
Returns:
The created or modified tensor after placing 1 values.
"""
args, kwargs = self.__get_all_args_for_maker(
*size,
num_solutions=num_solutions,
out=out,
dtype=dtype,
device=device,
use_eval_dtype=use_eval_dtype,
)
return misc.make_ones(*args, **kwargs)
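Usage sketch (illustrative setup):
import torch
from evotorch import Problem

problem = Problem("min", lambda x: torch.sum(x**2), solution_length=5, initial_bounds=(-1.0, 1.0))
ones_vec = problem.make_ones(5)                      # a vector of ones
ones_f64 = problem.make_ones(2, 5, dtype="float64")  # overriding the default dtype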
make_randint(self, *size, n, num_solutions=None, out=None, dtype=None, device=None, use_eval_dtype=False, generator=None)
¶
Make a new or existing tensor filled by random integers.
The integers are uniformly distributed within `[0 ... n-1]`.
This function can be used with integer or float dtypes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | Union[int, torch.Size] | Size of the new tensor to be filled with uniformly distributed values. This can be given as multiple positional arguments, each such positional argument being an integer, or as a single positional argument of a tuple, the tuple containing multiple integers. Note that, if the user wishes to fill an existing tensor instead, then no positional argument is expected. | () |
n | Union[int, float, torch.Tensor] | Number of choice(s) for integer sampling. The lowest possible value will be 0, and the highest possible value will be n - 1. `n` can be a scalar, or a tensor. | required |
out | Optional[torch.Tensor] | Optionally, the tensor to be filled by the random integers. If an `out` tensor is given, then no `size` argument is expected. | None |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | Optionally a string (e.g. "int64") or a PyTorch dtype (e.g. torch.int64). If `dtype` is not specified (and also `out` is None), `torch.int64` will be used. If an `out` tensor is specified, then `dtype` is expected as None. | None |
device | Union[str, torch.device] | The device on which the new tensor will be stored. If not specified (and also `out` is None), the device of this method's parent object will be used. If an `out` tensor is specified, then `device` is expected as None. | None |
use_eval_dtype | bool | If this is given as True and a `dtype` is not specified, then the dtype of the result will be taken from the `eval_dtype` attribute of this method's parent object. | False |
generator | Any | Pseudo-random generator to be used when sampling the values. Can be a `torch.Generator` or any object with a `generator` attribute (e.g. a Problem object). If not given, the parent object's own generator is used if it has one; otherwise the global generator of PyTorch is used. | None |
Returns:
Type | Description |
---|---|
Tensor | The created or modified tensor after placing the uniformly distributed values. |
Source code in evotorch/tools/tensormaker.py
def make_randint(
self,
*size: Size,
n: Union[int, float, torch.Tensor],
num_solutions: Optional[int] = None,
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
generator: Any = None,
) -> torch.Tensor:
"""
Make a new or existing tensor filled by random integers.
The integers are uniformly distributed within `[0 ... n-1]`.
This function can be used with integer or float dtypes.
Args:
size: Size of the new tensor to be filled with uniformly distributed
values. This can be given as multiple positional arguments, each
such positional argument being an integer, or as a single
positional argument of a tuple, the tuple containing multiple
integers. Note that, if the user wishes to fill an existing
tensor instead, then no positional argument is expected.
n: Number of choice(s) for integer sampling.
The lowest possible value will be 0, and the highest possible
value will be n - 1.
`n` can be a scalar, or a tensor.
out: Optionally, the tensor to be filled by the random integers.
If an `out` tensor is given, then no `size` argument is
expected.
dtype: Optionally a string (e.g. "int64") or a PyTorch dtype
(e.g. torch.int64).
If `dtype` is not specified (and also `out` is None),
`torch.int64` will be used.
If an `out` tensor is specified, then `dtype` is expected
as None.
device: The device in which the new empty tensor will be stored.
If not specified (and also `out` is None), it will be
assumed that the user wishes to create a tensor on the
same device with this method's parent object.
If an `out` tensor is specified, then `device` is expected
as None.
use_eval_dtype: If this is given as True and a `dtype` is not
specified, then the `dtype` of the result will be taken
from the `eval_dtype` attribute of this method's parent
object.
generator: Pseudo-random generator to be used when sampling
the values. Can be a `torch.Generator` or any object with
a `generator` attribute (e.g. a Problem object).
If not given, then this method's parent object will be
analyzed whether or not it has its own generator.
If it does, that generator will be used.
If not, the global generator of PyTorch will be used.
Returns:
The created or modified tensor after placing the uniformly
distributed values.
"""
if (dtype is None) and (out is None):
dtype = torch.int64
args, kwargs = self.__get_all_args_for_random_maker(
*size,
num_solutions=num_solutions,
out=out,
dtype=dtype,
device=device,
use_eval_dtype=use_eval_dtype,
generator=generator,
)
return misc.make_randint(*args, n=n, **kwargs)
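Usage sketch (illustrative setup):
import torch
from evotorch import Problem

problem = Problem("min", lambda x: torch.sum(x**2), solution_length=5, initial_bounds=(-1.0, 1.0))
choices = problem.make_randint(8, n=3)  # 8 integers drawn uniformly from {0, 1, 2}, dtype torch.int64
rng = torch.Generator().manual_seed(42)
seeded = problem.make_randint(8, n=3, generator=rng)  # reproducible sampling via an explicit generator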
make_tensor(self, data, *, dtype=None, device=None, use_eval_dtype=False, read_only=False)
¶
Make a new tensor.
When not explicitly specified via arguments, the dtype and the device of the resulting tensor is determined by this method's parent object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | Any | The data to be converted to a tensor. If one wishes to create a PyTorch tensor, this can be anything that can be stored by a PyTorch tensor. If one wishes to create an `ObjectArray` and therefore passes `dtype=object`, then the provided `data` is expected as an `Iterable`. | required |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | Optionally a string (e.g. "float32"), or a PyTorch dtype (e.g. torch.float32), or `object` or "object" (as a string) or `Any` if one wishes to create an `ObjectArray`. If `dtype` is not specified, the dtype of this method's parent object will be used. | None |
device | Union[str, torch.device] | The device on which the tensor will be stored. If `device` is not specified, the device of this method's parent object will be used. | None |
use_eval_dtype | bool | If this is given as True and a `dtype` is not specified, then the dtype of the result will be taken from the `eval_dtype` attribute of this method's parent object. | False |
read_only | bool | Whether or not the created tensor will be read-only. By default, this is False. | False |
Returns:
Type | Description |
---|---|
Iterable | A PyTorch tensor or an ObjectArray. |
Source code in evotorch/tools/tensormaker.py
def make_tensor(
self,
data: Any,
*,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
read_only: bool = False,
) -> Iterable:
"""
Make a new tensor.
When not explicitly specified via arguments, the dtype and the device
of the resulting tensor is determined by this method's parent object.
Args:
data: The data to be converted to a tensor.
If one wishes to create a PyTorch tensor, this can be anything
that can be stored by a PyTorch tensor.
If one wishes to create an `ObjectArray` and therefore passes
`dtype=object`, then the provided `data` is expected as an
`Iterable`.
dtype: Optionally a string (e.g. "float32"), or a PyTorch dtype
(e.g. torch.float32), or `object` or "object" (as a string)
or `Any` if one wishes to create an `ObjectArray`.
If `dtype` is not specified it will be assumed that the user
wishes to create a tensor using the dtype of this method's
parent object.
device: The device in which the tensor will be stored.
If `device` is not specified, it will be assumed that the user
wishes to create a tensor on the device of this method's
parent object.
use_eval_dtype: If this is given as True and a `dtype` is not
specified, then the `dtype` of the result will be taken
from the `eval_dtype` attribute of this method's parent
object.
read_only: Whether or not the created tensor will be read-only.
By default, this is False.
Returns:
A PyTorch tensor or an ObjectArray.
"""
kwargs = self.__get_dtype_and_device_kwargs(dtype=dtype, device=device, use_eval_dtype=use_eval_dtype, out=None)
return misc.make_tensor(data, read_only=read_only, **kwargs)
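Usage sketch (illustrative setup; the second line assumes `dtype=object` support as documented above):
import torch
from evotorch import Problem

problem = Problem("min", lambda x: torch.sum(x**2), solution_length=5, initial_bounds=(-1.0, 1.0))
frozen = problem.make_tensor([1.0, 2.0, 3.0], read_only=True)  # a read-only tensor
objs = problem.make_tensor(["a", [1, 2]], dtype=object)        # an ObjectArray holding arbitrary objects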
make_uniform(self, *size, num_solutions=None, lb=None, ub=None, out=None, dtype=None, device=None, use_eval_dtype=False, generator=None)
¶
Make a new or existing tensor filled by uniformly distributed values. Both lower and upper bounds are inclusive. This function can work with both float and int dtypes.
When not explicitly specified via arguments, the dtype and the device of the resulting tensor is determined by this method's parent object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | Union[int, torch.Size] | Size of the new tensor to be filled with uniformly distributed values. This can be given as multiple positional arguments, each such positional argument being an integer, or as a single positional argument of a tuple, the tuple containing multiple integers. Note that, if the user wishes to fill an existing tensor instead, then no positional argument is expected. | () |
num_solutions | Optional[int] | This can be used instead of the `size` arguments for specifying the shape of the target tensor. Expected as an integer; when `num_solutions` is specified as `n`, the shape of the resulting tensor will be `(n, m)` where `m` is the solution length reported by this method's parent object's `solution_length` attribute. | None |
lb | Union[float, Iterable[float], torch.Tensor] | Lower bound for the uniformly distributed values. Can be a scalar, or a tensor. If not specified, the lower bound will be taken as 0. Note that, if one specifies `lb`, then `ub` is also expected to be explicitly specified. | None |
ub | Union[float, Iterable[float], torch.Tensor] | Upper bound for the uniformly distributed values. Can be a scalar, or a tensor. If not specified, the upper bound will be taken as 1. Note that, if one specifies `ub`, then `lb` is also expected to be explicitly specified. | None |
out | Optional[torch.Tensor] | Optionally, the tensor to be filled by uniformly distributed values. If an `out` tensor is given, then no `size` argument is expected. | None |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | Optionally a string (e.g. "float32") or a PyTorch dtype (e.g. torch.float32). If `dtype` is not specified (and also `out` is None), the dtype of this method's parent object will be used. If an `out` tensor is specified, then `dtype` is expected as None. | None |
device | Union[str, torch.device] | The device on which the new tensor will be stored. If not specified (and also `out` is None), the device of this method's parent object will be used. If an `out` tensor is specified, then `device` is expected as None. | None |
use_eval_dtype | bool | If this is given as True and a `dtype` is not specified, then the dtype of the result will be taken from the `eval_dtype` attribute of this method's parent object. | False |
generator | Any | Pseudo-random generator to be used when sampling the values. Can be a `torch.Generator` or any object with a `generator` attribute (e.g. a Problem object). If not given, the parent object's own generator is used if it has one; otherwise the global generator of PyTorch is used. | None |
Returns:
Type | Description |
---|---|
Tensor | The created or modified tensor after placing the uniformly distributed values. |
Source code in evotorch/tools/tensormaker.py
def make_uniform(
self,
*size: Size,
num_solutions: Optional[int] = None,
lb: Optional[RealOrVector] = None,
ub: Optional[RealOrVector] = None,
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
generator: Any = None,
) -> torch.Tensor:
"""
Make a new or existing tensor filled by uniformly distributed values.
Both lower and upper bounds are inclusive.
This function can work with both float and int dtypes.
When not explicitly specified via arguments, the dtype and the device
of the resulting tensor is determined by this method's parent object.
Args:
size: Size of the new tensor to be filled with uniformly distributed
values. This can be given as multiple positional arguments, each
such positional argument being an integer, or as a single
positional argument of a tuple, the tuple containing multiple
integers. Note that, if the user wishes to fill an existing
tensor instead, then no positional argument is expected.
num_solutions: This can be used instead of the `size` arguments
for specifying the shape of the target tensor.
Expected as an integer, when `num_solutions` is specified
as `n`, the shape of the resulting tensor will be
`(n, m)` where `m` is the solution length reported by this
method's parent object's `solution_length` attribute.
lb: Lower bound for the uniformly distributed values.
Can be a scalar, or a tensor.
If not specified, the lower bound will be taken as 0.
Note that, if one specifies `lb`, then `ub` is also expected to
be explicitly specified.
ub: Upper bound for the uniformly distributed values.
Can be a scalar, or a tensor.
If not specified, the upper bound will be taken as 1.
Note that, if one specifies `ub`, then `lb` is also expected to
be explicitly specified.
out: Optionally, the tensor to be filled by uniformly distributed
values. If an `out` tensor is given, then no `size` argument is
expected.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32).
If `dtype` is not specified (and also `out` is None),
it will be assumed that the user wishes to create a tensor
using the dtype of this method's parent object.
If an `out` tensor is specified, then `dtype` is expected
as None.
device: The device in which the new empty tensor will be stored.
If not specified (and also `out` is None), it will be
assumed that the user wishes to create a tensor on the
same device with this method's parent object.
If an `out` tensor is specified, then `device` is expected
as None.
use_eval_dtype: If this is given as True and a `dtype` is not
specified, then the `dtype` of the result will be taken
from the `eval_dtype` attribute of this method's parent
object.
generator: Pseudo-random generator to be used when sampling
the values. Can be a `torch.Generator` or any object with
a `generator` attribute (e.g. a Problem object).
If not given, then this method's parent object will be
analyzed whether or not it has its own generator.
If it does, that generator will be used.
If not, the global generator of PyTorch will be used.
Returns:
The created or modified tensor after placing the uniformly
distributed values.
"""
args, kwargs = self.__get_all_args_for_random_maker(
*size,
num_solutions=num_solutions,
out=out,
dtype=dtype,
device=device,
use_eval_dtype=use_eval_dtype,
generator=generator,
)
return misc.make_uniform(*args, lb=lb, ub=ub, **kwargs)
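Usage sketch (illustrative setup):
import torch
from evotorch import Problem

problem = Problem("min", lambda x: torch.sum(x**2), solution_length=5, initial_bounds=(-1.0, 1.0))
pop = problem.make_uniform(num_solutions=10, lb=-1.0, ub=1.0)  # shape (10, 5), values in [-1, 1]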
make_uniform_shaped_like(self, t, *, lb=None, ub=None)
¶
Make a new uniformly-filled tensor, shaped like the given tensor.
The `dtype` and `device` will be determined by the parent of this
method (not by the given tensor).
If the parent of this method has its own random generator, then that
generator will be used.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
t | Tensor | The tensor according to which the result will be shaped. | required |
lb | Union[float, Iterable[float], torch.Tensor] | The inclusive lower bounds for the uniform distribution. Can be a scalar or a tensor. If left as None, 0.0 will be used as the lower bound. | None |
ub | Union[float, Iterable[float], torch.Tensor] | The inclusive upper bounds for the uniform distribution. Can be a scalar or a tensor. If left as None, 1.0 will be used as the upper bound. | None |
Returns:
Type | Description |
---|---|
Tensor | A new tensor whose shape is the same as the given tensor's. |
Source code in evotorch/tools/tensormaker.py
def make_uniform_shaped_like(
self,
t: torch.Tensor,
*,
lb: Optional[RealOrVector] = None,
ub: Optional[RealOrVector] = None,
) -> torch.Tensor:
"""
Make a new uniformly-filled tensor, shaped like the given tensor.
The `dtype` and `device` will be determined by the parent of this
method (not by the given tensor).
If the parent of this method has its own random generator, then that
generator will be used.
Args:
t: The tensor according to which the result will be shaped.
lb: The inclusive lower bounds for the uniform distribution.
Can be a scalar or a tensor.
If left as None, 0.0 will be used as the lower bound.
ub: The inclusive upper bounds for the uniform distribution.
Can be a scalar or a tensor.
If left as None, 1.0 will be used as the upper bound.
Returns:
A new tensor whose shape is the same with the given tensor.
"""
return self.make_uniform(t.shape, lb=lb, ub=ub)
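Usage sketch (illustrative setup):
import torch
from evotorch import Problem

problem = Problem("min", lambda x: torch.sum(x**2), solution_length=5, initial_bounds=(-1.0, 1.0))
t = torch.empty(4, 5)
u = problem.make_uniform_shaped_like(t, lb=0.0, ub=10.0)  # shape (4, 5), values in [0, 10]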
make_zeros(self, *size, num_solutions=None, out=None, dtype=None, device=None, use_eval_dtype=False)
¶
Make a new tensor filled with 0, or fill an existing tensor with 0.
When not explicitly specified via arguments, the dtype and the device of the resulting tensor is determined by this method's parent object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | Union[int, torch.Size] | Size of the new tensor to be filled with 0. This can be given as multiple positional arguments, each such positional argument being an integer, or as a single positional argument of a tuple, the tuple containing multiple integers. Note that, if the user wishes to fill an existing tensor with 0 values, then no positional argument is expected. | () |
num_solutions | Optional[int] | This can be used instead of the `size` arguments for specifying the shape of the target tensor. Expected as an integer; when `num_solutions` is specified as `n`, the shape of the resulting tensor will be `(n, m)` where `m` is the solution length reported by this method's parent object's `solution_length` attribute. | None |
out | Optional[torch.Tensor] | Optionally, the tensor to be filled by 0 values. If an `out` tensor is given, then no `size` argument is expected. | None |
dtype | Union[str, torch.dtype, numpy.dtype, Type] | Optionally a string (e.g. "float32") or a PyTorch dtype (e.g. torch.float32). If `dtype` is not specified (and also `out` is None), the dtype of this method's parent object will be used. If an `out` tensor is specified, then `dtype` is expected as None. | None |
device | Union[str, torch.device] | The device on which the new tensor will be stored. If not specified (and also `out` is None), the device of this method's parent object will be used. If an `out` tensor is specified, then `device` is expected as None. | None |
use_eval_dtype | bool | If this is given as True and a `dtype` is not specified, then the dtype of the result will be taken from the `eval_dtype` attribute of this method's parent object. | False |
Returns:
Type | Description |
---|---|
Tensor | The created or modified tensor after placing 0 values. |
Source code in evotorch/tools/tensormaker.py
def make_zeros(
self,
*size: Size,
num_solutions: Optional[int] = None,
out: Optional[torch.Tensor] = None,
dtype: Optional[DType] = None,
device: Optional[Device] = None,
use_eval_dtype: bool = False,
) -> torch.Tensor:
"""
Make a new tensor filled with 0, or fill an existing tensor with 0.
When not explicitly specified via arguments, the dtype and the device
of the resulting tensor is determined by this method's parent object.
Args:
size: Size of the new tensor to be filled with 0.
This can be given as multiple positional arguments, each such
positional argument being an integer, or as a single positional
argument of a tuple, the tuple containing multiple integers.
Note that, if the user wishes to fill an existing tensor with
0 values, then no positional argument is expected.
num_solutions: This can be used instead of the `size` arguments
for specifying the shape of the target tensor.
Expected as an integer, when `num_solutions` is specified
as `n`, the shape of the resulting tensor will be
`(n, m)` where `m` is the solution length reported by this
method's parent object's `solution_length` attribute.
out: Optionally, the tensor to be filled by 0 values.
If an `out` tensor is given, then no `size` argument is expected.
dtype: Optionally a string (e.g. "float32") or a PyTorch dtype
(e.g. torch.float32).
If `dtype` is not specified (and also `out` is None),
it will be assumed that the user wishes to create a tensor
using the dtype of this method's parent object.
If an `out` tensor is specified, then `dtype` is expected
as None.
device: The device in which the new empty tensor will be stored.
If not specified (and also `out` is None), it will be
assumed that the user wishes to create a tensor on the
same device with this method's parent object.
If an `out` tensor is specified, then `device` is expected
as None.
use_eval_dtype: If this is given as True and a `dtype` is not
specified, then the `dtype` of the result will be taken
from the `eval_dtype` attribute of this method's parent
object.
Returns:
The created or modified tensor after placing 0 values.
"""
args, kwargs = self.__get_all_args_for_maker(
*size,
num_solutions=num_solutions,
out=out,
dtype=dtype,
device=device,
use_eval_dtype=use_eval_dtype,
)
return misc.make_zeros(*args, **kwargs)
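Usage sketch (illustrative setup):
import torch
from evotorch import Problem

problem = Problem("min", lambda x: torch.sum(x**2), solution_length=5, initial_bounds=(-1.0, 1.0))
z = problem.make_zeros(num_solutions=2)  # tensor of shape (2, 5), filled with zeros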