Funcadam

adam(*, center_init, center_learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-08)

Initialize an Adam optimizer and return its initial state.

Reference:

Kingma, D. P. and J. Ba (2015).
Adam: A method for stochastic optimization.
In Proceedings of 3rd International Conference on Learning Representations.

Parameters:

center_init (BatchableVector, required):
    Starting point for the Adam search. Expected as a PyTorch tensor with at least 1 dimension. If there are 2 or more dimensions, the extra leftmost dimensions are interpreted as batch dimensions.
center_learning_rate (BatchableScalar, default: 0.001):
    Learning rate (i.e. the step size) for the Adam updates. Can be a scalar or a multidimensional tensor. If given as a tensor with multiple dimensions, those dimensions will be interpreted as batch dimensions.
beta1 (BatchableScalar, default: 0.9):
    beta1 hyperparameter for the Adam optimizer. Can be a scalar or a multidimensional tensor. If given as a tensor with multiple dimensions, those dimensions will be interpreted as batch dimensions.
beta2 (BatchableScalar, default: 0.999):
    beta2 hyperparameter for the Adam optimizer. Can be a scalar or a multidimensional tensor. If given as a tensor with multiple dimensions, those dimensions will be interpreted as batch dimensions.
epsilon (BatchableScalar, default: 1e-08):
    epsilon hyperparameter for the Adam optimizer. Can be a scalar or a multidimensional tensor. If given as a tensor with multiple dimensions, those dimensions will be interpreted as batch dimensions.
Source code in evotorch/algorithms/functional/funcadam.py
def adam(
    *,
    center_init: BatchableVector,
    center_learning_rate: BatchableScalar = 0.001,
    beta1: BatchableScalar = 0.9,
    beta2: BatchableScalar = 0.999,
    epsilon: BatchableScalar = 1e-8,
) -> AdamState:
    """
    Initialize an Adam optimizer and return its initial state.

    Reference:

        Kingma, D. P. and J. Ba (2015).
        Adam: A method for stochastic optimization.
        In Proceedings of 3rd International Conference on Learning Representations.

    Args:
        center_init: Starting point for the Adam search.
            Expected as a PyTorch tensor with at least 1 dimension.
            If there are 2 or more dimensions, the extra leftmost dimensions
            are interpreted as batch dimensions.
        center_learning_rate: Learning rate (i.e. the step size) for the Adam
            updates. Can be a scalar or a multidimensional tensor.
            If given as a tensor with multiple dimensions, those dimensions
            will be interpreted as batch dimensions.
        beta1: beta1 hyperparameter for the Adam optimizer.
            Can be a scalar or a multidimensional tensor.
            If given as a tensor with multiple dimensions, those dimensions
            will be interpreted as batch dimensions.
        beta2: beta2 hyperparameter for the Adam optimizer.
            Can be a scalar or a multidimensional tensor.
            If given as a tensor with multiple dimensions, those dimensions
            will be interpreted as batch dimensions.
        epsilon: epsilon hyperparameter for the Adam optimizer.
            Can be a scalar or a multidimensional tensor.
            If given as a tensor with multiple dimensions, those dimensions
            will be interpreted as batch dimensions.
    Returns:
        A named tuple of type `AdamState`, representing the initial state
        of the Adam optimizer.
    """
    center_init = torch.as_tensor(center_init)
    dtype = center_init.dtype
    device = center_init.device

    def as_tensor(x) -> torch.Tensor:
        return torch.as_tensor(x, dtype=dtype, device=device)

    center_learning_rate = as_tensor(center_learning_rate)
    beta1 = as_tensor(beta1)
    beta2 = as_tensor(beta2)
    epsilon = as_tensor(epsilon)

    m = torch.zeros_like(center_init)
    v = torch.zeros_like(center_init)
    t = torch.zeros(center_init.shape[:-1], dtype=dtype, device=device)

    return AdamState(
        center=center_init,
        center_learning_rate=center_learning_rate,
        beta1=beta1,
        beta2=beta2,
        epsilon=epsilon,
        m=m,
        v=v,
        t=t,
    )
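
A minimal usage sketch for the above, assuming `adam` can be imported from `evotorch.algorithms.functional` (the module indicated by the source path); the tensor shapes and hyperparameter values are illustrative only.

import torch

from evotorch.algorithms.functional import adam

# Non-batched case: a single 5-dimensional starting point with explicit hyperparameters.
state = adam(
    center_init=torch.zeros(5),
    center_learning_rate=0.01,
    beta1=0.9,
    beta2=0.999,
    epsilon=1e-8,
)

# Batched case: the extra leftmost dimension of `center_init` is treated as a
# batch dimension, so this single state represents 8 independent 5-dimensional searches.
batched_state = adam(center_init=torch.zeros(8, 5), center_learning_rate=0.01)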

adam_ask(state)

Get the search point stored by the given AdamState.

Parameters:

state (AdamState, required):
    The current state of the Adam optimizer.
Source code in evotorch/algorithms/functional/funcadam.py
def adam_ask(state: AdamState) -> torch.Tensor:
    """
    Get the search point stored by the given `AdamState`.

    Args:
        state: The current state of the Adam optimizer.
    Returns:
        The search point as a 1-dimensional tensor in the non-batched case,
        or as a multi-dimensional tensor if the Adam search is batched.
    """
    return state.center
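
A short sketch of retrieving the stored search point, under the same import assumption as above; the shapes noted in the comments follow from how `adam` interprets the dimensions of `center_init`.

import torch

from evotorch.algorithms.functional import adam, adam_ask

state = adam(center_init=torch.zeros(5))
x = adam_ask(state)  # 1-dimensional tensor of shape (5,): the stored center

batched_state = adam(center_init=torch.zeros(8, 5))
xb = adam_ask(batched_state)  # multi-dimensional tensor of shape (8, 5) in the batched case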

adam_tell(state, *, follow_grad)

Tell the Adam optimizer the current gradient to get its next state.

Parameters:

state (AdamState, required):
    The current state of the Adam optimizer.
follow_grad (BatchableVector, required):
    Gradient at the current point of the Adam search. Can be a 1-dimensional tensor in the non-batched case, or a multi-dimensional tensor in the batched case.
Source code in evotorch/algorithms/functional/funcadam.py
def adam_tell(state: AdamState, *, follow_grad: BatchableVector) -> AdamState:
    """
    Tell the Adam optimizer the current gradient to get its next state.

    Args:
        state: The current state of the Adam optimizer.
        follow_grad: Gradient at the current point of the Adam search.
            Can be a 1-dimensional tensor in the non-batched case,
            or a multi-dimensional tensor in the batched case.
    Returns:
        The updated state of Adam with the given gradient applied.
    """
    new_center, new_m, new_v, new_t = _adam_step(
        follow_grad,
        state.center,
        state.center_learning_rate,
        state.beta1,
        state.beta2,
        state.epsilon,
        state.m,
        state.v,
        state.t,
    )

    return AdamState(
        center=new_center,
        center_learning_rate=state.center_learning_rate,
        beta1=state.beta1,
        beta2=state.beta2,
        epsilon=state.epsilon,
        m=new_m,
        v=new_v,
        t=new_t,
    )
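
Putting the three functions together, here is a minimal ask-and-tell sketch under the same import assumption as above. The quadratic objective is illustrative, and the direction convention is an assumption: the parameter name suggests the center steps along the supplied `follow_grad`, so the gradient of an objective to be maximized is passed as-is.

import torch

from evotorch.algorithms.functional import adam, adam_ask, adam_tell

target = torch.tensor([1.0, -2.0, 3.0])
state = adam(center_init=torch.zeros(3), center_learning_rate=0.1)

for _ in range(500):
    x = adam_ask(state)  # current search point (the stored center)
    # Gradient of f(x) = -||x - target||^2, an objective to be maximized.
    # Assumption: the update follows `follow_grad`; negate the gradient
    # instead if the objective is to be minimized under this convention.
    grad = 2.0 * (target - x)
    state = adam_tell(state, follow_grad=grad)

print(adam_ask(state))  # expected to approach `target` under the assumed convention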