drytorch.lib.learning

Module containing classes that specify the learning algorithm.

Classes

LearningSchema(optimizer_cls, base_lr, ...)

Class with specifications for the learning algorithm.

class LearningSchema(optimizer_cls: type[torch.optim.optimizer.Optimizer], base_lr: float | dict[str, float], scheduler: drytorch.core.protocols.SchedulerProtocol = ConstantScheduler(), optimizer_defaults: dict[str, typing.Any] = <factory>, gradient_op: drytorch.core.protocols.GradientOpProtocol = <drytorch.lib.gradient_ops.NoOp object>)

Bases: LearningProtocol

Class with specifications for the learning algorithm.

Parameters:
optimizer_cls

the optimizer class to bind to the module.

Type:

type[torch.optim.optimizer.Optimizer]

base_lr

initial learning rate: either a single global value or a dict mapping parameter names to their own learning rates.

Type:

float | dict[str, float]

optimizer_defaults

optional keyword arguments for the optimizer.

Type:

dict[str, Any]

scheduler

modifies the learning rate given the current epoch.

Type:

drytorch.core.protocols.SchedulerProtocol

gradient_op

modifies the parameters' gradients after backward propagation.

Type:

drytorch.core.protocols.GradientOpProtocol
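The two forms of base_lr and the role of the scheduler can be illustrated with a minimal, library-free sketch. Everything here is an assumption made for illustration: build_param_groups and exponential_decay are hypothetical helpers, the rule that a dict key matches the first component of a parameter name is assumed, and the (base_lr, epoch) -> float call shape for a scheduler is assumed rather than taken from SchedulerProtocol.

```python
from __future__ import annotations

from collections.abc import Iterable


def build_param_groups(
    param_names: Iterable[str],
    base_lr: float | dict[str, float],
) -> list[dict[str, object]]:
    """Expand base_lr into optimizer-style groups (hypothetical helper)."""
    names = list(param_names)
    if isinstance(base_lr, dict):
        # Assumed rule: a key selects parameters whose first name
        # component equals it, e.g. "encoder" -> "encoder.weight".
        return [
            {"lr": lr, "params": [n for n in names if n.split(".")[0] == key]}
            for key, lr in base_lr.items()
        ]
    # A single float applies to every parameter.
    return [{"lr": float(base_lr), "params": names}]


def exponential_decay(base_lr: float, epoch: int, factor: float = 0.9) -> float:
    """Assumed scheduler shape: map (base_lr, epoch) to the current rate."""
    return base_lr * factor**epoch


names = ["encoder.weight", "encoder.bias", "head.weight"]
groups = build_param_groups(names, {"encoder": 1e-4, "head": 1e-3})
# Learning rate of each group at epoch 2 under the sketched scheduler.
rates_at_epoch_2 = [exponential_decay(g["lr"], epoch=2) for g in groups]
```

The scheduler only rescales the base rate per epoch, so per-group base rates keep their ratio throughout training in this sketch.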

classmethod adam(base_lr: float = 0.001, betas: tuple[float, float] = (0.9, 0.999), scheduler: SchedulerProtocol = ConstantScheduler(), gradient_op: GradientOpProtocol = <drytorch.lib.gradient_ops.NoOp object>) -> LearningSchema

Convenience method for the Adam optimizer.

Parameters:
  • base_lr (float) – initial learning rate.

  • betas (tuple[float, float]) – coefficients used for computing running averages of the gradient and its square.

  • scheduler (SchedulerProtocol) – modifies the learning rate given the current epoch.

  • gradient_op (GradientOpProtocol) – modifies the parameters' gradients after backward propagation.

Return type:

LearningSchema

classmethod adam_w(base_lr: float = 0.001, betas: tuple[float, float] = (0.9, 0.999), weight_decay: float = 0.01, scheduler: SchedulerProtocol = ConstantScheduler(), gradient_op: GradientOpProtocol = <drytorch.lib.gradient_ops.NoOp object>) -> LearningSchema

Convenience method for the AdamW optimizer.

Parameters:
  • base_lr (float) – initial learning rate.

  • betas (tuple[float, float]) – coefficients used for computing running averages of the gradient and its square.

  • weight_decay (float) – weight decay coefficient (decoupled from the gradient update, not an L2 penalty).

  • scheduler (SchedulerProtocol) – modifies the learning rate given the current epoch.

  • gradient_op (GradientOpProtocol) – modifies the parameters' gradients after backward propagation.

Return type:

LearningSchema
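In PyTorch, AdamW applies weight decay directly to the parameter, whereas Adam's weight_decay adds an L2 term to the gradient. The sketch below contrasts the two on a single scalar for the first optimization step. It is an illustration only: adam_step is a hypothetical function, it starts from zero moment estimates, and it is not how drytorch or PyTorch implement the optimizers internally.

```python
import math


def adam_step(
    param: float,
    grad: float,
    *,
    lr: float = 0.001,
    betas: tuple[float, float] = (0.9, 0.999),
    eps: float = 1e-8,
    weight_decay: float = 0.01,
    decoupled: bool = True,
) -> float:
    """First Adam-style step on a scalar, starting from zero moments."""
    beta1, beta2 = betas
    if decoupled:
        # AdamW style: shrink the parameter; the gradient is untouched.
        param = param * (1.0 - lr * weight_decay)
    else:
        # L2 style: fold the decay term into the gradient.
        grad = grad + weight_decay * param
    # Moment estimates after one step, followed by bias correction.
    m_hat = ((1.0 - beta1) * grad) / (1.0 - beta1)
    v_hat = ((1.0 - beta2) * grad * grad) / (1.0 - beta2)
    return param - lr * m_hat / (math.sqrt(v_hat) + eps)


decoupled_result = adam_step(1.0, 0.5, lr=0.1, decoupled=True)
l2_result = adam_step(1.0, 0.5, lr=0.1, decoupled=False)
```

With the L2 form the decay term is normalized away by the adaptive step on this first update, while the decoupled form still shrinks the parameter by lr * weight_decay.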

classmethod sgd(base_lr: float = 0.01, momentum: float = 0.0, weight_decay: float = 0.0, dampening: float = 0.0, nesterov: bool = False, scheduler: SchedulerProtocol = ConstantScheduler(), gradient_op: GradientOpProtocol = <drytorch.lib.gradient_ops.NoOp object>) -> LearningSchema

Convenience method for the SGD optimizer.

Parameters:
  • base_lr (float) – initial learning rate.

  • momentum (float) – momentum factor.

  • dampening (float) – dampening for momentum.

  • weight_decay (float) – weight decay (L2 penalty).

  • nesterov (bool) – enables Nesterov momentum.

  • scheduler (SchedulerProtocol) – modifies the learning rate given the current epoch.

  • gradient_op (GradientOpProtocol) – modifies the parameters' gradients after backward propagation.

Return type:

LearningSchema
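How momentum, dampening, weight_decay and nesterov interact can be seen in a scalar sketch of the update rule documented for PyTorch's SGD. The function sgd_step is a hypothetical name introduced for illustration; it is not part of drytorch.

```python
from __future__ import annotations


def sgd_step(
    param: float,
    grad: float,
    buf: float | None,
    *,
    lr: float = 0.01,
    momentum: float = 0.0,
    weight_decay: float = 0.0,
    dampening: float = 0.0,
    nesterov: bool = False,
) -> tuple[float, float | None]:
    """One scalar SGD update; buf is None before the first step."""
    # L2 penalty: the decay term is added to the gradient.
    grad = grad + weight_decay * param
    if momentum != 0.0:
        if buf is None:
            # The first step initializes the buffer with the raw gradient.
            buf = grad
        else:
            buf = momentum * buf + (1.0 - dampening) * grad
        # Nesterov looks ahead along the buffer; otherwise use the buffer.
        grad = grad + momentum * buf if nesterov else buf
    return param - lr * grad, buf


# Two plain-momentum steps with a constant gradient of 1.0.
p, b = sgd_step(1.0, 1.0, None, lr=0.1, momentum=0.9)
p, b = sgd_step(p, 1.0, b, lr=0.1, momentum=0.9)
```

Because the buffer accumulates past gradients, the second step moves the parameter further than the first even though the gradient is unchanged.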

classmethod r_adam(base_lr: float = 0.001, betas: tuple[float, float] = (0.9, 0.999), weight_decay: float = 0.0, scheduler: SchedulerProtocol = ConstantScheduler(), gradient_op: GradientOpProtocol = <drytorch.lib.gradient_ops.NoOp object>) -> LearningSchema

Convenience method for the RAdam optimizer.

Parameters:
  • base_lr (float) – initial learning rate.

  • betas (tuple[float, float]) – coefficients used for computing running averages of the gradient and its square.

  • weight_decay (float) – weight decay (L2 penalty).

  • scheduler (SchedulerProtocol) – modifies the learning rate given the current epoch.

  • gradient_op (GradientOpProtocol) – modifies the parameters' gradients after backward propagation.

Return type:

LearningSchema