drytorch.lib.models

Module containing classes for wrapping a torch module and its optimizer.

Functions

count_params(params)

Count the number of parameters.
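
As a rough illustration of what counting parameters amounts to, the sketch below sums the number of scalar entries over a collection of parameter shapes. It is a pure-Python stand-in, not the library's implementation: `count_elements` and the shape tuples are hypothetical names, and with real tensors one would typically sum `p.numel()` over the parameters instead.

```python
import math
from collections.abc import Iterable


def count_elements(shapes: Iterable[tuple[int, ...]]) -> int:
    """Sum the number of scalar entries over all parameter shapes."""
    # math.prod(()) == 1, so a scalar parameter counts as one element.
    return sum(math.prod(shape) for shape in shapes)


# Shapes of a hypothetical linear layer: weight (4, 3) and bias (4,).
linear_shapes = [(4, 3), (4,)]
total = count_elements(linear_shapes)
```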

Classes

Model(module[, name, device, checkpoint, ...])

Wrapper for a torch.nn.Module class with extra information.

ModelAverage(torch_module, ...)

Bundle a torch.nn.Module and a torch.optim.swa_utils.AveragedModel.

ModelOptimizer(model, learning_schema)

Bundle the module and its optimizer.

class Model(module: ModuleProtocol[Input, Output], name: str = '', device: device | None = None, checkpoint: CheckpointProtocol | None = None, mixed_precision: bool = False, should_compile: bool = True, should_distribute: bool = True)[source]

Bases: CreatedAtMixin, ModelProtocol[Input, Output]

Wrapper for a torch.nn.Module class with extra information.

module

PyTorch module to optimize.

Type:

torch.nn.modules.module.Module

epoch

the number of epochs the model has been trained for so far.

Type:

int

mixed_precision

whether to use mixed precision computing.

Type:

bool

checkpoint

checkpoint manager.

Type:

drytorch.core.protocols.CheckpointProtocol

Initialize.

The should_distribute option assumes that each process has a single accelerator and that the device for the process is already set.

Parameters:
  • module (Module) – PyTorch module with type annotations.

  • name (str) – the name of the model. Default uses the class name.

  • device (torch.device | None) – the device where to store the weights of the module. Default uses the accelerator if available, cpu otherwise.

  • checkpoint (CheckpointProtocol) – class that saves the state and optionally the optimizer.

  • mixed_precision (bool) – whether to use mixed precision computing.

  • should_compile (bool) – compile the module at instantiation (Python < 3.14).

  • should_distribute (bool) – wrap the module for data-distributed settings.

__call__(inputs: Input) Output[source]

Execute forward pass.

Parameters:

inputs (Input)

Return type:

Output

__del__()[source]

Unregister from the registry when deleted/garbage-collected.

property device: device

The device where the weights are stored.

property name: str

The name of the model.

prepare_module(module: Module) Module[source]

Compile and distribute the module.

Parameters:

module (Module)

Return type:

Module

increment_epoch() None[source]

Increment the epoch by 1.

Return type:

None

load_state(epoch=-1) None[source]

Load the weights and epoch of the model.

Return type:

None

register() None[source]

Register to the registry.

Return type:

None

save_state() None[source]

Save the weights and epoch of the model.

Return type:

None

unregister() None[source]

Unregister from the registry.

Return type:

None

update_parameters() None[source]

Update the parameters of the model.

Return type:

None

class ModelAverage(torch_module: ModuleProtocol[Input, Output], /, name: str = '', device: device | None = None, checkpoint: CheckpointProtocol = <drytorch.lib.checkpoints.LocalCheckpoint object>, mixed_precision: bool = False, avg_fn: Callable[[Tensor, Tensor, Tensor | int], Tensor] | None = None, multi_avg_fn: Callable[[tuple[Tensor, ...] | list[Tensor], tuple[Tensor, ...] | list[Tensor], Tensor | int], None] | None = None, use_buffers: bool = False)[source]

Bases: Model[Input, Output]

Bundle a torch.nn.Module and a torch.optim.swa_utils.AveragedModel.

Use the averaged model when in inference mode.

averaged_module

the averaged module.

Type:

torch.optim.swa_utils.AveragedModel

Initialize.

Parameters:
  • torch_module (p.ModuleProtocol[Input, Output]) – PyTorch module with type annotations.

  • name (str) – the name of the model. Default uses the class name.

  • device (torch.device | None) – the device where to store the weights of the module. Default uses cuda when available, cpu otherwise.

  • checkpoint (CheckpointProtocol) – class that saves the state and optionally the optimizer.

  • mixed_precision (bool) – whether to use mixed precision computing. Defaults to False.

  • avg_fn (Callable[[Tensor, Tensor, Tensor | int], Tensor] | None) – see docs at torch.optim.swa_utils.AveragedModel.

  • multi_avg_fn (Callable[[ParamList, ParamList, Tensor | int], None] | None) – see docs at torch.optim.swa_utils.AveragedModel.

  • use_buffers (bool) – see docs at torch.optim.swa_utils.AveragedModel.
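
To make the shape of an `avg_fn` concrete, here is a minimal sketch of two averaging rules that follow the three-argument signature listed above (averaged value, current value, number of values averaged so far). Plain floats stand in for tensors, and `running_mean` and `make_ema` are hypothetical names. Consult the torch.optim.swa_utils.AveragedModel documentation for the rule actually applied when `avg_fn` is None.

```python
def running_mean(averaged: float, current: float, num_averaged: int) -> float:
    """Equal-weight running average of all values seen so far."""
    return averaged + (current - averaged) / (num_averaged + 1)


def make_ema(decay: float):
    """Build an exponential-moving-average rule with the given decay."""
    def ema(averaged: float, current: float, num_averaged: int) -> float:
        # num_averaged is unused here but kept to match the signature.
        return decay * averaged + (1.0 - decay) * current
    return ema


# Feed three successive parameter values through the running mean.
values = [1.0, 2.0, 6.0]
avg = values[0]
for n, value in enumerate(values[1:], start=1):
    avg = running_mean(avg, value, n)

# One exponential-moving-average step with decay 0.9.
ema_step = make_ema(0.9)(1.0, 2.0, 1)
```

The running mean reproduces the arithmetic mean of the values seen, while the exponential rule weights recent values more heavily as the decay decreases.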

__call__(inputs: Input) Output[source]

Execute the forward pass.

Parameters:

inputs (Input)

Return type:

Output

update_parameters() None[source]

Update the parameters of the model.

Return type:

None

class ModelOptimizer(model: ModelProtocol[Input, Output], learning_schema: LearningProtocol)[source]

Bases: object

Bundle the module and its optimizer.

It supports different learning rates for separate parameter groups.

Initialize.

Parameters:
  • model (p.ModelProtocol[Input, Output]) – the model to be optimized.

  • learning_schema (p.LearningProtocol) – the learning scheme for the optimizer.

property base_lr: float | dict[str, float]

Learning rate(s) for the module parameters.

Raises:

MissingParamError – if parameters are missing from the dictionary.
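
The float-or-dictionary form of the learning rate can be sketched as follows. This is an illustration under stated assumptions, not the library's code: it assumes the dictionary keys name parameter groups, and `resolve_learning_rates`, `MissingEntryError`, and the group names are hypothetical stand-ins.

```python
from typing import Union


class MissingEntryError(KeyError):
    """Hypothetical stand-in for the library's MissingParamError."""


def resolve_learning_rates(
    base_lr: Union[float, dict[str, float]],
    group_names: list[str],
) -> dict[str, float]:
    """Return one learning rate per named parameter group."""
    if isinstance(base_lr, dict):
        missing = [name for name in group_names if name not in base_lr]
        if missing:
            raise MissingEntryError(f'no learning rate for: {missing}')
        return {name: base_lr[name] for name in group_names}
    # A single number applies to every group.
    return {name: float(base_lr) for name in group_names}


groups = ['encoder', 'head']
uniform = resolve_learning_rates(1e-3, groups)
per_group = resolve_learning_rates({'encoder': 1e-4, 'head': 1e-2}, groups)
```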

get_opt_params() list[_OptParams][source]

Actual learning rates for each parameter, updated according to the scheduler.

Return type:

list[_OptParams]

get_scheduled_lr(lr: float) float[source]

Update the base learning rate according to the scheduler.

Parameters:

lr (float) – base learning rate.

Return type:

float
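
A minimal sketch of how a scheduler might turn a base learning rate into the rate for the current epoch. It assumes, without confirmation from this page, that a scheduler can be modeled as a callable taking the base rate and the epoch. `Scheduler`, `exponential_decay`, and `scheduled_lr` are hypothetical names.

```python
from collections.abc import Callable

# Assumed shape of a scheduler: (base learning rate, epoch) -> learning rate.
Scheduler = Callable[[float, int], float]


def exponential_decay(factor: float) -> Scheduler:
    """Return a scheduler that multiplies the base rate by factor**epoch."""
    def schedule(base_lr: float, epoch: int) -> float:
        return base_lr * factor**epoch
    return schedule


def scheduled_lr(lr: float, epoch: int, scheduler: Scheduler) -> float:
    """Apply the scheduler to a base learning rate at a given epoch."""
    return scheduler(lr, epoch)


halving = exponential_decay(0.5)
lr_at_start = scheduled_lr(0.1, 0, halving)
lr_after_three = scheduled_lr(0.1, 3, halving)
```

In this sketch the epoch is passed explicitly; the method above takes only `lr`, so the epoch presumably comes from the wrapped model.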

load(epoch: int = -1) None[source]

Load model and optimizer state from a checkpoint.

Parameters:

epoch (int)

Return type:

None

update_learning_rate(base_lr: float | dict[str, float] | None = None, scheduler: SchedulerProtocol | None = None) None[source]

Recalculate the learning rates for the current epoch.

It updates the learning rate of each parameter group in the optimizer based on the input learning rate(s) and scheduler.

Parameters:
  • base_lr (float | dict[str, float] | None) – initial learning rates for named parameters or global value. Default keeps the original learning rates.

  • scheduler (SchedulerProtocol | None) – scheduler for the learning rates. Default keeps the original scheduler.

Return type:

None
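
The "None keeps the original" behavior described above can be sketched with a small stand-in class. Everything here is hypothetical (`LrState`, its attributes, and the explicit `epoch` argument). The list of dictionaries only mimics the layout of an optimizer's parameter groups, each with an `'lr'` entry.

```python
from typing import Callable, Optional


class LrState:
    """Hypothetical holder for base rates, a scheduler, and group dicts."""

    def __init__(self, base_lr: dict, scheduler: Callable[[float, int], float]):
        self.base_lr = dict(base_lr)
        self.scheduler = scheduler
        # One dict per group, mimicking an optimizer's parameter groups.
        self.param_groups = [{'name': n, 'lr': lr} for n, lr in base_lr.items()]

    def update(self, epoch: int, base_lr: Optional[dict] = None,
               scheduler: Optional[Callable[[float, int], float]] = None) -> None:
        """Recompute each group's rate; None keeps the stored setting."""
        if base_lr is not None:
            self.base_lr = dict(base_lr)
        if scheduler is not None:
            self.scheduler = scheduler
        for group in self.param_groups:
            group['lr'] = self.scheduler(self.base_lr[group['name']], epoch)


state = LrState({'encoder': 0.1, 'head': 1.0}, lambda lr, epoch: lr / (1 + epoch))
state.update(epoch=1)  # both arguments left as None: stored settings are reused
rates_epoch_1 = [g['lr'] for g in state.param_groups]
state.update(epoch=1, base_lr={'encoder': 0.2, 'head': 1.0})
rates_new_base = [g['lr'] for g in state.param_groups]
```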

optimize(loss_value: Tensor)[source]

Optimize the model by backpropagating the loss value.

Parameters:

loss_value (Tensor) – the output tensor for the loss.

save() None[source]

Save model and optimizer state in a checkpoint.

Return type:

None