flambe.optim

Package Contents

class flambe.optim.LRScheduler[source]

Bases: torch.optim.lr_scheduler._LRScheduler, flambe.compile.Component

state_dict(self)
class flambe.optim.LambdaLR[source]

Bases: torch.optim.lr_scheduler.LambdaLR, flambe.compile.Component

state_dict(self)
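
A minimal usage sketch based on the underlying torch.optim.lr_scheduler.LambdaLR interface, which this class wraps (the flambe class additionally mixes in flambe.compile.Component); the optimizer choice and decay factor below are illustrative assumptions:

    import torch

    model = torch.nn.Linear(4, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    # The scheduler multiplies the optimizer's base lr by the factor
    # returned by lr_lambda for the current step.
    scheduler = torch.optim.lr_scheduler.LambdaLR(
        optimizer, lr_lambda=lambda step: 0.95 ** step
    )

    for _ in range(3):
        optimizer.step()
        scheduler.step()  # typically called once per training batch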
class flambe.optim.NoamScheduler(optimizer, warmup: int, d_model: int)[source]

Bases: flambe.optim.scheduler.LambdaLR

Linear warmup and then inverse square root decay.

Linearly increases the learning rate over the first warmup steps, then decays it proportionally to the inverse square root of the step number, scaled by d_model ** -0.5.

This scheduler is typically stepped after every training batch.

lr_lambda(self, step: int)

Compute the learning rate factor.

Parameters: step (int) – The current step; may count training or validation steps.
Returns: The output factor.
Return type: float
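
For reference, a sketch of the standard Noam factor from "Attention Is All You Need"; the exact step offset and zero-step handling in flambe's lr_lambda are assumptions here:

    def noam_factor(step: int, warmup: int, d_model: int) -> float:
        # Linear ramp for the first `warmup` steps, then decay
        # proportional to the inverse square root of the step number,
        # all scaled by d_model ** -0.5.
        step = max(step, 1)  # guard against step 0 (assumed handling)
        return d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)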
class flambe.optim.WarmupLinearScheduler(optimizer, warmup: int, n_steps: int)[source]

Bases: flambe.optim.scheduler.LambdaLR

Linear warmup and then linear decay.

Linearly increases the learning rate from 0 to 1 over the first warmup steps, then linearly decreases it from 1 to 0 over the remaining n_steps - warmup steps.

This scheduler is typically stepped after every training batch.

lr_lambda(self, step: int)

Compute the learning rate factor.

Parameters: step (int) – The current step; may count training or validation steps.
Returns: The output factor.
Return type: float
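
A sketch of the factor described by the docstring above; the clamping at the boundaries is an assumption:

    def warmup_linear_factor(step: int, warmup: int, n_steps: int) -> float:
        # Ramp the factor from 0 to 1 over the first `warmup` steps.
        if step < warmup:
            return step / max(1, warmup)
        # Then decay from 1 to 0 over the remaining n_steps - warmup steps.
        return max(0.0, (n_steps - step) / max(1, n_steps - warmup))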
class flambe.optim.RAdam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0, degenerated_to_sgd=True)[source]

Bases: torch.optim.optimizer.Optimizer, flambe.compile.Component

Rectified Adam optimizer.

Implementation taken from https://github.com/LiyuanLucasLiu/RAdam.

__setstate__(self, state)
state_dict(self)
step(self, closure=None)
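
A minimal usage sketch: RAdam follows the standard torch.optim.Optimizer interface, so it drops in wherever Adam would be used. The model and data below are illustrative assumptions:

    import torch
    from flambe.optim import RAdam

    model = torch.nn.Linear(4, 2)
    optimizer = RAdam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))

    x, y = torch.randn(8, 4), torch.randn(8, 2)
    loss = torch.nn.functional.mse_loss(model(x), y)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()  # closure is optional, as with other torch optimizers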