flambe.optim.noam

Module Contents

class flambe.optim.noam.NoamScheduler(optimizer, warmup: int, d_model: int)

Bases: flambe.optim.scheduler.LambdaLR

Linear warmup followed by inverse square-root decay.

Linearly increases the learning rate factor from 0 over the first warmup steps, then decreases it in proportion to the inverse square root of the step, as in the standard Noam schedule.

This scheduler is generally stepped after every training batch.
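For orientation, the schedule usually called "Noam" in the Transformer literature is written as below. That this class follows the formula term by term is an assumption based on its name and its warmup and d_model arguments, not something this page states.

```latex
\mathrm{factor}(step) = d_{\mathrm{model}}^{-0.5} \cdot
    \min\left( step^{-0.5},\; step \cdot warmup^{-1.5} \right)
```

Under this form the two branches meet at step = warmup, where the factor peaks at 1 / sqrt(d_model * warmup).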

lr_lambda(self, step: int)

Compute the learning rate factor.

Parameters: step (int) – The current step. Could be training or validation steps.
Returns: The output factor.
Return type: float
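The following is a minimal, standalone sketch of how a factor like the one returned by lr_lambda could be computed and applied. The function noam_factor is a hypothetical stand-in, not flambe code: it assumes the standard Noam formula and that the scheduler multiplies the optimizer's base learning rate by the returned factor, as LambdaLR-style schedulers typically do.

```python
import math


def noam_factor(step: int, warmup: int, d_model: int) -> float:
    """Hypothetical stand-in for lr_lambda, assuming the standard Noam formula."""
    if warmup == 0:
        # No warmup phase: inverse square-root decay only.
        # Step 0 is treated as step 1 to avoid dividing by zero (an assumption).
        return 1.0 / math.sqrt(d_model * max(step, 1))
    if step <= warmup:
        # Linear warmup from 0 up to the peak reached at step == warmup.
        return step / (math.sqrt(d_model) * warmup ** 1.5)
    # After warmup: decay proportional to 1 / sqrt(step).
    return 1.0 / math.sqrt(d_model * step)


# Assumed usage: effective learning rate = base learning rate * factor.
base_lr = 1.0
for step in (0, 2000, 4000, 16000):
    print(step, base_lr * noam_factor(step, warmup=4000, d_model=512))
```

With warmup=4000, the factor at step 2000 is half the peak value, and it falls back to half the peak again at step 16000, since the decay follows 1 / sqrt(step).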