flambe.optim.noam
Module Contents
class flambe.optim.noam.NoamScheduler(optimizer, warmup: int, d_model: int)
Bases: flambe.optim.scheduler.LambdaLR
Linear warmup and then quadratic decay.
Linearly increases the learning rate from 0 to 1 over the first warmup steps, then decreases it quadratically.
This scheduler is generally stepped after every training batch, not once per epoch.
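This page does not give the formula behind the schedule. As a point of reference only, the sketch below computes the learning-rate factor of the Noam schedule as published in "Attention Is All You Need", where the factor grows linearly during warmup and then decays in proportion to the inverse square root of the step. Whether NoamScheduler computes exactly this factor is an assumption, and the function name noam_factor and the example settings are hypothetical.

```python
def noam_factor(step: int, warmup: int, d_model: int) -> float:
    """Learning-rate factor of the published Noam schedule.

    Assumption: this mirrors the formula from the Transformer paper,
    not necessarily the exact computation inside NoamScheduler.
    """
    if step <= 0:
        # No update has happened yet, so the factor starts at zero.
        return 0.0
    if warmup <= 0:
        # Without warmup only the inverse square-root decay remains.
        return d_model ** -0.5 * step ** -0.5
    # Linear ramp while step < warmup, inverse square-root decay afterwards.
    return d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)


# Hypothetical settings chosen only for illustration.
warmup, d_model = 4000, 512
factors = [noam_factor(s, warmup, d_model) for s in (1, 2000, 4000, 8000, 16000)]
```

The factor peaks at step == warmup. Because it depends on the batch count and not the epoch count, a schedule of this shape is advanced once per training batch.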