flambe.nn.embedding

Module Contents

class flambe.nn.embedding.Embeddings(num_embeddings: int, embedding_dim: int, padding_idx: int = 0, max_norm: Optional[float] = None, norm_type: float = 2.0, scale_grad_by_freq: bool = False, sparse: bool = False, positional_encoding: bool = False, positional_learned: bool = False, positonal_max_length: int = 5000)[source]

Bases: flambe.nn.module.Module

Implement an Embeddings module.

This object replicates the usage of nn.Embedding but registers the from_pretrained classmethod to be used inside a Flambé configuration, as this does not happen automatically during the registration of PyTorch objects.

The module also adds optional positional encoding, which can either be sinusoidal or learned during training. For the non-learned positional embeddings, we use sine and cosine functions of different frequencies.

\[\text{PosEncoder}(pos, 2i) = \sin\left(pos / 10000^{2i/d_{model}}\right)\]
\[\text{PosEncoder}(pos, 2i+1) = \cos\left(pos / 10000^{2i/d_{model}}\right)\]

where \(pos\) is the word position and \(i\) is the embedding index.
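For reference, the sinusoidal table described by these formulas can be built directly in PyTorch. This is a minimal sketch; the helper name and arguments are illustrative and not part of the flambe API:

    import math
    import torch

    def sinusoidal_encoding(max_length: int, d_model: int) -> torch.Tensor:
        # Build a [max_length x d_model] table of frozen positional encodings.
        position = torch.arange(max_length, dtype=torch.float).unsqueeze(1)
        # 1 / 10000^(2i / d_model) for each even embedding index 2i
        div_term = torch.exp(
            torch.arange(0, d_model, 2, dtype=torch.float)
            * (-math.log(10000.0) / d_model)
        )
        table = torch.zeros(max_length, d_model)
        table[:, 0::2] = torch.sin(position * div_term)  # even indices
        table[:, 1::2] = torch.cos(position * div_term)  # odd indices
        return table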
classmethod from_pretrained(cls, embeddings: Tensor, freeze: bool = True, padding_idx: int = 0, max_norm: Optional[float] = None, norm_type: float = 2.0, scale_grad_by_freq: bool = False, sparse: bool = False, positional_encoding: bool = False, positional_learned: bool = False, positonal_max_length: int = 5000, positonal_embeddings: Optional[Tensor] = None, positonal_freeze: bool = True)[source]

Create an Embeddings instance from pretrained embeddings.

Parameters:
  • embeddings (torch.Tensor) – FloatTensor containing weights for the Embedding. The first dimension is passed to Embedding as num_embeddings, the second as embedding_dim.
  • freeze (bool) – If True, the tensor does not get updated in the learning process. Default: True
  • padding_idx (int, optional) – Pads the output with the embedding vector at padding_idx (initialized to zeros) whenever it encounters the index. Default: 0
  • max_norm (Optional[float], optional) – If given, each embedding vector with norm larger than max_norm is normalized to have norm max_norm.
  • norm_type (float, optional) – The p of the p-norm to compute for the max_norm option. Default: 2.0
  • scale_grad_by_freq (bool, optional) – If given, this will scale gradients by the inverse of the frequency of the words in the mini-batch. Default: False
  • sparse (bool, optional) – If True, the gradient w.r.t. the weight matrix will be a sparse tensor. See Notes for more details.
  • positional_encoding (bool, optional) – If True, adds positional encoding to the token embeddings. By default, the positional embeddings are frozen sinusoidal embeddings; to learn them during training, set positional_learned. Default: False
  • positional_learned (bool, optional) – Learns the positional embeddings during training instead of using frozen sinusoidal ones. Default: False
  • positonal_embeddings (torch.Tensor, optional) – If given, also replaces the positional embeddings with this matrix. The max length will be ignored and replaced by the first dimension of this matrix.
  • positonal_freeze (bool, optional) – Whether the positional embeddings should be frozen. Default: True
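A minimal usage sketch for from_pretrained. The weight matrix below is a random stand-in for real pretrained vectors (e.g. loaded from a GloVe file), and the import follows the module path of this page:

    import torch
    from flambe.nn.embedding import Embeddings

    # Random stand-in for a pretrained matrix: 100 tokens, 50 dimensions
    weights = torch.randn(100, 50)

    emb = Embeddings.from_pretrained(
        weights,
        freeze=True,               # keep the pretrained vectors fixed
        positional_encoding=True,  # add frozen sinusoidal position embeddings
    )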
forward(self, data: Tensor)[source]

Perform a forward pass.

Parameters: data (Tensor) – The input tensor of shape [S x B]
Returns: The output tensor of shape [S x B x E]
Return type: Tensor
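A shape check matching this contract (a minimal sketch):

    import torch
    from flambe.nn.embedding import Embeddings

    emb = Embeddings(num_embeddings=100, embedding_dim=50)
    data = torch.randint(0, 100, (20, 16))  # [S x B]: 20 steps, batch of 16
    out = emb(data)                         # [S x B x E]: (20, 16, 50)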
class flambe.nn.embedding.Embedder(embedding: Module, encoder: Module, pooling: Optional[Module] = None, embedding_dropout: float = 0, padding_idx: Optional[int] = 0, return_mask: bool = False)[source]

Bases: flambe.nn.module.Module

Implements an Embedder module.

An Embedder takes as input a sequence of index tokens and computes the corresponding embedded representations and a padding mask. The embedding layer may be initialized using a pretrained embedding matrix.
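A minimal composition sketch. It assumes that flambe.nn re-exports these classes and that RNNEncoder accepts the constructor arguments shown; treat both as assumptions rather than guaranteed API:

    import torch
    from flambe.nn import Embedder, Embeddings, RNNEncoder  # assumed exports

    embedder = Embedder(
        embedding=Embeddings(num_embeddings=100, embedding_dim=50, padding_idx=0),
        encoder=RNNEncoder(input_size=50, hidden_size=128, rnn_type='lstm'),
        embedding_dropout=0.1,
    )

    data = torch.randint(0, 100, (20, 16))  # [S x B] token indices
    output = embedder(data)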

embeddings

The embedding module

Type: Module
encoder

The sub-encoder that this object is wrapping

Type: Module
pooling

An optional pooling module

Type: Module
drop

The dropout layer

Type: nn.Dropout
forward(self, data: Tensor)[source]

Performs a forward pass through the network.

Parameters: data (torch.Tensor) – The input data, as a long tensor of token indices of shape [S x B]
Returns: The encoded output, as a float tensor. May return a state if the encoder is an RNN and no pooling is provided. May also return a tuple containing the padding mask if return_mask was passed in as a constructor argument.
Return type: Union[Tensor, Tuple[Tensor, Tensor], Tuple[Tuple[Tensor, Tensor], Tensor]]
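Because the return type depends on the constructor arguments, callers unpack accordingly. A sketch assuming return_mask=True and a pooling module, so the result is a (pooled output, mask) pair; the LastPooling import path and RNNEncoder arguments are assumptions, not confirmed API:

    import torch
    from flambe.nn import Embedder, Embeddings, RNNEncoder  # assumed exports
    from flambe.nn.pooling import LastPooling                # assumed module

    embedder = Embedder(
        embedding=Embeddings(num_embeddings=100, embedding_dim=50),
        encoder=RNNEncoder(input_size=50, hidden_size=128, rnn_type='lstm'),
        pooling=LastPooling(),
        return_mask=True,
    )

    data = torch.randint(0, 100, (20, 16))  # [S x B]
    encoded, mask = embedder(data)          # pooled output and padding mask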