flambe.nlp.language_modeling.sampler
Module Contents
-
class flambe.nlp.language_modeling.sampler.CorpusSampler(batch_size: int = 128, unroll_size: int = 128, n_workers: int = 0, pin_memory: bool = False, downsample: Optional[float] = None, drop_last: bool = True)
Bases: flambe.sampler.sampler.Sampler
Implement a CorpusSampler object.
This object is useful for iterating over a large corpus of text in order. It takes as input a dataset containing a single example, which holds the full sequence of tokens.
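To make "iterating in order" concrete, here is a minimal pure-Python sketch of the batching scheme commonly used for language modeling. It is not the flambe implementation: the helper name `ordered_windows` is hypothetical, and the meaning given to `batch_size` (number of parallel rows) and `unroll_size` (tokens per row per batch) is an assumption based on the parameter names in the signature above.

```python
from typing import Iterator, List, Tuple

Batch = Tuple[List[List[int]], List[List[int]]]


def ordered_windows(tokens: List[int], batch_size: int, unroll_size: int) -> Iterator[Batch]:
    """Hypothetical helper: yield (source, target) windows in corpus order."""
    # Reserve one token so every source position has a next-token target.
    row_len = (len(tokens) - 1) // batch_size
    # Split the corpus into batch_size contiguous rows.
    sources = [tokens[i * row_len:(i + 1) * row_len] for i in range(batch_size)]
    # Targets are the same rows shifted one token to the right.
    targets = [tokens[i * row_len + 1:(i + 1) * row_len + 1] for i in range(batch_size)]
    # Walk along the rows in steps of unroll_size, preserving order.
    for start in range(0, row_len, unroll_size):
        src = [row[start:start + unroll_size] for row in sources]
        tgt = [row[start:start + unroll_size] for row in targets]
        yield src, tgt


if __name__ == "__main__":
    for src, tgt in ordered_windows(list(range(11)), batch_size=2, unroll_size=3):
        print(src, tgt)
```

In this sketch the final window may be shorter than `unroll_size`; whether `drop_last` discards such a window in CorpusSampler is not stated on this page.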
-
static collate_fn(data: Sequence[Tuple[Tensor, Tensor]])
Create a batch from data.
Parameters: data (Sequence[Tuple[Tensor, Tensor]]) – List of (source, target) tuples.
Returns: Source and target Tensors.
Return type: Tuple[Tensor, Tensor]
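The shape of this operation, a list of (source, target) pairs turned into one source batch and one target batch, can be sketched without tensors. The function `collate_pairs` below is a hypothetical stand-in that uses nested lists; it only illustrates the regrouping and makes no claim about how the real method stacks or orders tensor dimensions.

```python
from typing import List, Sequence, Tuple


def collate_pairs(data: Sequence[Tuple[List[int], List[int]]]) -> Tuple[List[List[int]], List[List[int]]]:
    """Hypothetical stand-in: regroup (source, target) pairs into two batches."""
    # zip(*data) transposes the list of pairs into (all sources, all targets).
    sources, targets = zip(*data)
    return list(sources), list(targets)


if __name__ == "__main__":
    pairs = [([1, 2], [2, 3]), ([4, 5], [5, 6])]
    print(collate_pairs(pairs))
```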
-
sample(self, data: Sequence[Sequence[Tensor]], n_epochs: int = 1)
Sample from the list of features and yield batches.
Parameters:
- data (Sequence[Sequence[Tensor, ..]]) – The input data to sample from.
- n_epochs (int, optional) – The number of epochs to run in the output iterator. Use -1 to run indefinitely.
Yields: Iterator[Tuple[Tensor]] – A batch of data, as a tuple of Tensors.
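The `n_epochs` convention described above (a positive count, or -1 for an endless iterator) can be illustrated with a small generator. This is a sketch under assumptions, not flambe code: `repeat_epochs` and `make_batches` are hypothetical names, and the batches are plain integers standing in for tuples of Tensors.

```python
import itertools
from typing import Callable, Iterable, Iterator


def repeat_epochs(make_batches: Callable[[], Iterable[int]], n_epochs: int = 1) -> Iterator[int]:
    """Hypothetical helper: replay one epoch of batches n_epochs times."""
    # -1 means "never stop"; otherwise run a fixed number of epochs.
    epochs = itertools.count() if n_epochs == -1 else range(n_epochs)
    for _ in epochs:
        # Build a fresh pass over the data for each epoch.
        yield from make_batches()


if __name__ == "__main__":
    one_epoch = lambda: [1, 2]
    print(list(repeat_epochs(one_epoch, n_epochs=2)))
    # With -1 the iterator is endless, so the caller must bound it.
    print(list(itertools.islice(repeat_epochs(one_epoch, n_epochs=-1), 5)))
```

When the iterator is endless, the consumer decides when to stop, for example with `itertools.islice` or a step counter in the training loop.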
-
static