flambe.sampler.base

Module Contents

flambe.sampler.base._bfs(obs: List, obs_idx: int) → Tuple[Dict[int, List], Set[Tuple[int, ...]]]

Run a breadth-first search (BFS) over a single obs, itself a nested list.

This function enumerates:

  1. The lengths of each of the intermediary lists, by depth
  2. All paths to the child nodes
Parameters:
  • obs (List) – A nested list of lists of arbitrary depth, with the child nodes, i.e. deepest list elements, as `torch.Tensor`s
  • obs_idx (int) – The index of obs in the batch.
Returns:

  • Dict[int, List[int]] – A map containing the lengths of all intermediary lists, by depth
  • Set[Tuple[int, ...]] – A set of all distinct paths to all children
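
The traversal described above can be sketched as follows. This is an illustration only, not the library's implementation: the name `bfs_sketch` is hypothetical, and plain Python lists of ints stand in for the 1-D `torch.Tensor` leaves so that the example runs without PyTorch. Each path starts with `obs_idx`, which is an assumption about how paths are keyed within a batch.

```python
from collections import defaultdict, deque
from typing import Dict, List, Set, Tuple


def bfs_sketch(obs: list, obs_idx: int) -> Tuple[Dict[int, List[int]], Set[Tuple[int, ...]]]:
    """Collect list lengths by depth and the paths to every leaf sequence."""
    lens: Dict[int, List[int]] = defaultdict(list)
    paths: Set[Tuple[int, ...]] = set()
    # Each queue entry is (path so far, depth, item at that path).
    queue = deque([((obs_idx,), 0, obs)])
    while queue:
        path, depth, item = queue.popleft()
        lens[depth].append(len(item))
        if item and all(not isinstance(child, list) for child in item):
            # A list of scalars is a leaf (the stand-in for a 1-D tensor).
            paths.add(path)
        else:
            for i, child in enumerate(item):
                queue.append((path + (i,), depth + 1, child))
    return dict(lens), paths


# Hypothetical observation: three sequences of lengths 3, 2 and 4.
obs = [[5, 6, 7], [4, 5], [5, 6, 7, 8]]
lens, paths = bfs_sketch(obs, obs_idx=1)
```

Here `lens` maps depth 0 to the number of sequences in the observation and depth 1 to the individual sequence lengths, while `paths` holds one index tuple per leaf.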

flambe.sampler.base._batch_from_nested_col(col: Tuple, pad: int) → torch.Tensor

Compose a batch padded to the max-size along each dimension.

Parameters:
  • col (Tuple) – A nested list of lists of arbitrary depth, with the child nodes, i.e. deepest list elements, as `torch.Tensor`s
  • pad (int) – The padding index

For example, a col might be:

    [
        [torch.Tensor([1, 2]), torch.Tensor([3, 4, 5])],
        [torch.Tensor([5, 6, 7]), torch.Tensor([4, 5]), torch.Tensor([5, 6, 7, 8])]
    ]

Level 1 sizes: [2, 3]
Level 2 sizes: [2, 3]; [3, 2, 4]

The max-sizes along each dimension are:

  • Dim 1: 3
  • Dim 2: 4

As such, since this column contains 2 elements, with max-sizes 3 and 4 along the nested dimensions, the resulting batch would have size (2, 3, 4), and the padded `Tensor`s would be inserted at their respective locations.

Returns: A (n+1)-dimensional torch.Tensor, where n is the nesting depth, padded to the max-size along each dimension
Return type: torch.Tensor
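
As a rough illustration of the padding step, the sketch below pads the example column to a rectangular layout ordered as (number of elements, max size at level 1, max size at level 2). It is not the library's code: `pad_nested_col` is a hypothetical name, it handles exactly two nesting levels rather than arbitrary depth, and nested Python lists stand in for tensors so that it runs without PyTorch.

```python
from typing import List


def pad_nested_col(col: List[List[List[int]]], pad: int) -> List[List[List[int]]]:
    """Pad a column with two nesting levels to a rectangular layout."""
    d1 = max(len(obs) for obs in col)                 # max sequences per element
    d2 = max(len(seq) for obs in col for seq in obs)  # max sequence length
    # Start from a batch filled entirely with the padding value.
    batch = [[[pad] * d2 for _ in range(d1)] for _ in range(len(col))]
    for i, obs in enumerate(col):
        for j, seq in enumerate(obs):
            # Overwrite the leading positions with the real values.
            batch[i][j][:len(seq)] = seq
    return batch


col = [
    [[1, 2], [3, 4, 5]],
    [[5, 6, 7], [4, 5], [5, 6, 7, 8]],
]
batch = pad_nested_col(col, pad=0)
shape = (len(batch), len(batch[0]), len(batch[0][0]))
```

The first element has only two sequences, so its third row consists entirely of padding.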

flambe.sampler.base.collate_fn(data: List[Tuple[torch.Tensor, ...]], pad: int) → Tuple[torch.Tensor, ...]

Turn a list of examples into a mini-batch.

Pads both simple sequences and nested sequences on the fly.

Parameters:
  • data (List[Tuple[torch.Tensor, ...]]) – The list of sampled examples. Each example is a tuple, with each element representing a column from the original dataset
  • pad (int) – The padding index
Returns:

The output batch of tensors

Return type:

Tuple[torch.Tensor, ...]
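
The general shape of such a collate step can be sketched as below, under simplifying assumptions: `collate_sketch` is a hypothetical name, only simple (non-nested) sequences are handled, and Python lists stand in for tensors. The idea is to regroup the examples by column and then pad each column to its longest sequence.

```python
from typing import List, Tuple


def collate_sketch(data: List[Tuple[List[int], ...]], pad: int) -> Tuple[List[List[int]], ...]:
    """Turn a list of examples into one padded batch per column."""
    columns = zip(*data)  # group the i-th field of every example together
    batch = []
    for col in columns:
        max_len = max(len(seq) for seq in col)
        # Right-pad every sequence in the column to the same length.
        batch.append([list(seq) + [pad] * (max_len - len(seq)) for seq in col])
    return tuple(batch)


# Hypothetical dataset with two columns: token ids and a label.
data = [([1, 2, 3], [0]), ([4, 5], [1]), ([6], [0])]
tokens, labels = collate_sketch(data, pad=0)
```

Columns whose sequences already share a length, such as `labels` here, come out unchanged.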

class flambe.sampler.base.BaseSampler(batch_size: int = 64, shuffle: bool = True, pad_index: Union[int, Sequence[int]] = 0, n_workers: int = 0, pin_memory: bool = False, seed: Optional[int] = None, downsample: Optional[float] = None, downsample_max_samples: Optional[int] = None, downsample_seed: Optional[int] = None, drop_last: bool = False)

Bases: flambe.sampler.sampler.Sampler

Implements a BaseSampler object.

This is the most basic implementation of a sampler. It uses PyTorch’s DataLoader object internally, and offers the possibility to override how examples are sampled and how a batch is formed from them.

sample(self, data: Sequence[Sequence[torch.Tensor]], n_epochs: int = 1)

Sample from the list of features and yield batches.

Parameters:
  • data (Sequence[Sequence[torch.Tensor]]) – The input data to sample from
  • n_epochs (int, optional) – The number of epochs to run in the output iterator. Use -1 to run infinitely.
Yields:

Iterator[Tuple[Tensor]] – A batch of data, as a tuple of Tensors
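
To make the epoch semantics concrete, here is a minimal sketch of a generator that runs for `n_epochs` passes, or indefinitely when `n_epochs` is -1. It is an assumption-laden stand-in rather than the class's actual logic: `sample_sketch` is a hypothetical name, it yields raw examples instead of collated tensors, and it uses Python's `random` module where the real sampler relies on a DataLoader.

```python
import random
from typing import Iterator, List, Optional, Sequence


def sample_sketch(data: Sequence, batch_size: int, n_epochs: int = 1,
                  shuffle: bool = True, seed: Optional[int] = None,
                  drop_last: bool = False) -> Iterator[List]:
    """Yield batches of examples for n_epochs passes (-1 means forever)."""
    rng = random.Random(seed)
    epoch = 0
    while n_epochs == -1 or epoch < n_epochs:
        indices = list(range(len(data)))
        if shuffle:
            rng.shuffle(indices)  # new order each epoch
        for start in range(0, len(indices), batch_size):
            chunk = indices[start:start + batch_size]
            if drop_last and len(chunk) < batch_size:
                continue  # skip the incomplete final batch
            yield [data[i] for i in chunk]
        epoch += 1


batches = list(sample_sketch(list(range(10)), batch_size=4, n_epochs=2, shuffle=False))
```

With `n_epochs=-1` the generator never terminates, so callers need to stop consuming it themselves, for example with `itertools.islice`.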

length(self, data: Sequence[Sequence[torch.Tensor]])

Return the number of batches in the sampler.

Parameters: data (Sequence[Sequence[torch.Tensor]]) – The input data to sample from
Returns: The number of batches that would be created per epoch
Return type: int
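
The per-epoch batch count can be worked out as in the sketch below. It assumes the usual DataLoader convention, where the final incomplete batch is kept unless `drop_last` is set, and it ignores downsampling. The helper name `n_batches` is hypothetical.

```python
import math


def n_batches(n_examples: int, batch_size: int, drop_last: bool = False) -> int:
    """Number of batches per epoch for a dataset of n_examples items."""
    if drop_last:
        # The incomplete final batch is discarded.
        return n_examples // batch_size
    # The incomplete final batch counts as a batch.
    return math.ceil(n_examples / batch_size)


kept = n_batches(10, batch_size=4)
dropped = n_batches(10, batch_size=4, drop_last=True)
```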