_bfs(obs: List, obs_idx: int) → Tuple[Dict[int, List], Set[Tuple[int, ...]]]
Given a single obs, itself a nested list, run BFS.
This function enumerates:
- The lengths of each of the intermediary lists, by depth
- All paths to the child nodes
Parameters:
- obs (List) – A nested list of lists of arbitrary depth, with the child nodes, i.e. deepest list elements, as `torch.Tensor`s
- obs_idx (int) – The index of obs in the batch.
Returns:
- Dict[int, List[int]] – A map containing the lengths of all intermediary lists, by depth
- Set[Tuple[int, ...]] – A set of all distinct paths to all children
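The traversal described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: `bfs_sketch` is a hypothetical name, anything that is not a `list` is treated as a child node (so tuples stand in for `torch.Tensor`s here), and only the lengths of lists are recorded, not the lengths of the child nodes themselves.

```python
from collections import deque
from typing import Any, Dict, List, Set, Tuple


def bfs_sketch(obs: List[Any], obs_idx: int) -> Tuple[Dict[int, List[int]], Set[Tuple[int, ...]]]:
    """Hypothetical sketch: breadth-first walk over a nested list."""
    level_lens: Dict[int, List[int]] = {}
    paths: Set[Tuple[int, ...]] = set()
    # Each queue entry is (node, path to node, depth).
    # Every path starts with the index of the observation in the batch.
    queue = deque([(obs, (obs_idx,), 0)])
    while queue:
        node, path, depth = queue.popleft()
        # Record the length of this intermediary list at its depth.
        level_lens.setdefault(depth, []).append(len(node))
        for i, child in enumerate(node):
            if isinstance(child, list):
                queue.append((child, path + (i,), depth + 1))
            else:
                # Assumption: any non-list element is a child node (leaf).
                paths.add(path + (i,))
    return level_lens, paths


# Tuples stand in for tensors in this dependency-free example.
lens, paths = bfs_sketch([[(1, 2), (3, 4, 5)], [(6,)]], 1)
```

Because the observation index is the first entry of every path, each path can be used directly to index into a batch-first container.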
_batch_from_nested_col(col: Tuple, pad: int) → torch.Tensor
Compose a batch padded to the max-size along each dimension.
Parameters:
- col (Tuple) – A nested list of lists of arbitrary depth, with the child nodes, i.e. deepest list elements, as `torch.Tensor`s
- pad (int) – The padding index
For example, a col might be:
    [
        [torch.Tensor([1, 2]), torch.Tensor([3, 4, 5])],
        [torch.Tensor([5, 6, 7]), torch.Tensor([4, 5]), torch.Tensor([5, 6, 7, 8])]
    ]
Level 1 sizes: [2, 3]
Level 2 sizes: [2, 3]; [3, 2, 4]
The max-sizes along each dimension are:
- Dim 1: 3
- Dim 2: 4
As such, since this column contains 2 elements, with max-sizes 3 and 4 along the nested dimensions, the resulting batch would have size (2, 3, 4), and the padded `Tensor`s would be inserted at their respective locations.
Returns: An (n+1)-dimensional torch.Tensor, where n is the nesting depth, padded to the max-size along each dimension
Return type: torch.Tensor
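The padding step for the example column can be sketched without any dependencies. This is a simplified illustration under stated assumptions: `pad_nested_col` is a hypothetical helper, it handles exactly two nested levels rather than arbitrary depth, plain lists stand in for `torch.Tensor`s, and the batch dimension is placed first.

```python
from typing import List, Sequence


def pad_nested_col(col: Sequence[Sequence[Sequence[int]]], pad: int) -> List[List[List[int]]]:
    """Hypothetical sketch: pad a column with exactly two nested levels."""
    # Max-size along each nested dimension.
    max_outer = max(len(obs) for obs in col)
    max_inner = max(len(seq) for obs in col for seq in obs)
    # Allocate the full batch, filled with the padding value.
    batch = [[[pad] * max_inner for _ in range(max_outer)] for _ in range(len(col))]
    # Copy every sequence into its location; the rest stays padded.
    for i, obs in enumerate(col):
        for j, seq in enumerate(obs):
            batch[i][j][:len(seq)] = list(seq)
    return batch


col = [
    [[1, 2], [3, 4, 5]],
    [[5, 6, 7], [4, 5], [5, 6, 7, 8]],
]
batch = pad_nested_col(col, pad=0)
shape = (len(batch), len(batch[0]), len(batch[0][0]))
```

Allocating the padded container first and then copying each sequence into place mirrors the description of inserting tensors "at their respective locations"; a version for arbitrary depth would need the per-depth lengths and child paths instead of two hard-coded loops.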
collate_fn(data: List[Tuple[torch.Tensor, ...]], pad: int) → Tuple[torch.Tensor, ...]
Turn a list of examples into a mini-batch.
Handles padding on the fly on simple sequences, as well as nested sequences.
Parameters:
- data (List[Tuple[torch.Tensor, ...]]) – The list of sampled examples. Each example is a tuple, with each element representing a column from the original dataset
- pad (int) – The padding index
Returns: The output batch of tensors
Return type: Tuple[torch.Tensor, ...]
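For simple (non-nested) sequences, collation amounts to transposing the examples into columns and padding each column to its longest sequence. The sketch below illustrates that idea only; `collate_sketch` is a hypothetical name, plain lists stand in for `torch.Tensor`s, and nested columns are not handled.

```python
from typing import List, Sequence, Tuple


def collate_sketch(data: Sequence[Tuple[Sequence[int], ...]], pad: int) -> Tuple[List[List[int]], ...]:
    """Hypothetical sketch: turn a list of examples into padded columns."""
    batch = []
    # zip(*data) transposes a list of examples into a list of columns.
    for col in zip(*data):
        max_len = max(len(seq) for seq in col)
        # Right-pad every sequence in the column to the column's max length.
        batch.append([list(seq) + [pad] * (max_len - len(seq)) for seq in col])
    return tuple(batch)


# Two examples, each with two columns (say, token ids and label ids).
data = [([1, 2, 3], [7]), ([4], [8, 9])]
tokens, labels = collate_sketch(data, pad=0)
```

Each column is padded independently, so columns with different maximum lengths do not affect one another.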
BaseSampler(batch_size: int = 64, shuffle: bool = True, pad_index: Union[int, Sequence[int]] = 0, n_workers: int = 0, pin_memory: bool = False, seed: Optional[int] = None, downsample: Optional[float] = None, downsample_seed: Optional[int] = None, drop_last: bool = False)
Implements a BaseSampler object.
This is the most basic implementation of a sampler. It uses PyTorch’s DataLoader object internally, and offers the possibility to override both how examples are sampled and how a batch is formed from them.
sample(self, data: Sequence[Sequence[torch.Tensor]], n_epochs: int = 1)
Sample from the list of features and yield batches.
Parameters:
- data (Sequence[Sequence[torch.Tensor, ...]]) – The input data to sample from
- n_epochs (int, optional) – The number of epochs to run in the output iterator. Use -1 to run infinitely.
Yields: Iterator[Tuple[Tensor]] – A batch of data, as a tuple of Tensors
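The epoch semantics, including the -1 convention for an endless iterator, can be illustrated with a small generator. This is a sketch under assumptions, not the sampler's code: `sample_sketch` is a hypothetical function, it batches raw examples without padding, and it reshuffles with Python's `random` module rather than a DataLoader.

```python
import itertools
import random
from typing import Iterator, List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sample_sketch(data: Sequence[T], batch_size: int, n_epochs: int = 1,
                  shuffle: bool = True, seed: Optional[int] = None) -> Iterator[List[T]]:
    """Hypothetical sketch: yield batches for n_epochs passes over data."""
    rng = random.Random(seed)
    # n_epochs == -1 means iterate forever.
    epochs = itertools.count() if n_epochs == -1 else range(n_epochs)
    for _ in epochs:
        order = list(range(len(data)))
        if shuffle:
            rng.shuffle(order)  # a fresh order for every epoch
        for start in range(0, len(order), batch_size):
            yield [data[i] for i in order[start:start + batch_size]]


finite = list(sample_sketch(list(range(5)), batch_size=2, n_epochs=2, shuffle=False))
# With n_epochs=-1 the caller decides when to stop, e.g. with islice.
endless = list(itertools.islice(
    sample_sketch([0, 1, 2], batch_size=2, n_epochs=-1, shuffle=False), 5))
```

Since the infinite case never raises StopIteration, a training loop using it needs its own stopping condition, such as a step budget.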
length(self, data: Sequence[Sequence[torch.Tensor]])
Return the number of batches in the sampler.
Parameters: data (Sequence[Sequence[torch.Tensor, ...]]) – The input data to sample from
Returns: The number of batches that would be created per epoch
Return type: int
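As a rough guide to what this count looks like, the arithmetic below assumes DataLoader-style semantics: a trailing partial batch is kept unless `drop_last` is set. `n_batches` is a hypothetical helper, and the effect of `downsample` on the number of examples is deliberately ignored.

```python
import math


def n_batches(n_examples: int, batch_size: int = 64, drop_last: bool = False) -> int:
    """Hypothetical sketch: batches per epoch under DataLoader-style rules."""
    if drop_last:
        # The incomplete final batch is discarded.
        return n_examples // batch_size
    # The incomplete final batch is kept as a smaller batch.
    return math.ceil(n_examples / batch_size)


kept = n_batches(130)                     # 64 + 64 + 2 examples
dropped = n_batches(130, drop_last=True)  # the batch of 2 is discarded
```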