vp_suite.base.base_dataset
- class VPData(*args, **kwargs)
Bases: dict
This template class defines the return type for all datasets.
- actions: torch.Tensor
Actions per frame; torch tensors of shape [t, a].
- frames: torch.Tensor
Video frames; torch tensors of shape [t, c, h, w].
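Illustrative only: a single VPData item for a 12-frame RGB sequence with 2-dimensional actions might be constructed like this (the concrete shapes are placeholders, not prescribed by the class):

    import torch

    data_point = {
        "frames": torch.zeros(12, 3, 64, 64),  # video frames, shape [t, c, h, w]
        "actions": torch.zeros(12, 2),         # actions per frame, shape [t, a]
    }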
- class VPDataset(split, **dataset_kwargs)
Bases: torch.utils.data.dataset.Dataset
The base class for all video prediction dataset loaders. Data points are provided in the shape of VPData dicts.
Note: VPDataset objects are not usable directly after creation, since the sequence length is still unspecified. To fully prepare the dataset, self.set_seq_len() has to be called with the desired number of frames and the seq_step. Afterwards, the VPDataset object is ready to be queried for data.
- ACTION_SIZE: int = NotImplemented
The size of the action vector per frame (if the dataset provides no actions, this value is 0).
- DATASET_FRAME_SHAPE: (int, int, int) = NotImplemented
Shape of a single frame in the dataset (height, width, channels).
- DEFAULT_DATA_DIR: pathlib.Path = NotImplemented
The default save location of the dataset files.
- NON_CONFIG_VARS = ['functions', 'ready_for_usage', 'total_frames', 'seq_len', 'frame_offsets', 'data_dir']
Variables that do not get included in the dict returned by self.config() (constants are not included either).
- ON_THE_FLY: bool = False
If true, accessing the dataset means data is generated on the fly rather than fetched from storage.
- VALID_SPLITS = ['train', 'test']
The valid arguments for specifying splits.
- __init__(split, **dataset_kwargs)
Initializes the dataset loader by determining its split and extracting and processing all dataset attributes from the parameters given in dataset_kwargs.
- Parameters
split (str) – The dataset’s split identifier (i.e. whether it’s a training/validation/test dataset)
**dataset_kwargs (Any) – Optional dataset arguments for image transformation, value_range, splitting etc.
- property config: dict
A dictionary containing the complete dataset configuration, including common attributes as well as dataset-specific attributes.
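A brief, hedged sketch of inspecting the configuration; MyVPDataset stands in for any concrete VPDataset subclass, and the available keys depend on that class:

    dataset = MyVPDataset.get_test()  # MyVPDataset is a hypothetical subclass
    cfg = dataset.config              # a plain dict, convenient for logging or serialization
    print(sorted(cfg.keys()))         # dataset-specific keys vary by class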
- default_available(split, **dataset_kwargs)
Tries to load a dataset and a datapoint using the default self.data_dir value. If this succeeds, we can safely use the default data dir; otherwise, a new dataset has to be downloaded and prepared.
- Parameters
split (str) – The dataset’s split identifier (i.e. whether it’s a training/validation/test dataset).
**dataset_kwargs (Any) – Optional dataset arguments for image transformation, value_range, splitting etc.
Returns: True if we could load the dataset using default values, False otherwise.
- classmethod download_and_prepare_dataset()
Downloads the specific dataset, prepares it for the video prediction task (if needed) and stores it in a default location in the ‘data/’ folder. Implemented by the derived dataset classes.
- classmethod get_test(**dataset_kwargs)
A wrapper method that creates a test dataset from the given dataset class. Like when initializing such datasets directly, optional dataset arguments can be specified with **dataset_kwargs.
- Parameters
**dataset_kwargs (Any) – optional dataset arguments for image transformation, value_range, splitting etc.
Returns: The created test dataset of the same class.
- classmethod get_train_val(**dataset_kwargs)
A wrapper method that creates a training and a validation dataset from the given dataset class. Like when initializing such datasets directly, optional dataset arguments can be specified with **dataset_kwargs.
- Parameters
**dataset_kwargs (Any) – Optional dataset arguments for image transformation, value_range, splitting etc.
Returns: The created training and validation dataset of the same class.
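A minimal usage sketch combining the wrapper constructors with set_seq_len(); MyVPDataset is a hypothetical VPDataset subclass and all values are placeholders:

    from torch.utils.data import DataLoader

    train_data, val_data = MyVPDataset.get_train_val()
    test_data = MyVPDataset.get_test()

    # Datasets are not usable until the sequence length has been set.
    for d in (train_data, val_data, test_data):
        d.set_seq_len(context_frames=10, pred_frames=10, seq_step=1)

    train_loader = DataLoader(train_data, batch_size=16, shuffle=True)
    batch = next(iter(train_loader))  # a VPData dict of batched tensors
    frames, actions = batch["frames"], batch["actions"]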
- img_shape: (int, int, int) = NotImplemented
Shape of a single frame as returned by __getitem__().
- postprocess(x)
Converts a normalized tensor of an image to a denormalized numpy array. Output: np.uint8, shape: […, h, w, c], range: [0, 255]
- Parameters
x (torch.Tensor) – Input tensor of shape […, c, h, w] and (approx.) range [min_val, max_val].
Returns: A post-processed (quantized) sequence array ready for display.
- Return type
ndarray
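For illustration, postprocess() can turn a (predicted) sequence tensor back into displayable frames; dataset is assumed to be a prepared VPDataset instance and pred is a stand-in for model output:

    import torch

    pred = torch.rand(10, 3, 64, 64)       # stand-in prediction, shape [t, c, h, w]
    frames_np = dataset.postprocess(pred)  # np.uint8 array, shape [t, h, w, c], range [0, 255]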
- preprocess(x, transform=True)
Preprocesses the input sequence to make it usable by the video prediction models, using the transformations defined in self.__init__(). The workflow is as follows:
1. Convert to a torch tensor of type torch.float.
2. Permute axes to obtain the following shape: [frames/time (t), channels (c), height (h), width (w)].
3. Scale values to the interval defined by self.value_range_min and self.value_range_max.
4. Crop the image (if applicable).
5. Resize the image (if applicable).
6. Perform further data augmentation operations (if applicable).
- Parameters
x (Union[np.ndarray, torch.Tensor]) – The input sequence.
transform (bool) – Whether to crop/resize/augment the sequence using the dataset’s transformations.
Returns: The preprocessed sequence tensor.
- Return type
Tensor
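A small sketch of feeding a raw sequence through preprocess(); dataset is assumed to be a prepared VPDataset instance, and the synthetic input is assumed to be a [t, h, w, c] uint8 array:

    import numpy as np

    raw_seq = np.random.randint(0, 256, size=(10, 64, 64, 3), dtype=np.uint8)
    seq = dataset.preprocess(raw_seq)                       # float tensor, shape [t, c, h, w]
    seq_raw = dataset.preprocess(raw_seq, transform=False)  # skip crop/resize/augmentation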
- reset_rng()
Optional logic for resetting the RNG of a dataset.
- set_seq_len(context_frames, pred_frames, seq_step)
Set the sequence length for the upcoming run. Assumes that the given parameters lead to a sequence length that does not exceed the minimum sequence length specified in self.MIN_SEQ_LEN.
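As a hedged illustration of how the parameters interact (the exact frame-count arithmetic is an assumption, not stated by this documentation):

    context_frames, pred_frames, seq_step = 10, 10, 2
    total_frames = context_frames + pred_frames  # frames returned per data point: 20
    # With a stride of seq_step between consecutive sampled frames, one data point
    # presumably spans (total_frames - 1) * seq_step + 1 = 39 raw frames, which must
    # not exceed self.MIN_SEQ_LEN.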
- split: str = None
The dataset’s split identifier (i.e. whether it’s a training/validation/test dataset).
- train_to_val_ratio: float = 0.8
The ratio of files that will be training data (rest will be validation data). For bigger datasets, this ratio can be set closer to 1.
- train_val_seed = 1234
Random seed used to separate training and validation data.
- transform: torch.nn.modules.module.Module = None
This module gets called in the preprocessing step and consists of pre-specified cropping, resizing and augmentation layers.
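To tie the class-level attributes and methods together, here is a hedged skeleton of a derived dataset; the name, shapes, paths and all omitted logic are purely illustrative and do not correspond to an existing vp_suite dataset:

    from pathlib import Path
    from vp_suite.base.base_dataset import VPDataset

    class MyVPDataset(VPDataset):
        ACTION_SIZE = 0                             # this illustrative dataset provides no actions
        DATASET_FRAME_SHAPE = (64, 64, 3)           # (height, width, channels), placeholder values
        DEFAULT_DATA_DIR = Path("data/my_dataset")  # illustrative save location
        MIN_SEQ_LEN = 100                           # shortest available sequence, placeholder value

        def __init__(self, split, **dataset_kwargs):
            super().__init__(split, **dataset_kwargs)
            # dataset-specific setup, e.g. indexing video files and frame offsets (omitted)

        @classmethod
        def download_and_prepare_dataset(cls):
            # download/extract the raw data into cls.DEFAULT_DATA_DIR (omitted)
            pass

        def __len__(self):
            raise NotImplementedError  # number of available sequences (omitted)

        def __getitem__(self, i):
            raise NotImplementedError  # should return a VPData dict: {"frames": ..., "actions": ...}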