vp_suite.datasets.mmnist
- class MovingMNISTDataset(split, **dataset_kwargs)
Bases: vp_suite.base.base_dataset.VPDataset
Dataset class for the “Moving MNIST” dataset, first introduced in “Unsupervised Learning of Video Representations using LSTMs” by Srivastava et al. (https://arxiv.org/pdf/1502.04681v3.pdf).
Each sequence depicts two digits from the MNIST dataset moving linearly in front of a black background, occasionally bouncing off the walls and overlapping each other.
The scripts for downloading and preparing the dataset were developed by Tencia Lee, ported to Python 3 by Praateek Mahajan (https://gist.github.com/praateekmahajan/b42ef0d295f528c986e2b3a0b31ec1fe) and further modified here.
- ACTION_SIZE: int = 0
The size of the action vector per frame (0 if the dataset provides no actions).
- DATASET_FRAME_SHAPE: (int, int, int) = (64, 64, 3)
Shape of a single frame in the dataset (height, width, channels).
- DEFAULT_DATA_DIR: pathlib.Path = PosixPath('/home/runner/work/vp-suite/vp-suite/vp-suite-data/data/moving_mnist')
The default save location of the dataset files.
- REFERENCE: str = 'https://arxiv.org/abs/1502.04681v3'
The reference (publication) where the original dataset is introduced.
- __init__(split, **dataset_kwargs)
Initializes the dataset loader by determining its split and extracting and processing all dataset attributes from the parameters given in dataset_kwargs.
- Parameters
split (str) – The dataset’s split identifier (i.e. whether it’s a training/validation/test dataset)
**dataset_kwargs (Any) – Optional dataset arguments for image transformation, value_range, splitting etc.
- download_and_prepare_dataset()
Downloads the specific dataset, prepares it for the video prediction task (if needed) and stores it in a default location in the ‘data/’ folder. Implemented by the derived dataset classes.
- arr_from_img(im, mean=0, std=1)
Convert image to array.
- generate_moving_mnist(d_path, training, shape, num_frames, num_images, digit_size, digits_per_image)
Generate sequences of moving MNIST digits by moving them around between frames.
- Parameters
training (bool) – Whether to download/generate the training set or the test set.
shape ((int, int)) – Target shape of the generated frames (new_width, new_height).
num_frames (int) – Number of frames per sequence (movement/animation/gif).
num_images (int) – Number of sequences (movements/animations/gifs) to generate.
digit_size (int) – Original size of the digit images (e.g. 28 for the 28x28 MNIST digits).
digits_per_image (int) – Number of digits per sequence (movement/animation/gif).
- Returns
Dataset of type np.uint8 with shape (num_frames * num_images, 1, new_width, new_height).
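The motion described for this dataset (digits moving linearly and bouncing off the frame borders) and the documented return shape can be illustrated with a minimal, standard-library-only sketch. The helper names `bounce_trajectory` and `output_shape` are hypothetical and not part of vp_suite; the actual generation code may choose starting positions, velocities and bounce handling differently.

```python
def bounce_trajectory(canvas, digit_size, start, velocity, num_frames):
    """Return the top-left (x, y) position of one digit for every frame.

    Hypothetical helper: the digit moves linearly and reverses its velocity
    along an axis whenever the next step would leave the canvas.
    """
    # Largest valid top-left coordinate per axis so the digit stays visible.
    limits = (canvas[0] - digit_size, canvas[1] - digit_size)
    pos = list(start)
    vel = list(velocity)
    path = []
    for _ in range(num_frames):
        path.append((pos[0], pos[1]))
        for axis in range(2):
            nxt = pos[axis] + vel[axis]
            if nxt < 0 or nxt > limits[axis]:
                vel[axis] = -vel[axis]  # bounce off the wall
                nxt = pos[axis] + vel[axis]
            # Clamp as a safety net for speeds larger than the free space.
            pos[axis] = min(max(nxt, 0), limits[axis])
    return path


def output_shape(num_frames, num_images, shape):
    """Shape of the generated array as stated in the Returns entry."""
    return (num_frames * num_images, 1, shape[0], shape[1])


path = bounce_trajectory((64, 64), 28, start=(30, 0), velocity=(5, 3), num_frames=20)
print(path[:3])
print(output_shape(20, 10000, (64, 64)))
```

With a 64x64 canvas and 28x28 digits, the top-left corner can only range over 0..36 on each axis, which is why the x coordinate in this example reverses direction after the second frame.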
- img_from_arr(arr, index, mean=0, std=1)
Convert array to image.
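The `mean` and `std` parameters of `arr_from_img` and `img_from_arr` suggest a normalization step and its inverse. The reference above does not state how they are applied, so the following is only a sketch under the assumption of the common convention `(pixel / 255 - mean) / std`; the function names `normalize` and `denormalize` are hypothetical and operate on plain lists rather than images.

```python
def normalize(pixels, mean=0.0, std=1.0):
    """Map 8-bit pixel values to floats, assuming (p / 255 - mean) / std."""
    return [(p / 255.0 - mean) / std for p in pixels]


def denormalize(values, mean=0.0, std=1.0):
    """Invert normalize() and clamp the result to the valid 8-bit range."""
    restored = []
    for v in values:
        p = int(round((v * std + mean) * 255.0))
        restored.append(min(max(p, 0), 255))
    return restored


pixels = [0, 128, 255]
normed = normalize(pixels, mean=0.5, std=0.5)
roundtrip = denormalize(normed, mean=0.5, std=0.5)
print(normed, roundtrip)
```

With the defaults `mean=0` and `std=1`, this convention reduces to scaling pixel values into the range [0, 1].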
- load_dataset(d_path, training, digit_size)
Loads MNIST from the web on demand.
- Returns
The loaded MNIST images.
- save_generated_mmnist(data, seqs, frame_size, out_path)
Saves the generated data, one sequence at a time, to the specified output path.