vp_suite.datasets.mmnist

class MovingMNISTDataset(split, **dataset_kwargs)

Bases: vp_suite.base.base_dataset.VPDataset

Dataset class for the “Moving MNIST” dataset, first introduced in “Unsupervised Learning of Video Representations using LSTMs” by Srivastava et al. (https://arxiv.org/pdf/1502.04681v3.pdf).

Each sequence depicts two digits from the MNIST dataset moving linearly in front of a black background, occasionally bouncing off the walls and overlapping each other.

The scripts for downloading and preparing the dataset were developed by Tencia Lee, ported to Python 3 by Praateek Mahajan (https://gist.github.com/praateekmahajan/b42ef0d295f528c986e2b3a0b31ec1fe), and further modified here.

ACTION_SIZE: int = 0

The size of the action vector per frame (0 if the dataset provides no actions).

DATASET_FRAME_SHAPE: (int, int, int) = (64, 64, 3)

Shape of a single frame in the dataset (height, width, channels).

DEFAULT_DATA_DIR: pathlib.Path = PosixPath('/home/runner/work/vp-suite/vp-suite/vp-suite-data/data/moving_mnist')

The default save location of the dataset files.

IS_DOWNLOADABLE: str = 'Yes'

A string identifying whether the dataset can be (freely) downloaded.

NAME: str = 'Moving MNIST'

The dataset’s name.

REFERENCE: str = 'https://arxiv.org/abs/1502.04681v3'

The reference (publication) where the original dataset is introduced.

__init__(split, **dataset_kwargs)

Initializes the dataset loader by determining its split and extracting and processing all dataset attributes from the parameters given in dataset_kwargs.

Parameters
  • split (str) – The dataset’s split identifier (i.e., whether it is a training, validation, or test dataset).

  • **dataset_kwargs (Any) – Optional dataset arguments for image transformation, value_range, splitting, etc.

download_and_prepare_dataset()

Downloads the specific dataset, prepares it for the video prediction task (if needed) and stores it in a default location in the ‘data/’ folder. Implemented by the derived dataset classes.

train_to_val_ratio: float = 0.96

The ratio of files that will be training data (rest will be validation data). For bigger datasets, this ratio can be set closer to 1.
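To make the ratio concrete, here is a minimal sketch of a ratio-based split over a list of files. The helper `split_train_val`, the file names, and the round-down rule for the cut index are assumptions for illustration; the actual splitting logic in vp_suite may differ.

```python
def split_train_val(files, train_to_val_ratio=0.96):
    """Split a list of files into (train, val) at the given ratio."""
    if not 0.0 <= train_to_val_ratio <= 1.0:
        raise ValueError("train_to_val_ratio must lie in [0, 1]")
    # Assumption: the cut index is rounded down; vp_suite may round differently.
    cut = int(len(files) * train_to_val_ratio)
    return files[:cut], files[cut:]


# Hypothetical file names, for illustration only.
sequence_files = [f"seq_{i:05d}.npy" for i in range(1000)]
train_files, val_files = split_train_val(sequence_files)
print(len(train_files), len(val_files))
```

With 1000 files and a ratio of 0.96, 960 files go to training and 40 to validation.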

arr_from_img(im, mean=0, std=1)

Convert image to array.

Parameters
  • im – Image to convert.

  • mean (float) – Mean to subtract.

  • std (float) – Standard deviation to divide by.

Returns

Image in np.float32 format, in width, height, channel format, with values in the range [0, 1]. The mean acts as a shift, i.e. it is subtracted from each value, and can be used for mean subtraction.
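The conversion can be sketched roughly as follows. This is not the library's implementation: the helper `normalize_image` is hypothetical, it takes a uint8 array in place of an image object, and it assumes the common convention of scaling to [0, 1], subtracting the mean, and dividing by the standard deviation.

```python
import numpy as np


def normalize_image(pixels, mean=0.0, std=1.0):
    """Scale uint8 pixels to [0, 1], then shift by mean and scale by std."""
    # Assumption: `pixels` is an H x W x C uint8 array standing in for the image.
    scaled = np.asarray(pixels, dtype=np.float32) / 255.0
    return (scaled - mean) / std


pixels = np.array([[[0], [255]], [[51], [102]]], dtype=np.uint8)  # 2 x 2 x 1
arr = normalize_image(pixels, mean=0.5, std=0.5)
print(arr.dtype, arr.min(), arr.max())
```

With the default `mean=0` and `std=1`, the output simply stays in [0, 1].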

generate_moving_mnist(d_path, training, shape, num_frames, num_images, digit_size, digits_per_image)

Generate sequences of moving MNIST digits by moving them around between frames.

Parameters
  • training (bool) – Used to decide if downloading/generating training set or test set.

  • shape ((int, int)) – Shape we want for our moving images (new_width and new_height).

  • num_frames (int) – Number of frames in a particular movement/animation/gif.

  • num_images (int) – Number of movements/animations/GIFs to generate.

  • digit_size (int) – Real size of the images (e.g. MNIST is 28x28).

  • digits_per_image (int) – Digits per movement/animation/gif.

Returns

Dataset of np.uint8 type with dimensions (num_frames * num_images) x 1 x new_width x new_height.
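The following sketch illustrates the general idea of linear motion with wall bounces and the resulting array layout. It is an illustration under stated assumptions, not the library's code: `bounce_sequences` is a hypothetical helper, a filled square stands in for each MNIST digit, the speed range and the `seed` parameter are arbitrary choices, and frames are assumed to be stored sequence after sequence along the first axis.

```python
import numpy as np


def bounce_sequences(shape, num_frames, num_images, digit_size, digits_per_image, seed=0):
    """Generate sequences of bright squares moving linearly and bouncing off the walls."""
    rng = np.random.default_rng(seed)
    width, height = shape
    limits = np.array([width - digit_size, height - digit_size], dtype=np.float64)
    # Assumption: a filled square stands in for an MNIST digit.
    sprite = np.full((digit_size, digit_size), 255, dtype=np.uint8)
    data = np.zeros((num_frames * num_images, 1, width, height), dtype=np.uint8)

    for img in range(num_images):
        # Random start position and constant velocity per digit.
        pos = rng.uniform(0, 1, size=(digits_per_image, 2)) * limits
        angle = rng.uniform(0, 2 * np.pi, size=digits_per_image)
        speed = rng.integers(2, 6, size=digits_per_image)
        vel = np.stack([np.cos(angle), np.sin(angle)], axis=1) * speed[:, None]

        for frame in range(num_frames):
            canvas = data[img * num_frames + frame, 0]
            for d in range(digits_per_image):
                x, y = np.round(pos[d]).astype(int)
                # Overlapping digits are merged with a pixel-wise maximum.
                region = canvas[x:x + digit_size, y:y + digit_size]
                np.maximum(region, sprite, out=region)
            # Reflect the velocity wherever the next step would leave the frame.
            nxt = pos + vel
            hit = (nxt < 0) | (nxt > limits)
            vel = np.where(hit, -vel, vel)
            pos = np.clip(pos + vel, 0, limits)
    return data


data = bounce_sequences((64, 64), num_frames=20, num_images=3, digit_size=28, digits_per_image=2)
print(data.shape, data.dtype)
```

Merging overlapping sprites with a maximum keeps pixel values within the uint8 range without any clipping step.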

img_from_arr(arr, index, mean=0, std=1)

Convert array to image.

Parameters
  • arr (np.ndarray) – Dataset of shape N x C x W x H.

  • index (int) – Index of image we want to fetch.

  • mean (float) – Mean to add.

  • std (float) – Standard deviation to multiply by.

Returns

Image with dimensions H x W x C or H x W if it’s a single channel image.
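As a rough sketch of the reverse conversion, the snippet below picks one frame from an N x C x W x H array, undoes the normalization, and reorders the axes to H x W x C. The helper `frame_to_image` is hypothetical, and it assumes that denormalization multiplies by the standard deviation, adds the mean, and rescales to the 0..255 range.

```python
import numpy as np


def frame_to_image(arr, index, mean=0.0, std=1.0):
    """Undo normalization for one frame of an N x C x W x H array."""
    frame = (arr[index] * std + mean) * 255.0
    # Reorder C x W x H -> H x W x C and convert to 8-bit pixels.
    img = frame.transpose(2, 1, 0).clip(0, 255).astype(np.uint8)
    # Drop the channel axis for single-channel images.
    return img[:, :, 0] if img.shape[2] == 1 else img


batch = np.zeros((4, 1, 64, 48), dtype=np.float32)  # N x C x W x H
batch[2, 0, 10, 5] = 1.0
img = frame_to_image(batch, index=2)
print(img.shape, img[5, 10])
```

Note how the pixel set at width index 10 and height index 5 ends up at row 5, column 10 of the returned image.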

load_dataset(d_path, training, digit_size)

Loads MNIST from the web on demand.

Parameters
  • d_path (Path) – The path where the downloaded digits should be stored.

  • training (bool) – Whether to use the training images (True) or the test images (False).

  • digit_size (int) – Size of the digit in pixels (height and width are the same).

Returns

The loaded MNIST images.

save_generated_mmnist(data, seqs, frame_size, out_path)

Save generated data per-sequence to specified out path.

Parameters
  • data (np.ndarray) – The generated data to save.

  • seqs (int) – The number of generated sequences.

  • frame_size ((int, int)) – The frame size.

  • out_path (Path) – The path where the data should be saved.
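One possible way to store generated data per sequence is sketched below. The helper `save_sequences`, the `.npy` format, and the file naming scheme are assumptions for illustration; the format actually written by the library may differ. The sketch also assumes that frames are laid out sequence after sequence along the first axis.

```python
import tempfile
from pathlib import Path

import numpy as np


def save_sequences(data, seqs, frame_size, out_path):
    """Write each sequence of `data` to its own .npy file under out_path."""
    out_path = Path(out_path)
    out_path.mkdir(parents=True, exist_ok=True)
    # Assumption: frames are stored sequence after sequence along axis 0.
    per_seq = data.reshape(seqs, -1, *frame_size)
    for i, seq in enumerate(per_seq):
        # Hypothetical naming scheme; the real file names and format may differ.
        np.save(out_path / f"seq_{i:05d}.npy", seq)
    return sorted(out_path.glob("seq_*.npy"))


data = np.zeros((3 * 20, 1, 64, 64), dtype=np.uint8)
with tempfile.TemporaryDirectory() as tmp:
    files = save_sequences(data, seqs=3, frame_size=(64, 64), out_path=tmp)
    print(len(files), np.load(files[0]).shape)
```

Storing one file per sequence lets a loader read individual sequences without holding the whole dataset in memory.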