API Reference
mydia.Videos
An instance of this class is used as a reader to read videos.
class mydia.Videos(target_size=None, to_gray=False, num_frames=None, mode='auto', normalize=False, data_format='channels_last', random_state=17)

Class to read in videos and store them as numpy arrays.

The videos are stored as a 5-dimensional tensor whose shape depends on data_format.

Parameters:
- target_size (tuple[int, int]) – A tuple of the form (width, height) giving the dimensions to which the frames of the video are resized, defaults to None. The dimensions of the frames are not altered if this parameter is not set.
- to_gray (bool) – Convert video to grayscale, defaults to False.
- num_frames (int) – The (exact) number of frames to extract from the video, defaults to None. Frames are extracted based on the value of mode. If not set, all the frames of the video are kept.
- mode (str) – The method used for frame extraction if num_frames is set. It can be one of "auto", "random", "first", "last" or "middle".
  - "auto": N frames are extracted at equal intervals.
  - "random": N frames are randomly extracted (no repetition). Use random_state to ensure reproducibility.
  - "first", "last" and "middle" extract N contiguous frames from the beginning, end and middle of the video respectively.
 
- normalize (bool) – Rescales each video to the range [0, 1] by subtracting the minimum pixel value and dividing by the difference between the maximum and the minimum pixel values, defaults to False.
- data_format (str) – Video data format, either "channels_last" or "channels_first".
  - "channels_last": The tensor will have shape (<videos>, <frames>, <height>, <width>, <channels>)
  - "channels_first": The tensor will have shape (<videos>, <channels>, <frames>, <height>, <width>)

  <channels> will be 3 for videos in RGB format, or 1 for videos in grayscale.
- random_state (int) – Integer that seeds the (numpy) random number generator, defaults to 17. Used only when mode is set to "random".
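
The mode and normalize parameters describe simple array operations, so a small NumPy sketch may help make them concrete. This is not mydia's implementation: the helper names select_indices and min_max_normalize are hypothetical, sorting the random indices is an assumption, and the handling of a constant video is a choice made only for this sketch.

```python
import numpy as np


def select_indices(total_frames, num_frames, mode="auto", random_state=17):
    """Illustrative frame-index selection (hypothetical helper, not mydia's code)."""
    if num_frames > total_frames:
        raise IndexError("num_frames exceeds the number of available frames")
    if mode == "auto":
        # Evenly spaced indices across the whole video.
        return np.linspace(0, total_frames - 1, num_frames).astype(int).tolist()
    if mode == "random":
        rng = np.random.RandomState(random_state)
        # Sample without repetition; sorting is an assumption of this sketch.
        picked = rng.choice(total_frames, size=num_frames, replace=False)
        return sorted(picked.tolist())
    if mode == "first":
        return list(range(num_frames))
    if mode == "last":
        return list(range(total_frames - num_frames, total_frames))
    if mode == "middle":
        start = (total_frames - num_frames) // 2
        return list(range(start, start + num_frames))
    raise ValueError(f"unknown mode: {mode!r}")


def min_max_normalize(video):
    """Rescale pixel values to [0, 1] using the video's own minimum and maximum."""
    video = video.astype(np.float64)
    lo, hi = video.min(), video.max()
    if hi == lo:
        # Constant video: avoid dividing by zero (a choice made for this sketch).
        return np.zeros_like(video)
    return (video - lo) / (hi - lo)
```

For a 10-frame video, select_indices(10, 4, "middle") picks the contiguous indices 3 to 6, while "auto" spreads the four indices from the first frame to the last.
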
Example:

    from mydia import Videos

    reader = Videos(
        target_size=(720, 480),
        to_gray=False,
        num_frames=128,
        data_format="channels_first"
    )
    video = reader.read("./path/to/video")

Note: You can also pass a callable to mode for custom frame extraction. The callable should return a list of integers denoting the indices of the frames to be extracted. It should take 4 (non-keyword) arguments:
- total_frames: The total number of frames in the video
- num_frames: The number of frames that you want to extract
- fps: The frame rate of the video
- random_state: Integer to seed the random number generator

These arguments may or may not be used to generate the required frame indices. Detailed examples are provided in the documentation.

Warning: If you pass a callable to mode, make sure that the number of frame indices it returns is equal to the value of num_frames. Otherwise, the number of frames selected differs between videos, and they cannot be stacked into a single tensor.
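
As an illustration of the callable form of mode described above, here is a minimal sketch of a custom extractor. The function name second_half_evenly and its selection strategy are hypothetical; only the four positional arguments and the list-of-integers return value come from the description above.

```python
def second_half_evenly(total_frames, num_frames, fps, random_state):
    """Pick num_frames evenly spaced indices from the second half of the video.

    fps and random_state are accepted to match the expected signature
    but are not used by this particular strategy.
    """
    start = total_frames // 2
    span = total_frames - start
    if num_frames > span:
        raise IndexError("not enough frames in the second half of the video")
    step = span / num_frames
    # step >= 1, so the indices are distinct and stay below total_frames.
    return [start + int(i * step) for i in range(num_frames)]


# Intended use, following the parameters documented above:
#   reader = Videos(num_frames=16, mode=second_half_evenly)
```

Because the function always returns exactly num_frames indices, it satisfies the condition stated in the warning.
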
read(paths, verbose=1, workers=0)

Function to read videos.

Parameters:
- paths (str or list[str]) – The path, or a list of paths, of the video(s) to be read.
- verbose (int) – If set to 0, the progress bar will be disabled.
- workers (int) – The number of processes (CPUs) to use for reading the videos. This uses the multiprocessing module from the Python standard library. Its value can range from 0 to max_workers, where the latter can be determined by calling multiprocessing.cpu_count() on your machine. Defaults to 0, which means that multiprocessing will not be used.
Returns: A 5-dimensional tensor whose shape depends on the value of data_format.
- For "channels_last": The tensor will have shape (<videos>, <frames>, <height>, <width>, <channels>)
- For "channels_first": The tensor will have shape (<videos>, <channels>, <frames>, <height>, <width>)

Return type: numpy.ndarray

Raises:
- ValueError – If paths is neither a string nor a list of strings.
- IndexError – If num_frames is set to a value greater than the total number of frames available in the video.
Important: If multiple videos are to be read, then each video must have the same dimensions (frames, height, width), otherwise they cannot be stacked into a single tensor. Use the parameters target_size and num_frames to make sure of this.
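
The stacking requirement and the two data_format layouts can be checked with plain NumPy on dummy arrays. This sketch does not call mydia; the arrays stand in for videos already read with a hypothetical target_size=(72, 48), which corresponds to a height of 48 and a width of 72.

```python
import numpy as np

# Two dummy "videos" in channels_last layout: (frames, height, width, channels).
video_a = np.zeros((8, 48, 72, 3), dtype=np.uint8)
video_b = np.zeros((8, 48, 72, 3), dtype=np.uint8)

# Matching shapes stack into (videos, frames, height, width, channels).
batch_last = np.stack([video_a, video_b])

# Moving the channel axis gives the channels_first layout:
# (videos, channels, frames, height, width).
batch_first = np.transpose(batch_last, (0, 4, 1, 2, 3))

# A video with a different frame count cannot join the batch.
video_c = np.zeros((5, 48, 72, 3), dtype=np.uint8)
try:
    np.stack([video_a, video_c])
    stack_failed = False
except ValueError:
    stack_failed = True
```
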
 
mydia.make_grid
This function converts a video into a grid of frames. It is inspired by a similar utility provided in torchvision.
mydia.make_grid(video, num_col=3, padding=5)

Converts a video into a grid of frames.

Parameters:
- video (numpy.ndarray) – A 4-dimensional video tensor (a single video).
- num_col (int) – The number of columns in the grid, defaults to 3.
- padding (int) – Amount of padding (in pixels), defaults to 5.

Returns: A grid of frames (numpy array) of shape (height, width, 3) if the video is in RGB format, or (height, width) if the video is in grayscale.

Return type: numpy.ndarray

Raises:
- ValueError – If the dimension of the video tensor is invalid.

Example:

    import matplotlib.pyplot as plt
    from mydia import Videos, make_grid

    reader = Videos(target_size=(720, 480), to_gray=True)
    video = reader.read("./path/to/video")
    grid = make_grid(video[0], num_col=6, padding=8)
    plt.imshow(grid, cmap="gray")

Note: The input to this function should be a single video tensor, with any data_format. However, the grid of frames produced as the output will always be "channels_last".
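
To give a sense of what tiling frames into a grid involves, here is a rough NumPy sketch. It is not the code behind make_grid: the helper name frames_to_grid is hypothetical, it accepts only channels_last input, and placing padding around every cell (including the outer border) is an assumption.

```python
import numpy as np


def frames_to_grid(video, num_col=3, padding=5):
    """Tile a channels_last video (frames, height, width, channels) into a grid."""
    if video.ndim != 4:
        raise ValueError("expected a 4-dimensional video tensor")
    frames, height, width, channels = video.shape
    num_row = -(-frames // num_col)  # ceiling division
    grid = np.zeros(
        (
            num_row * (height + padding) + padding,
            num_col * (width + padding) + padding,
            channels,
        ),
        dtype=video.dtype,
    )
    for i in range(frames):
        row, col = divmod(i, num_col)
        top = padding + row * (height + padding)
        left = padding + col * (width + padding)
        grid[top:top + height, left:left + width] = video[i]
    # Drop the channel axis for grayscale so the result is (height, width).
    return grid[..., 0] if channels == 1 else grid
```

Cells left over in the last row stay zero-filled (black) when the number of frames is not a multiple of num_col.
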
