dlt.util¶
Functionals¶
compose¶
-
dlt.util.
compose
(transforms)¶ Composes a list of transforms (each accepts and returns one item).
(From PyTorchNet)
Parameters: transforms (list) – List of callables, each accepts and returns one item.
Returns: The composed transforms.
Return type: callable
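A minimal sketch of such a composition in plain Python, assuming the transforms are applied in list order (the first callable runs first). The name `compose_sketch` is hypothetical and is not the dlt implementation.

```python
from functools import reduce


def compose_sketch(transforms):
    """Return a callable that applies each transform in list order."""
    def composed(item):
        # Feed the output of each transform into the next one.
        return reduce(lambda value, fn: fn(value), transforms, item)
    return composed


add_one_then_double = compose_sketch([lambda x: x + 1, lambda x: x * 2])
print(add_one_then_double(3))  # (3 + 1) * 2 = 8
```

An empty list yields the identity function, which is convenient when a pipeline is built conditionally.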
applier¶
-
dlt.util.
applier
(f)¶ Returns a function that applies f to a collection of inputs (or just one).
Useful in conjunction with
dlt.util.compose()
Parameters: f (function) – Function to be applied.
Returns: A function that applies ‘f’ to collections.
Return type: callable
Example
>>> pow2 = dlt.util.applier(lambda x: x**2)
>>> pow2(42)
1764
>>> pow2([1, 2, 3])
[1, 4, 9]
>>> pow2({'a': 1, 'b': 2, 'c': 3})
{'a': 1, 'b': 4, 'c': 9}
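The behaviour in the example can be approximated with the sketch below. It assumes that dicts, lists and tuples are the supported collections and that the container type is preserved; `applier_sketch` is a hypothetical name, not the library function.

```python
def applier_sketch(f):
    """Return a function that maps f over a collection, or applies it to a single item."""
    def apply(x):
        if isinstance(x, dict):
            # Keep the keys, transform the values.
            return {key: f(val) for key, val in x.items()}
        if isinstance(x, (list, tuple)):
            # Rebuild the same container type from the transformed items.
            return type(x)(f(val) for val in x)
        return f(x)
    return apply


pow2 = applier_sketch(lambda x: x ** 2)
print(pow2(42))
print(pow2([1, 2, 3]))
print(pow2({'a': 1, 'b': 2, 'c': 3}))
```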
Quality of Life¶
Logger¶
-
class
dlt.util.
Logger
(name, fields, directory='.', delimiter=', ', resume=True)¶ Logs values in a csv file.
Parameters: - name (str) – Filename without extension.
- fields (list or tuple) – Field names (column headers).
- directory (str, optional) – Directory to save file (default ‘.’).
- delimiter (str, optional) – Delimiter for values (default ‘, ’).
- resume (bool, optional) – If True it appends to an already existing file (default True).
-
log
(values)¶ Logs a row of values.
Parameters: values (dict) – Dictionary containing the names and values.
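A rough sketch of the logging pattern described above, written with the standard library only. It assumes the file is named `<name>.csv`, that the header is written once, and that missing fields are left empty; `CsvLoggerSketch` is a hypothetical class, not the dlt Logger.

```python
import os
import tempfile


class CsvLoggerSketch:
    def __init__(self, name, fields, directory='.', delimiter=', ', resume=True):
        self.path = os.path.join(directory, name + '.csv')
        self.fields = list(fields)
        self.delimiter = delimiter
        # Write the header unless we are resuming into an existing file.
        if not (resume and os.path.exists(self.path)):
            with open(self.path, 'w') as f:
                f.write(self.delimiter.join(self.fields) + '\n')

    def log(self, values):
        # Emit the values in header order; missing fields are left empty.
        row = [str(values.get(field, '')) for field in self.fields]
        with open(self.path, 'a') as f:
            f.write(self.delimiter.join(row) + '\n')


log_dir = tempfile.mkdtemp()
logger = CsvLoggerSketch('training', ['epoch', 'loss'], directory=log_dir)
logger.log({'epoch': 1, 'loss': 0.5})
logger.log({'epoch': 2, 'loss': 0.25})
with open(logger.path) as f:
    lines = f.read().splitlines()
print(lines)
```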
Checkpointer¶
-
class
dlt.util.
Checkpointer
(name, directory='.', overwrite=True, verbose=True, timestamp=False, add_count=True)¶ Checkpointer for objects using torch serialization.
Parameters: - name (str) – Name of the checkpointer. This will also be used for the checkpoint filename.
- directory (str, optional) – Parent directory where the checkpoints will be saved. A new sub-directory called checkpoints will be created (default ‘.’).
- overwrite (bool, optional) – Overwrite/remove the previous checkpoint (default True).
- verbose (bool, optional) – Print a statement when loading a checkpoint (default True).
- timestamp (bool, optional) – Add a timestamp to checkpoint filenames (default False).
- add_count (bool, optional) – Add (zero-padded) counter to checkpoint filenames (default True).
Example
>>> a = {'data': 5}
>>> a_chkp = dlt.util.Checkpointer('a_saved')
>>> a_chkp.save(a)
>>> b = a_chkp.load()
{'data': 5}
It automatically saves and loads the state_dict of objects:
>>> net = nn.Sequential(nn.Linear(1,1), nn.Sigmoid())
>>> net_chkp = dlt.util.Checkpointer('net_state_dict')
>>> net_chkp.save(net)
>>> net_chkp.load(net)
Sequential(
  (0): Linear(in_features=1, out_features=1)
  (1): Sigmoid()
)
>>> state_dict = net_chkp.load()
OrderedDict([('0.weight',
-0.2286
[torch.FloatTensor of size 1x1]
), ('0.bias',
0.8495
[torch.FloatTensor of size 1]
)])
Warning
If a model is wrapped in nn.DataParallel then the wrapped model.module (state_dict) is saved. Thus applying nn.DataParallel must be done after using Checkpointer.load().
-
load
(obj=None, preprocess=None, *args, **kwargs)¶ Loads a checkpoint from disk.
Parameters: - obj (optional) – Needed if we load the state_dict of an nn.Module.
- preprocess (optional) – Callable to preprocess the loaded object.
- args – Arguments to pass to torch.load.
- kwargs – Keyword arguments to pass to torch.load.
Returns: The loaded file.
-
save
(obj, tag=None, *args, **kwargs)¶ Saves a checkpoint of an object.
Parameters: - obj – Object to save (must be serializable by torch).
- tag (str, optional) – Tag to add to saved filename (default None).
- args – Arguments to pass to torch.save.
- kwargs – Keyword arguments to pass to torch.save.
ImageSampler¶
-
class
dlt.util.
ImageSampler
(name, directory='.', overwrite=False, view='torch', ext='.jpg', color=True, size=None, inter_pad=None, fill_value=0, preprocess=None, display=False, save=True, sample_freq=1)¶ Saves and/or displays image samples.
Parameters: - name (str) – Name of the sampler. This will also be used for the sample filenames.
- directory (str, optional) – Parent directory of where the samples will be saved. A new sub-directory called samples will be created (default ‘.’).
- overwrite (bool, optional) – Overwrite/remove the previous samples (default False).
- view (str, optional) – The image view e.g. ‘hwc-bgr’ or ‘torch’ (default ‘torch’).
- ext (str, optional) – The image format for the saved samples (default ‘.jpg’).
- color (bool, optional) – Treat images as colored or not (default True).
- size (list or tuple, optional) – Grid dimensions, rows x columns. (default None).
- inter_pad (python:int, optional) – Padding separating the images (default None).
- fill_value (python:int, optional) – Fill value for inter-padding (default 0).
- preprocess (callable, optional) – Pre processing to apply to the image samples (default None).
- display (bool, optional) – Display images (default False).
- save (bool, optional) – Save images to disk (default True).
- sample_freq (python:int, optional) – Frequency of samples (per sampler call) (default 1).
-
sample
(imgs)¶ Saves and/or displays the image samples, depending on the configuration.
Parameters: imgs (Tensor, Array, list or tuple) – Image samples. Will automatically be put in a grid. Must be in the [0,1] range.
Averages¶
-
class
dlt.util.
Averages
(names)¶ Keeps multiple named averages.
Parameters: names (collection) – Collection of strings to be used as names for the averages.
-
add
(values, count=1)¶ Adds new values
Parameters: - values (dict or list) – Collection of values to be added. Could be given as a dict or a list. Order is preserved.
- count (python:int, optional) – Number of summed values that make the total values given. Can be used to register multiple (summed) values at once (default 1).
-
get
(names=None, ret_dict=False)¶ Returns the current averages
Parameters: - names (str or list, optional) – Names of averages to be returned.
- ret_dict (bool, optional) – If true return the results in a dictionary, otherwise a list.
Returns: The averages.
Return type: dict or list
-
names
()¶ Returns the names of the values held.
-
reset
(names=None)¶ Resets averages to 0.
Parameters: names (collection, optional) – Collection of the names to be reset. If None is given, all the values are reset (default None).
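The bookkeeping behind such a class can be sketched as a running sum and count per name. The class below is a hypothetical illustration (`AveragesSketch`), assuming that `count` is added to the sample count of every value passed in and that an empty average reads as 0.

```python
class AveragesSketch:
    """Keeps a running sum and count per name; the average is sum / count."""

    def __init__(self, names):
        self._names = list(names)
        self._sums = {name: 0.0 for name in self._names}
        self._counts = {name: 0 for name in self._names}

    def add(self, values, count=1):
        # A list is matched to the names by position, a dict by key.
        if not isinstance(values, dict):
            values = dict(zip(self._names, values))
        for name, value in values.items():
            self._sums[name] += value
            self._counts[name] += count

    def get(self, names=None, ret_dict=False):
        if names is None:
            names = self._names
        elif isinstance(names, str):
            names = [names]
        result = {}
        for name in names:
            count = self._counts[name]
            result[name] = self._sums[name] / count if count else 0.0
        return result if ret_dict else [result[name] for name in names]

    def reset(self, names=None):
        for name in (self._names if names is None else names):
            self._sums[name] = 0.0
            self._counts[name] = 0


avgs = AveragesSketch(['loss', 'acc'])
avgs.add({'loss': 1.0, 'acc': 0.5})
avgs.add([6.0, 3.0], count=4)  # four samples whose values sum to 6.0 and 3.0
print(avgs.get(ret_dict=True))
```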
barit¶
-
dlt.util.
barit
(iterable, start=None, end=None, time_it=True, length=20, leave=True, filler='=')¶ Minimal progress bar for iterables.
Parameters: - iterable (list or tuple etc) – An iterable.
- start (str, optional) – String to place in front of the progress bar (default None).
- end (str, optional) – String to add at the end of the progress bar (default None).
- time_it (bool, optional) – Print elapsed time and ETA (default True).
- length (python:int, optional) – Length of the progress bar not including ends (default 20).
- leave (bool, optional) – If False, it deletes the progress bar once it ends (default True).
- filler (str, optional) – Filler character for the progress bar (default ‘=’).
Example
>>> for _ in dlt.util.barit(range(10), start='Count'):
...     time.sleep(0.02)
Count: [====================] 100.0%, (10/10), Total: 0.2s, ETA: 0.0s
Note
barit can be put to silent mode using:
>>> dlt.util.silent = True
For a progress bar with more functionality have a look at tqdm.
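How the bar text in the example is put together can be illustrated with the sketch below. It only renders the string for a given step and leaves out timing and terminal handling; `render_bar` is a hypothetical helper, and the exact layout used by barit is an assumption based on the example output.

```python
def render_bar(done, total, length=20, filler='=', start=None):
    """Build the text of a progress bar for `done` out of `total` steps."""
    filled = int(length * done / total)
    bar = '[' + filler * filled + ' ' * (length - filled) + ']'
    percent = 100.0 * done / total
    prefix = start + ': ' if start else ''
    return '{}{} {:.1f}%, ({}/{})'.format(prefix, bar, percent, done, total)


print(render_bar(5, 10, length=10))
print(render_bar(10, 10, start='Count'))
```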
make_grid¶
-
dlt.util.
make_grid
(images, view='torch', color=True, size=None, inter_pad=None, fill_value=0, scale_each=False)¶ Creates a single image grid from a set of images.
Parameters: - images (Tensor, Array, list or tuple) – Torch Tensor(s) and/or Numpy Array(s).
- view (str, optional) – The image view e.g. ‘hwc-bgr’ or ‘torch’ (default ‘torch’).
- color (bool, optional) – Treat images as colored or not (default True).
- size (list or tuple, optional) – Grid dimensions, rows x columns. (default None).
- inter_pad (python:int or list/tuple, optional) – Padding separating the images (default None).
- fill_value (python:int, optional) – Fill value for inter-padding (default 0).
- scale_each (bool, optional) – Scale each image to [0-1] (default False).
Returns: The resulting grid. If any of the inputs is an Array then the result is an Array, otherwise a Tensor.
Return type: Tensor or Array
Notes
Images of different sizes are padded to match the largest.
Works for color (3 channels) or grey (1 channel/0 channel) images.
Images must have the same view (e.g. chw-rgb (torch))
The Tensors/Arrays can be of any dimension >= 2. The last 2 (grey) or last 3 (color) dimensions are the images and all other dimensions are stacked. E.g. a 4x5x3x256x256 (torch view) input will be treated:
- As 20 3x256x256 color images if color is True.
- As 60 256x256 grey images if color is False.
If color is False, then only the last two dimensions are considered (as hw) thus any colored images will be split into their channels.
The image list can contain both Torch Tensors and Numpy Arrays at the same time as long as they have the same view.
If size is not given, the resulting grid will be the smallest square in which all the images fit. If the images are more than the given size then the default smallest square is used.
Raises: - TypeError – If images are not Arrays, Tensors, a list or a tuple.
- ValueError – If channels or dimensions are wrong.
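The notes on image counting and the default grid size can be worked through with a small sketch. Both helpers below are hypothetical (`count_images`, `default_grid_size`) and only mirror the rules stated in the notes: leading dimensions are stacked, and the fallback grid is the smallest square that fits all images.

```python
import math


def count_images(shape, color=True):
    """Number of images in an input of the given shape (last 3 dims are one color image, last 2 one grey image)."""
    image_dims = 3 if color else 2
    n = 1
    for dim in shape[:-image_dims]:
        n *= dim
    return n


def default_grid_size(n_images, size=None):
    """Return (rows, columns); fall back to the smallest square if size is missing or too small."""
    if size is not None and size[0] * size[1] >= n_images:
        return tuple(size)
    side = math.isqrt(n_images - 1) + 1  # integer ceil(sqrt(n_images)) for n_images >= 1
    return (side, side)


print(count_images((4, 5, 3, 256, 256), color=True))   # 20
print(count_images((4, 5, 3, 256, 256), color=False))  # 60
print(default_grid_size(20))                           # (5, 5)
```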
accuracy¶
-
dlt.util.
accuracy
(output, target, topk=1)¶ Computes the precision@k for the specified values of k.
(From ImageNet example)
Parameters: - output (Tensor) – The predictions from the model.
- target (Tensor) – The class labels.
- topk (int or collection) – The specified values of k.
Returns: The top-k accuracy values.
Return type: list
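For reference, top-k accuracy can be sketched in plain Python as below. It assumes the scores are per-class values for each sample, the targets are class indices, and the result is reported as a percentage (as in the ImageNet example); `topk_accuracy` is a hypothetical helper operating on lists, not the Tensor-based library function.

```python
def topk_accuracy(scores, targets, topk=1):
    """Percentage of samples whose target is among the k highest-scoring classes."""
    if isinstance(topk, int):
        topk = (topk,)
    results = []
    for k in topk:
        correct = 0
        for row, target in zip(scores, targets):
            # Indices of the k largest scores for this sample.
            best = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
            correct += target in best
        results.append(100.0 * correct / len(targets))
    return results


scores = [
    [0.1, 0.7, 0.2],    # predicts class 1
    [0.5, 0.3, 0.2],    # predicts class 0, class 1 is second
    [0.2, 0.3, 0.5],    # predicts class 2, class 1 is second
    [0.9, 0.05, 0.05],  # predicts class 0
]
targets = [1, 1, 0, 0]
print(topk_accuracy(scores, targets, topk=(1, 2)))  # [50.0, 75.0]
```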
Tensor/Array Operations¶
slide_window_¶
-
dlt.util.
slide_window_
(a, kernel, stride=None)¶ Expands last dimension to help compute sliding windows.
Parameters: - a (Tensor or Array) – The Tensor or Array to view as a sliding window.
- kernel (python:int) – The size of the sliding window.
- stride (tuple or python:int, optional) – Strides for viewing the expanded dimension (default 1)
The new dimension is added at the end of the Tensor or Array.
Returns: The expanded Tensor or Array. Running Sum Example:
>>> a = torch.Tensor([1, 2, 3, 4, 5, 6])
 1
 2
 3
 4
 5
 6
[torch.FloatTensor of size 6]
>>> a_slided = dlt.util.slide_window_(a.clone(), kernel=3, stride=1)
 1  2  3
 2  3  4
 3  4  5
 4  5  6
[torch.FloatTensor of size 4x3]
>>> running_total = (a_slided*torch.Tensor([1,1,1])).sum(-1)
  6
  9
 12
 15
[torch.FloatTensor of size 4]
Averaging Example:
>>> a = torch.Tensor([1, 2, 3, 4, 5, 6])
 1
 2
 3
 4
 5
 6
[torch.FloatTensor of size 6]
>>> a_sub_slide = dlt.util.slide_window_(a.clone(), kernel=3, stride=3)
 1  2  3
 4  5  6
[torch.FloatTensor of size 2x3]
>>> a_sub_avg = (a_sub_slide*torch.Tensor([1,1,1])).sum(-1) / 3.0
 2
 5
[torch.FloatTensor of size 2]
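The windowing in both examples can be reproduced on a plain list, which makes the role of kernel and stride easy to check by hand. This is only an illustrative sketch (`sliding_windows` is a hypothetical helper that copies the data, whereas the library function returns a strided view).

```python
def sliding_windows(values, kernel, stride=1):
    """Return every full window of length `kernel`, starting every `stride` elements."""
    last_start = len(values) - kernel
    return [values[i:i + kernel] for i in range(0, last_start + 1, stride)]


a = [1, 2, 3, 4, 5, 6]

# Running sum: overlapping windows (stride 1).
running_total = [sum(w) for w in sliding_windows(a, kernel=3, stride=1)]
print(running_total)  # [6, 9, 12, 15]

# Averaging: non-overlapping windows (stride equal to the kernel).
sub_avg = [sum(w) / 3.0 for w in sliding_windows(a, kernel=3, stride=3)]
print(sub_avg)  # [2.0, 5.0]
```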
re_stride¶
-
dlt.util.
re_stride
(a, kernel, stride=None)¶ Returns a re-shaped and re-strided Tensor given a kernel (uses as_strided).
Parameters: - a (Tensor) – The Tensor to re-stride.
- kernel (tuple or python:int) – The size of the new dimension(s).
- stride (tuple or python:int, optional) – Strides for viewing the expanded dimension(s) (default 1)
replicate¶
-
dlt.util.
replicate
(x, dim=-3, nrep=3)¶ Replicates Tensor/Array in a new dimension.
Parameters: - x (Tensor or Array) – Tensor to replicate.
- dim (python:int, optional) – New dimension where replication happens.
- nrep (python:int, optional) – Number of replications.
Image Conversions¶
permute¶
-
dlt.util.
permute
(x, perm)¶ Permutes the last three dimensions of the input Tensor or Array.
Parameters: - x (Tensor or Array) – Input to be permuted.
- perm (tuple or list) – Permutation.
Note
If the input has fewer than three dimensions a copy is returned.
hwc2chw¶
-
dlt.util.
hwc2chw
(x)¶ Permutes the last three dimensions of the hwc input to become chw.
Parameters: x (Tensor or Array) – Input to be permuted.
chw2hwc¶
-
dlt.util.
chw2hwc
(x)¶ Permutes the last three dimensions of the chw input to become hwc.
Parameters: x (Tensor or Array) – Input to be permuted.
channel_flip¶
-
dlt.util.
channel_flip
(x, dim=-3)¶ Reverses the channel dimension.
Parameters: - x (Tensor or Array) – Input to have its channels flipped.
- dim (python:int, optional) – Channels dimension (default -3).
Note
If the input has fewer than three dimensions a copy is returned.
rgb2bgr¶
-
dlt.util.
rgb2bgr
(x, dim=-3)¶ Reverses the channel dimension. See
channel_flip()
bgr2rgb¶
-
dlt.util.
bgr2rgb
(x, dim=-3)¶ Reverses the channel dimension. See
channel_flip()
change_view¶
-
dlt.util.
change_view
(x, current, new)¶ Changes the view of the input. Returns a copy.
Parameters: - x (Tensor or Array) – Input whose view is to be changed.
- current (str) – Current view.
- new (str) – New view.
Possible views:
View    Aliases
opencv  hwcbgr, hwc-bgr, bgrhwc, bgr-hwc, opencv, open-cv, cv, cv2
torch   chwrgb, chw-rgb, rgbchw, rgb-chw, torch, pytorch
plt     hwcrgb, hwc-rgb, rgbhwc, rgb-hwc, plt, pyplot, matplotlib
other   chwbgr, chw-bgr, bgrchw, bgr-chw
Note
If the input has fewer than three dimensions a copy is returned.
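One way to read the alias table: each view is a pair of axis layout (hwc or chw) and channel order (rgb or bgr), so a view change needs at most one permutation and one channel flip. The sketch below resolves aliases and lists the steps by the names of the functions documented on this page; `conversion_steps` itself is a hypothetical helper and says nothing about how change_view is implemented.

```python
# (layout, channel order) for each view in the table.
VIEWS = {
    'opencv': ('hwc', 'bgr'),
    'torch': ('chw', 'rgb'),
    'plt': ('hwc', 'rgb'),
    'other': ('chw', 'bgr'),
}
ALIASES = {
    'opencv': ('hwcbgr', 'hwc-bgr', 'bgrhwc', 'bgr-hwc', 'opencv', 'open-cv', 'cv', 'cv2'),
    'torch': ('chwrgb', 'chw-rgb', 'rgbchw', 'rgb-chw', 'torch', 'pytorch'),
    'plt': ('hwcrgb', 'hwc-rgb', 'rgbhwc', 'rgb-hwc', 'plt', 'pyplot', 'matplotlib'),
    'other': ('chwbgr', 'chw-bgr', 'bgrchw', 'bgr-chw'),
}
# Reverse lookup: alias -> view name.
CANONICAL = {alias: view for view, aliases in ALIASES.items() for alias in aliases}


def conversion_steps(current, new):
    """List the operations needed to go from one view to another."""
    cur_layout, cur_order = VIEWS[CANONICAL[current]]
    new_layout, new_order = VIEWS[CANONICAL[new]]
    steps = []
    if cur_layout != new_layout:
        steps.append('{}2{}'.format(cur_layout, new_layout))  # hwc2chw or chw2hwc
    if cur_order != new_order:
        steps.append('channel_flip')
    return steps


print(conversion_steps('cv', 'torch'))  # ['hwc2chw', 'channel_flip']
print(conversion_steps('cv2', 'plt'))   # ['channel_flip']
```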
cv2torch¶
-
dlt.util.
cv2torch
(x)¶ Converts input to Tensor and changes view from cv (hwc-bgr) to torch (chw-rgb).
For more detail see
change_view()
torch2cv¶
-
dlt.util.
torch2cv
(x)¶ Converts input to Array and changes view from torch (chw-rgb) to cv (hwc-bgr).
For more detail see
change_view()
cv2plt¶
-
dlt.util.
cv2plt
(x)¶ Changes view from cv (hwc-bgr) to plt (hwc-rgb).
For more detail see
change_view()
plt2cv¶
-
dlt.util.
plt2cv
(x)¶ Changes view from plt (hwc-rgb) to cv (hwc-bgr).
For more detail see
change_view()
plt2torch¶
-
dlt.util.
plt2torch
(x)¶ Converts input to Tensor and changes view from plt (hwc-rgb) to torch (chw-rgb).
For more detail see
change_view()
torch2plt¶
-
dlt.util.
torch2plt
(x)¶ Converts input to Array and changes view from torch (chw-rgb) to plt (hwc-rgb).
For more detail see
change_view()
Math¶
moving_avg¶
-
dlt.util.
moving_avg
(x, width=5)¶ Computes the moving average of a one dimensional Tensor or Array.
Parameters: - x (Tensor or Array) – 1D Tensor or array.
- width (python:int, optional) – Width of the kernel.
moving_var¶
-
dlt.util.
moving_var
(x, width=5)¶ Computes the moving variance of a one dimensional Tensor or Array.
Parameters: - x (Tensor or Array) – 1D Tensor or array.
- width (python:int, optional) – Width of the kernel.
sub_avg¶
-
dlt.util.
sub_avg
(x, width=5)¶ Computes the average of a one dimensional Tensor or Array every width elements.
Parameters: - x (Tensor or Array) – 1D Tensor or array.
- width (python:int, optional) – Width of the kernel.
sub_var¶
-
dlt.util.
sub_var
(x, width=5)¶ Calculates variance of a one dimensional Tensor or Array every width elements.
Parameters: - x (Tensor or Array) – 1D Tensor or array.
- width (python:int, optional) – Width of the kernel.
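The difference between the moving and the sub variants is only the step between windows: 1 for moving statistics, width for sub statistics. A plain-Python sketch follows, assuming only full windows are used (no padding at the edges) and the population variance (division by width); the library may handle either detail differently. All function names are hypothetical.

```python
def _windows(x, width, step):
    """Full windows of length `width`, starting every `step` elements."""
    return [x[i:i + width] for i in range(0, len(x) - width + 1, step)]


def _mean(window):
    return sum(window) / len(window)


def _var(window):
    # Population variance: mean squared deviation from the window mean.
    mean = _mean(window)
    return sum((v - mean) ** 2 for v in window) / len(window)


def moving_avg_sketch(x, width=5):
    return [_mean(w) for w in _windows(x, width, 1)]


def moving_var_sketch(x, width=5):
    return [_var(w) for w in _windows(x, width, 1)]


def sub_avg_sketch(x, width=5):
    return [_mean(w) for w in _windows(x, width, width)]


def sub_var_sketch(x, width=5):
    return [_var(w) for w in _windows(x, width, width)]


data = [1, 2, 3, 4, 5, 6]
print(moving_avg_sketch(data, width=3))  # [2.0, 3.0, 4.0, 5.0]
print(sub_avg_sketch(data, width=3))     # [2.0, 5.0]
```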
has¶
replace_nan_¶
-
dlt.util.
replace_nan_
(x, val=0)¶ Replaces NaNs in a Numpy Array.
Parameters: - x (Array) – The Array (gets replaced in place).
- val (python:int, optional) – Value to replace NaNs with (default 0).
replace_inf_¶
-
dlt.util.
replace_inf_
(x, val=0)¶ Replaces Infs from a Numpy Array.
Parameters: - x (Array) – The Array (gets replaced in place).
- val (python:int, optional) – Value to replace Infs with (default 0).
Convolution Layer Math¶
out_size¶
-
dlt.util.
out_size
(dim_in, k, s, p, d)¶ Calculates the resulting size after a convolutional layer.
Parameters: - dim_in (python:int) – Input dimension size.
- k (python:int) – Kernel size.
- s (python:int) – Stride of convolution.
- p (python:int) – Padding (of input).
- d (python:int) – Dilation
in_size¶
-
dlt.util.
in_size
(dim_out, k, s, p, d)¶ Calculates the input size before a convolutional layer.
Parameters: - dim_out (python:int) – Output dimension size.
- k (python:int) – Kernel size.
- s (python:int) – Stride of convolution.
- p (python:int) – Padding (of input).
- d (python:int) – Dilation
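As a worked example, the sketch below uses the output-size relation that PyTorch documents for its convolution layers, out = floor((in + 2p - d(k - 1) - 1) / s) + 1, and its inverse. It is assumed, not confirmed by this page, that out_size and in_size follow the same relation; the function names are hypothetical, and the inverse returns the smallest input that produces the given output.

```python
def conv_out_size(dim_in, k, s, p, d):
    """Output size of a convolution along one dimension."""
    return (dim_in + 2 * p - d * (k - 1) - 1) // s + 1


def conv_in_size(dim_out, k, s, p, d):
    """Smallest input size that yields `dim_out` along one dimension."""
    return (dim_out - 1) * s - 2 * p + d * (k - 1) + 1


print(conv_out_size(32, k=3, s=1, p=1, d=1))    # 32, a "same" convolution
print(conv_out_size(224, k=7, s=2, p=3, d=1))   # 112
print(conv_in_size(112, k=7, s=2, p=3, d=1))    # 223
```

With a stride larger than 1 several input sizes map to the same output (223 and 224 both give 112 here), which is why the inverse is not unique.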
kernel_size¶
-
dlt.util.
kernel_size
(dim_in, dim_out, s, p, d)¶ Calculates the possible kernel size(s) of a convolutional layer given input and output.
Parameters: - dim_in (python:int) – Input dimension size.
- dim_out (python:int) – Output dimension size.
- s (python:int) – Stride of convolution.
- p (python:int) – Padding (of input).
- d (python:int) – Dilation
stride_size¶
-
dlt.util.
stride_size
(dim_in, dim_out, k, p, d)¶ Calculates the possible stride size(s) of a convolutional layer given input and output.
Parameters: - dim_in (python:int) – Input dimension size.
- dim_out (python:int) – Output dimension size.
- k (python:int) – Kernel size.
- p (python:int) – Padding (of input).
- d (python:int) – Dilation
padding_size¶
-
dlt.util.
padding_size
(dim_in, dim_out, k, s, d)¶ Calculates the possible padding size(s) of a convolutional layer given input and output.
Parameters: - dim_in (python:int) – Input dimension size.
- dim_out (python:int) – Output dimension size.
- k (python:int) – Kernel size.
- s (python:int) – Stride of convolution.
- d (python:int) – Dilation
dilation_size¶
-
dlt.util.
dilation_size
(dim_in, dim_out, k, s, p)¶ Calculates the possible dilation size(s) of a convolutional layer given input and output.
Parameters: - dim_in (python:int) – Input dimension size.
- dim_out (python:int) – Output dimension size.
- k (python:int) – Kernel size.
- s (python:int) – Stride of convolution.
- p (python:int) – Padding (of input).
find_layers¶
-
dlt.util.
find_layers
(dims_in=None, dims_out=None, ks=None, ss=None, ps=None, ds=None)¶ Calculates all the possible convolutional layer size(s) and parameters.
Parameters: - dims_in (list) – Input dimension sizes.
- dims_out (list) – Output dimension sizes.
- ks (list) – Kernel sizes.
- ss (list) – Strides of convolutions.
- ps (list) – Paddings (of inputs).
- ds (list) – Dilations.
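A search of this kind can be sketched as a brute-force enumeration over the candidate values, keeping the combinations whose output size is one of the requested ones. The output-size relation is the one PyTorch documents for convolutions; the return format (a list of dicts) and the name `find_layers_sketch` are assumptions for illustration only.

```python
from itertools import product


def find_layers_sketch(dims_in, dims_out, ks, ss, ps, ds):
    """Enumerate all parameter combinations that map an input size to a wanted output size."""
    matches = []
    for dim_in, k, s, p, d in product(dims_in, ks, ss, ps, ds):
        span = d * (k - 1) + 1  # extent of the dilated kernel
        if dim_in + 2 * p < span:
            continue  # kernel does not fit into the padded input
        dim_out = (dim_in + 2 * p - span) // s + 1
        if dim_out in dims_out:
            matches.append({'dim_in': dim_in, 'dim_out': dim_out,
                            'k': k, 's': s, 'p': p, 'd': d})
    return matches


# Which 3- or 4-wide kernels halve a 32-pixel dimension with stride 2?
layers = find_layers_sketch([32], [16], ks=[3, 4], ss=[2], ps=[0, 1], ds=[1])
for layer in layers:
    print(layer)
```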
Datasets¶
LoadedDataset¶
-
class
dlt.util.
LoadedDataset
(dataset, preprocess=None)¶ Create a torch Dataset from data in memory with on the fly pre-processing.
Useful with the torch DataLoader.
Parameters: - dataset (sequence or collection) – A sequence or collection of data points that can be indexed.
- preprocess (callable, optional) – A function that takes a single data point from the dataset to preprocess on the fly (default None).
Example
>>> a = [1.0, 2.0, 3.0]
>>> a_dataset = dlt.util.LoadedDataset(a, lambda x: x**2)
>>> loader = torch.utils.data.DataLoader(a_dataset, batch_size=3)
>>> for val in loader:
...     print(val)
 1
 4
 9
[torch.DoubleTensor of size 3]
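The wrapper pattern itself is small: a map-style dataset only needs `__len__` and `__getitem__`. The sketch below shows it without torch so it runs anywhere; a real version would presumably subclass torch.utils.data.Dataset. `LoadedDatasetSketch` is a hypothetical name.

```python
class LoadedDatasetSketch:
    """Wraps an indexable collection and applies preprocessing on access."""

    def __init__(self, dataset, preprocess=None):
        self.dataset = dataset
        self.preprocess = preprocess

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, index):
        item = self.dataset[index]
        # Preprocessing happens lazily, each time an item is requested.
        return self.preprocess(item) if self.preprocess is not None else item


squared = LoadedDatasetSketch([1.0, 2.0, 3.0], lambda x: x ** 2)
print(len(squared), [squared[i] for i in range(len(squared))])
```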
DirectoryDataset¶
-
class
dlt.util.
DirectoryDataset
(data_root, extensions, load_fn, preprocess=None)¶ Creates a dataset of images (no label) recursively (no structure requirement).
Similar to torchvision.datasets.FolderDataset, however there is no need for a specific directory structure, or data format.
Parameters: - data_root (string) – Path to root directory of data.
- extensions (list or tuple) – Extensions/ending patterns of data files.
- load_fn (callable) – Function that loads the data files.
- preprocess (callable, optional) – A function that takes a single data point from the dataset to preprocess on the fly (default None).
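The recursive file discovery such a dataset relies on can be sketched with `os.walk`. The helper below (`list_files`, a hypothetical name) collects every file under a root whose name ends with one of the given patterns; sorting the result is an assumption made here to keep the order deterministic.

```python
import os
import tempfile


def list_files(data_root, extensions):
    """Recursively collect files whose names end with one of `extensions`."""
    found = []
    for root, _dirs, files in os.walk(data_root):
        for name in files:
            if name.endswith(tuple(extensions)):
                found.append(os.path.join(root, name))
    return sorted(found)


# Build a small throwaway tree to demonstrate the search.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'a', 'b'))
for rel in ('top.png', os.path.join('a', 'mid.jpg'),
            os.path.join('a', 'b', 'deep.png'), os.path.join('a', 'notes.txt')):
    open(os.path.join(root, rel), 'w').close()

files = list_files(root, ['.png', '.jpg'])
names = sorted(os.path.basename(path) for path in files)
print(names)
```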
Sampling¶
index_gauss¶
-
dlt.util.
index_gauss
(img, precision=None, crop_size=None, random_size=True, ratio=None, seed=None)¶ Returns indices (Numpy slice) of an image crop sampled spatially using a gaussian distribution.
Parameters: - img (Array) – Image as a Numpy array (OpenCV view, hwc-BGR).
- precision (list or tuple, optional) – Floats representing the precision of the Gaussians (default [1, 4])
- crop_size (list or tuple, optional) – Ints representing the crop size (default [img_width/4, img_height/4]).
- random_size (bool, optional) – If true, randomizes the crop size with a minimum of crop_size. It uses an exponential distribution such that smaller crops are more likely (default True).
- ratio (python:float, optional) – Keep a constant crop ratio width/height (default None).
- seed (python:float, optional) – Set a seed for np.random.seed() (default None)
Note
If ratio is None then the resulting ratio can be anything.
If random_size is False and ratio is not None, the largest dimension dictated by the ratio is adjusted accordingly:
- crop_size is (w=100, h=10) and ratio = 9 ==> (w=90, h=10)
- crop_size is (w=100, h=10) and ratio = 0.2 ==> (w=100, h=20)
slice_gauss¶
-
dlt.util.
slice_gauss
(img, precision=None, crop_size=None, random_size=True, ratio=None, seed=None)¶ Returns a cropped sample from an image array using
index_gauss()
index_uniform¶
-
dlt.util.
index_uniform
(img, crop_size=None, random_size=True, ratio=None, seed=None)¶ Returns indices (Numpy slice) of an image crop sampled spatially using a uniform distribution.
Parameters: - img (Array) – Image as a Numpy array (OpenCV view, hwc-BGR).
- crop_size (list or tuple, optional) – Ints representing the crop size (default [img_width/4, img_height/4]).
- random_size (bool, optional) – If true, randomizes the crop size with a minimum of crop_size. It uses an exponential distribution such that smaller crops are more likely (default True).
- ratio (python:float, optional) – Keep a constant crop ratio width/height (default None).
- seed (python:float, optional) – Set a seed for np.random.seed() (default None)
Note
If ratio is None then the resulting ratio can be anything.
If random_size is False and ratio is not None, the largest dimension dictated by the ratio is adjusted accordingly:
- crop_size is (w=100, h=10) and ratio = 9 ==> (w=90, h=10)
- crop_size is (w=100, h=10) and ratio = 0.2 ==> (w=100, h=20)
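A simplified sketch of uniform crop sampling, covering only the fixed-size case (random_size and ratio are left out). It takes the image height and width instead of an array so that it runs without Numpy, and assumes crop_size is given as (width, height) as in the default above. `index_uniform_sketch` is a hypothetical helper.

```python
import random


def index_uniform_sketch(height, width, crop_size=None, seed=None):
    """Return (row_slice, col_slice) of a crop placed uniformly at random."""
    rng = random.Random(seed)
    if crop_size is None:
        crop_size = (width // 4, height // 4)
    crop_w, crop_h = crop_size
    # Any top-left corner that keeps the crop inside the image is equally likely.
    top = rng.randint(0, height - crop_h)
    left = rng.randint(0, width - crop_w)
    return (slice(top, top + crop_h), slice(left, left + crop_w))


rows, cols = index_uniform_sketch(height=480, width=640, seed=0)
print(rows, cols)
```

With a Numpy image in hwc view the crop would then be `img[rows, cols]`.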
slice_uniform¶
-
dlt.util.
slice_uniform
(img, crop_size=None, random_size=True, ratio=None, seed=None)¶ Returns a cropped sample from an image array using
index_uniform()
HPC¶
slurm¶
-
dlt.util.
slurm
(code, directory='.', name='job', directives=None)¶ Creates a script for the Slurm Scheduler.
Parameters: - code (str) – The code that is to be run from the script
- directory (str, optional) – The directory where the script is created (default ‘.’).
- name (str, optional) – Script filename (default ‘job’).
- directives (dict) – Set of directives to use (default None).
Available directives:
key                  Default
--job-name           job
--time               48:00:00
--nodes              1
--ntasks-per-node    1
--mem-per-cpu        None
--mem                None
--partition          None
--gres               None
--exclude            None
--nodelist           None
--output             None
--mail-type          None
--mail-user          None
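A sketch of how directives like these could be turned into a batch script. It assumes a bash shebang, one `#SBATCH key=value` line per directive, and that directives set to None are left out; it returns the script text rather than writing a file. `slurm_script_sketch` is a hypothetical helper, and only the non-None defaults from the table are included.

```python
DEFAULT_DIRECTIVES = {
    '--job-name': 'job',
    '--time': '48:00:00',
    '--nodes': 1,
    '--ntasks-per-node': 1,
}


def slurm_script_sketch(code, directives=None):
    """Build the text of a Slurm batch script from code and directives."""
    merged = dict(DEFAULT_DIRECTIVES)
    merged.update(directives or {})
    lines = ['#!/bin/bash']
    for key, value in merged.items():
        if value is not None:  # unset directives are skipped
            lines.append('#SBATCH {}={}'.format(key, value))
    lines.append('')
    lines.append(code)
    return '\n'.join(lines) + '\n'


script = slurm_script_sketch('python train.py',
                             directives={'--time': '01:00:00', '--mem': '8G'})
print(script)
```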
Misc¶
str2bool¶
-
dlt.util.
str2bool
(x)¶ Converts a string to boolean type.
If the string is any of [‘no’, ‘false’, ‘f’, ‘0’], or any capitalization, e.g. ‘fAlSe’ then returns False. All other strings are True.
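The rule above fits in one line; the sketch below (`str2bool_sketch`, a hypothetical name) is handy as the `type` of a command-line argument, where a plain `bool` would treat any non-empty string as True.

```python
def str2bool_sketch(x):
    """False for 'no', 'false', 'f', '0' in any capitalization, True otherwise."""
    return x.lower() not in ('no', 'false', 'f', '0')


print(str2bool_sketch('fAlSe'), str2bool_sketch('yes'))
```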
paths¶
-
dlt.util.paths.
split
(directory)¶ Splits a full filename path into its directory path, name and extension
Parameters: directory (str) – Directory to split.
Returns: (Directory name, filename, extension)
Return type: tuple
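The same three-way split can be obtained from the standard library. In this sketch the extension keeps its leading dot, which is how `os.path.splitext` behaves; whether dlt.util.paths.split does the same is an assumption. `split_sketch` is a hypothetical name.

```python
import os


def split_sketch(path):
    """Split a file path into (directory, name, extension)."""
    directory, filename = os.path.split(path)
    name, extension = os.path.splitext(filename)
    return directory, name, extension


print(split_sketch('/data/images/cat.png'))
```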
-
dlt.util.paths.
make
(directory)¶ Make a new directory
Parameters: directory (str) – Directory to make.
-
dlt.util.paths.
copy_to_dir
(file, directory)¶ Copies a file to a directory
Parameters: - file (str) – File to copy.
- directory (str) – Directory to copy file to.
-
dlt.util.paths.
process
(directory, create=False)¶ Expands home path, finds absolute path and creates directory (if create is True).
Parameters: - directory (str) – Directory to process.
- create (bool, optional) – If True, it creates the directory.
Returns: The processed directory.
Return type: str
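The three steps in the description map directly onto standard-library calls. A minimal sketch, with `process_sketch` as a hypothetical stand-in for the library function:

```python
import os
import tempfile


def process_sketch(directory, create=False):
    """Expand '~', make the path absolute and optionally create the directory."""
    path = os.path.abspath(os.path.expanduser(directory))
    if create:
        os.makedirs(path, exist_ok=True)  # no error if it already exists
    return path


base = tempfile.mkdtemp()
target = process_sketch(os.path.join(base, 'runs', 'exp1'), create=True)
print(target)
```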
-
dlt.util.paths.
write_file
(contents, filename, directory='.', append=False)¶ Writes contents to file.
Parameters: - contents (str) – Contents to write to file.
- filename (str) – File to write contents to.
- directory (str, optional) – Directory to put file in.
- append (bool, optional) – If True and file exists, it appends contents.
Returns: Full path to file.
Return type: str