micromind.utils package
Submodules
micromind.utils.checkpointer module
micromind checkpointer. Unwraps models and saves them to disk together with the optimizer state, etc.
- Authors:
Francesco Paissan, 2023
- class micromind.utils.checkpointer.Checkpointer(experiment_folder: str | Path, key: str | None = 'loss', mode: str | None = 'min', hparams: Namespace | None = None)[source]
Bases: object
Checkpointer class. Supports min/max modes for arbitrary keys (metrics or loss). Always saves the best and last checkpoints in the experiment folder.
- Parameters:
experiment_folder (Union[str, Path]) – Experiment folder. Used to load / store checkpoints.
key (Optional[str]) – Key to be logged. It should be the name of the Metric, or “loss”. Defaults to “loss”.
mode (Optional[str]) – Either min or max. If min, will store the checkpoint with the lowest value for key. If max, it does the opposite.
Example
>>> from micromind.utils.checkpointer import Checkpointer
>>> from micromind.utils.checkpointer import create_experiment_folder
>>> exp_folder = create_experiment_folder("/tmp", "test_mm")
>>> check = Checkpointer(exp_folder)
- micromind.utils.checkpointer.create_experiment_folder(output_folder: Path | str, exp_name: Path | str) Path [source]
Creates the experiment folder used to log data.
- Parameters:
output_folder (Union[Path, str]) – General output folder (can be shared between more experiments).
exp_name (Union[Path, str]) – Name of the experiment, to be concatenated to the output_folder.
- Returns:
Experiment folder
- Return type:
Union[Path, str]
micromind.utils.helpers module
micromind helper functions.
- Authors:
Francesco Paissan, 2023
- micromind.utils.helpers.get_logger()[source]
Default loguru logger configuration. It is used internally by micromind.
- micromind.utils.helpers.override_conf(hparams: Dict)[source]
Handles command line overrides. Takes as input a configuration and defines all the keys as arguments. If passed from command line, these arguments override the default configuration.
- Parameters:
hparams (Dict) – Dictionary containing current configuration.
- Returns:
Configuration augmented with overrides.
- Return type:
Namespace
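Example (a minimal sketch; the configuration keys shown are hypothetical):
>>> from micromind.utils.helpers import override_conf
>>> conf = override_conf({"lr": 0.001, "epochs": 10})  # keys become CLI arguments
>>> conf.lr  # 0.001 unless overridden from the command line, e.g. `--lr 0.01`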
- micromind.utils.helpers.parse_configuration(cfg: str | Path)[source]
Parses the default configuration and compares it with the user-defined one. It processes a user-defined Python file that creates the configuration, and handles any overrides passed from the command line.
- Parameters:
cfg (Union[str, Path]) – Configuration file defined by the user.
- Returns:
Configuration Namespace.
- Return type:
argparse.Namespace
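Example (a minimal sketch; the file path and the attribute accessed are hypothetical):
>>> from micromind.utils.helpers import parse_configuration
>>> hparams = parse_configuration("cfg/experiment.py")  # hypothetical path
>>> hparams.output_folder  # values are accessed as Namespace attributes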
micromind.utils.yolo module
Helper functions.
- Authors:
Matteo Beltrami, 2023
Francesco Paissan, 2023
- micromind.utils.yolo.autopad(k, p=None, d=1)[source]
Calculate padding value for a convolution operation based on kernel size and dilation.
This function computes the padding value for a convolution operation to maintain the spatial size of the input tensor.
- Parameters:
k (int) – Kernel size for the convolution operation. If a single integer is provided, it’s assumed that all dimensions have the same kernel size.
p (int, optional) – Padding value for the convolution operation. If not provided, it will be calculated to maintain the spatial size of the input tensor.
d (int, optional) – Dilation for the convolution operation. Default is 1.
- Returns:
The padding value to maintain the spatial size of the input tensor.
- Return type:
int
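Example (a sketch assuming the usual YOLO padding rule: effective kernel d*(k-1)+1, padding k // 2):
>>> from micromind.utils.yolo import autopad
>>> autopad(3)       # 3 // 2 -> 1: keeps spatial size for a stride-1 convolution
1
>>> autopad(3, d=2)  # effective kernel 2*(3-1)+1 = 5 -> padding 2
2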
- micromind.utils.yolo.average_precision(predictions, ground_truth, class_id, iou_threshold=0.5)[source]
Calculate the average precision (AP) for a specific class in YOLO predictions.
- Parameters:
predictions (list) – List of prediction boxes in the format [x1, y1, x2, y2, confidence, class_id].
ground_truth (list) – List of ground truth boxes in the same format.
class_id (int) – The class ID for which to calculate AP.
iou_threshold (float) – The IoU threshold for considering a prediction as correct.
- Returns:
The average precision for the specified class.
- Return type:
float
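Example (a minimal sketch with hand-made boxes; the layouts follow the documented format and the values are illustrative):
>>> from micromind.utils.yolo import average_precision
>>> predictions = [[0.0, 0.0, 10.0, 10.0, 0.9, 0]]   # [x1, y1, x2, y2, confidence, class_id]
>>> ground_truth = [[1.0, 1.0, 10.0, 10.0, 1.0, 0]]  # same layout, assumed for this sketch
>>> ap = average_precision(predictions, ground_truth, class_id=0, iou_threshold=0.5)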
- micromind.utils.yolo.bbox_format(box)[source]
Convert a tensor of coordinates [x1, y1, x2, y2], representing two points that define a rectangle, to the format [x_min, y_min, x_max, y_max], where (x_min, y_min) is the top-left corner and (x_max, y_max) is the bottom-right corner of the rectangle.
- Parameters:
box (torch.Tensor) – A tensor of coordinates in the format [x1, y1, x2, y2] where x1, y1, x2, y2 represent the coordinates of two points defining a rectangle.
- Returns:
The coordinates in the format [x_min, y_min, x_max, y_max] where x_min, y_min represent the top-left vertex, and x_max, y_max represent the bottom-right vertex of the rectangle.
- Return type:
torch.Tensor
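Example (a sketch of the expected behavior, assuming the function orders each coordinate pair into min/max form):
>>> import torch
>>> from micromind.utils.yolo import bbox_format
>>> box = torch.tensor([8.0, 9.0, 2.0, 3.0])  # two corner points, unordered
>>> bbox_format(box)  # expected: [2., 3., 8., 9.] (top-left, then bottom-right)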
- micromind.utils.yolo.box_area(box)[source]
Calculate the area of bounding boxes.
This function calculates the area of bounding boxes represented as [x1, y1, x2, y2].
- Parameters:
box (torch.Tensor) – A tensor containing bounding boxes in the format [x1, y1, x2, y2].
- Returns:
A tensor containing the area of each bounding box.
- Return type:
torch.Tensor
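Example (a minimal sketch; a 10x5 box has area 50):
>>> import torch
>>> from micromind.utils.yolo import box_area
>>> boxes = torch.tensor([[0.0, 0.0, 10.0, 5.0]])
>>> box_area(boxes)  # (x2 - x1) * (y2 - y1) = 10 * 5 = 50 per box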
- micromind.utils.yolo.box_iou(box1, box2)[source]
Calculate the Intersection over Union (IoU) between two sets of bounding boxes.
This function computes the IoU between two sets of bounding boxes.
- Parameters:
box1 (numpy.ndarray) – The first set of bounding boxes in the format [x1, y1, x2, y2].
box2 (numpy.ndarray) – The second set of bounding boxes in the format [x1, y1, x2, y2].
- Returns:
A 2D numpy array containing the IoU between each pair of bounding boxes in box1 and box2.
- Return type:
numpy.ndarray
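Example (a sketch comparing one box against two; the second pair overlaps by 25 over a union of 175, i.e. IoU ≈ 0.143):
>>> import numpy as np
>>> from micromind.utils.yolo import box_iou
>>> a = np.array([[0.0, 0.0, 10.0, 10.0]])
>>> b = np.array([[0.0, 0.0, 10.0, 10.0], [5.0, 5.0, 15.0, 15.0]])
>>> box_iou(a, b)  # 1x2 IoU matrix, expected ~[[1.0, 0.143]]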
- micromind.utils.yolo.calculate_iou(box1, box2)[source]
Calculate the Intersection over Union (IoU) between two bounding boxes.
- Parameters:
box1 (torch.Tensor) – First bounding box in the format [x1, y1, x2, y2].
box2 (torch.Tensor) – Second bounding box in the format [x1, y1, x2, y2].
- Returns:
The intersection over union of the two bounding boxes.
- Return type:
float
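Example (the same computation for a single pair of boxes):
>>> import torch
>>> from micromind.utils.yolo import calculate_iou
>>> a = torch.tensor([0.0, 0.0, 10.0, 10.0])
>>> b = torch.tensor([5.0, 5.0, 15.0, 15.0])
>>> calculate_iou(a, b)  # intersection 25 / union 175 ≈ 0.143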
- micromind.utils.yolo.clip_boxes(boxes, shape)[source]
Clip bounding boxes to stay within image boundaries.
This function clips bounding boxes to ensure that they stay within the boundaries of the image.
- Parameters:
boxes (torch.Tensor) – A tensor containing bounding boxes in the format [x1, y1, x2, y2].
shape (tuple) – A tuple representing the shape of the image in the format (height, width).
- Returns:
A tensor containing the clipped bounding boxes.
- Return type:
torch.Tensor
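Example (a minimal sketch on a 480x640 image, with shape given as (height, width)):
>>> import torch
>>> from micromind.utils.yolo import clip_boxes
>>> boxes = torch.tensor([[-5.0, 10.0, 700.0, 300.0]])
>>> clip_boxes(boxes, (480, 640))  # expected: [[0., 10., 640., 300.]]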
- micromind.utils.yolo.compute_transform(image, new_shape=(640, 640), auto=False, scaleFill=False, scaleup=True, stride=32)[source]
Compute a transformation of an image to the specified size and format.
This function computes a transformation of the input image to the specified new size and format, while optionally maintaining the aspect ratio or adding padding as needed.
- Parameters:
image (torch.Tensor) – The input image to be transformed.
new_shape (int or tuple, optional) – The target size of the transformed image. If an integer is provided, the image is resized to have the same width and height. If a tuple of two integers is provided, it represents the new width and height. Default is (640, 640).
auto (bool, optional) – If True, automatically calculates padding to ensure the output size is divisible by the specified stride. Default is False.
scaleFill (bool, optional) – If True, scales the image to completely fill the target size without maintaining the aspect ratio. Default is False.
scaleup (bool, optional) – If True, allows the image to be scaled up (enlarged) if necessary. Default is True.
stride (int, optional) – The stride value used for padding calculation when auto is True. Default is 32.
- Returns:
The transformed image.
- Return type:
numpy.ndarray
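Example (a minimal sketch; the HWC uint8 input layout is an assumption, not verified against the implementation):
>>> import torch
>>> from micromind.utils.yolo import compute_transform
>>> img = torch.zeros(480, 640, 3, dtype=torch.uint8)  # HWC frame, layout assumed
>>> out = compute_transform(img, new_shape=(640, 640))  # letterboxed to 640x640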
- micromind.utils.yolo.dist2bbox(distance, anchor_points, xywh=True, dim=-1)[source]
Convert distance predictions to bounding box coordinates.
This function takes distance predictions and anchor points to calculate bounding box coordinates.
- Parameters:
distance (torch.Tensor) – Tensor containing distance predictions as (left, top, right, bottom) offsets from the anchor points.
anchor_points (torch.Tensor) – Tensor containing anchor points used for the conversion.
xywh (bool, optional) – If True, the function returns bounding boxes in the format [center_x, center_y, width, height]. If False, it returns bounding boxes in the format [x1, y1, x2, y2]. Default is True.
dim (int, optional) – The dimension along which the tensor is split into lt and rb. Default is -1.
- Returns:
Converted bounding box coordinates in the specified format.
- Return type:
torch.Tensor
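Example (a worked sketch for a single anchor: with offsets (left, top, right, bottom) = (2, 3, 4, 5) around anchor (10, 10), the corners are (10-2, 10-3) and (10+4, 10+5)):
>>> import torch
>>> from micromind.utils.yolo import dist2bbox
>>> anchors = torch.tensor([[10.0, 10.0]])
>>> dist = torch.tensor([[2.0, 3.0, 4.0, 5.0]])  # left, top, right, bottom
>>> dist2bbox(dist, anchors, xywh=False)         # expected: [[8., 7., 14., 15.]]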
- micromind.utils.yolo.draw_bounding_boxes_and_save(orig_img_paths, output_img_paths, all_predictions, class_labels, iou_threshold=0.5)[source]
Draw bounding boxes on images based on object detection predictions and save the result.
This function draws bounding boxes on images based on object detection predictions and saves the result. It also prints the number of objects detected for each class.
- Parameters:
orig_img_paths (list of str) – A list of file paths to the original input images.
output_img_paths (list of str) – A list of file paths to save the images with bounding boxes.
all_predictions (list of list of numpy.ndarray) – A list of lists of prediction arrays from the object detection model.
class_labels (list of str) – A list of class labels corresponding to the object classes.
iou_threshold (float, optional) – The IoU threshold used for non-maximum suppression to remove overlapping bounding boxes. Default is 0.5.
- Return type:
None
- micromind.utils.yolo.load_config(file_path)[source]
Load configuration from a YAML file and preprocess it for training.
- Parameters:
file_path (str) – Path to the YAML configuration file.
- Returns:
m_cfg (types.SimpleNamespace) – Model configuration containing task-specific parameters.
data_cfg (dict) – Data configuration containing paths and settings for train, val and test.
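Example (a minimal sketch; the YAML path and the dictionary key are hypothetical):
>>> from micromind.utils.yolo import load_config
>>> m_cfg, data_cfg = load_config("cfg/coco8.yaml")  # hypothetical path
>>> data_cfg["train"]  # hypothetical key: path to the training split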
- micromind.utils.yolo.make_anchors(feats, strides, grid_cell_offset=0.5)[source]
Generate anchor points and stride tensors.
This function generates anchor points for each feature map and stride combination. It is commonly used in object detection tasks to define anchor boxes.
- Parameters:
feats (list of torch.Tensor) – Feature maps from which anchor points will be generated.
strides (torch.Tensor) – Stride values corresponding to each feature map. Strides define the spacing between anchor points.
grid_cell_offset (float, optional) – Offset to be added to the grid cell coordinates when generating anchor points. Default is 0.5.
- Returns:
anchor_points (torch.Tensor) – Concatenated anchor points for all feature maps as a 2D tensor.
stride_tensor (torch.Tensor) – Concatenated stride values for all anchor points as a 2D tensor.
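Example (a sketch for a 640x640 input with strides 8, 16 and 32; the feature-map shapes follow from the strides, and passing a list of maps is an assumption based on the description above):
>>> import torch
>>> from micromind.utils.yolo import make_anchors
>>> feats = [torch.zeros(1, 64, 80, 80), torch.zeros(1, 64, 40, 40), torch.zeros(1, 64, 20, 20)]
>>> anchor_points, stride_tensor = make_anchors(feats, torch.tensor([8.0, 16.0, 32.0]))
>>> # 80*80 + 40*40 + 20*20 = 8400 anchor points in total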
- micromind.utils.yolo.mean_average_precision(post_predictions, batch, batch_bboxes, iou_threshold=0.5)[source]
Calculate the mean average precision (mAP) for all classes in YOLO predictions.
- Parameters:
post_predictions (list) – List of post-processed predictions for bounding boxes.
batch (dict) – A dictionary containing batch information, including image files and batch indices.
batch_bboxes (torch.Tensor) – Tensor containing batch bounding boxes.
iou_threshold (float) – The IoU threshold for considering a prediction as correct.
- Returns:
The mean average precision (mAP).
- Return type:
float
- micromind.utils.yolo.non_max_suppression(prediction, conf_thres=0.25, iou_thres=0.45, classes=None, agnostic=False, multi_label=False, labels=(), max_det=300, nc=0, max_time_img=0.05, max_nms=30000, max_wh=7680)[source]
Perform non-maximum suppression (NMS) on a set of boxes, with support for masks and multiple labels per box.
- Parameters:
prediction (torch.Tensor) – A tensor of shape (batch_size, num_classes + 4 + num_masks, num_boxes) containing the predicted boxes, classes, and masks. The tensor should be in the format output by a model, such as YOLO.
conf_thres (float, optional) – The confidence threshold below which boxes will be filtered out. Valid values are between 0.0 and 1.0. Default is 0.25.
iou_thres (float, optional) – The IoU threshold below which boxes will be filtered out during NMS. Valid values are between 0.0 and 1.0. Default is 0.45.
classes (List[int], optional) – A list of class indices to consider. If None, all classes will be considered.
agnostic (bool, optional) – If True, the model is agnostic to the number of classes, and all classes will be considered as one. Default is False.
multi_label (bool, optional) – If True, each box may have multiple labels. Default is False.
labels (List[List[Union[int, float, torch.Tensor]]], optional) – A list of lists, where each inner list contains the a priori labels for a given image. The list should be in the format output by a dataloader, with each label being a tuple of (class_index, x1, y1, x2, y2).
max_det (int, optional) – The maximum number of boxes to keep after NMS. Default is 300.
nc (int, optional) – The number of classes output by the model. Any indices after this will be considered masks. Default is 0.
max_time_img (float, optional) – The maximum time (seconds) for processing one image. Default is 0.05.
max_nms (int, optional) – The maximum number of boxes passed into torchvision.ops.nms(). Default is 30000.
max_wh (int, optional) – The maximum box width and height in pixels. Default is 7680.
- Returns:
A list of length batch_size, where each element is a tensor of shape (num_boxes, 6 + num_masks) containing the kept boxes, with columns (x1, y1, x2, y2, confidence, class, mask1, mask2, …).
- Return type:
List[torch.Tensor]
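Example (a shape-level sketch matching the documented input layout; random values stand in for real model output):
>>> import torch
>>> from micromind.utils.yolo import non_max_suppression
>>> preds = torch.randn(1, 84, 8400)  # batch 1, 4 box coords + 80 classes, 8400 candidates
>>> out = non_max_suppression(preds, conf_thres=0.25, iou_thres=0.45, nc=80)
>>> # out: list of length 1, each entry (num_boxes, 6): x1, y1, x2, y2, confidence, class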
- micromind.utils.yolo.postprocess(preds, img, orig_imgs)[source]
Perform post-processing on the predictions.
This function applies post-processing to the predictions, including Non-Maximum Suppression (NMS) and scaling of bounding boxes.
- Parameters:
preds (list of numpy.ndarray) – A list of prediction arrays from the object detection model.
img (numpy.ndarray) – The input image on which the predictions were made.
orig_imgs (numpy.ndarray or list of numpy.ndarray) – The original image(s) before any preprocessing.
- Returns:
A list of post-processed prediction arrays, each containing bounding boxes and associated information.
- Return type:
list of numpy.ndarray
- micromind.utils.yolo.preprocess(im, imgsz=640, model_stride=32, model_pt=True)[source]
Preprocess a batch of images for inference.
This function preprocesses a batch of images for inference by resizing, transforming, and normalizing them.
- Parameters:
im (torch.Tensor or list of torch.Tensor) – An input image or a batch of images to be preprocessed.
imgsz (int, optional) – The target size of the images after preprocessing. Default is 640.
model_stride (int, optional) – The stride value used for padding calculation when auto is True in compute_transform. Default is 32.
model_pt (bool, optional) – If True, the function automatically calculates the padding to maintain the same shapes for all input images in the batch. Default is True.
- Returns:
The preprocessed batch of images as a torch.Tensor with shape (n, 3, h, w), where n is the number of images, 3 represents the RGB channels, and h and w are the height and width of the images.
- Return type:
torch.Tensor
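Example (a minimal sketch; the HWC uint8 input layout is an assumption):
>>> import torch
>>> from micromind.utils.yolo import preprocess
>>> frame = torch.randint(0, 255, (480, 640, 3), dtype=torch.uint8)  # HWC layout assumed
>>> batch = preprocess([frame], imgsz=640)  # expected shape: (1, 3, h, w), normalized values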
- micromind.utils.yolo.scale_boxes(img1_shape, boxes, img0_shape, ratio_pad=None)[source]
Scale bounding boxes to match a different image shape.
This function scales bounding boxes to match a different image shape while maintaining their aspect ratio.
- Parameters:
img1_shape (tuple) – A tuple representing the shape of the target image in the format (height, width).
boxes (torch.Tensor) – A tensor containing bounding boxes in the format [x1, y1, x2, y2].
img0_shape (tuple) – A tuple representing the shape of the source image in the format (height, width).
ratio_pad (float or None, optional) – A scaling factor for the bounding boxes. If None, it is calculated based on the aspect ratio of the images. Default is None.
- Returns:
A tensor containing the scaled bounding boxes.
- Return type:
torch.Tensor
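Example (a sketch mapping boxes from a 640x640 letterboxed image back to the original 480x640 frame):
>>> import torch
>>> from micromind.utils.yolo import scale_boxes
>>> boxes = torch.tensor([[100.0, 120.0, 200.0, 220.0]])  # in 640x640 letterbox space
>>> scale_boxes((640, 640), boxes, (480, 640))  # rescaled to the 480x640 image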
- micromind.utils.yolo.xywh2xyxy(x)[source]
Convert bounding box coordinates from (x, y, width, height) to (x1, y1, x2, y2) format.
This function converts bounding box coordinates from the format (center_x, center_y, width, height) to the format (x1, y1, x2, y2), where (x1, y1) represents the top-left corner and (x2, y2) represents the bottom-right corner of the bounding box.
- Parameters:
x (torch.Tensor) – A tensor containing bounding box coordinates in the format (center_x, center_y, width, height).
- Returns:
A tensor containing bounding box coordinates in the format (x1, y1, x2, y2).
- Return type:
torch.Tensor
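Example (a worked sketch: a 20x10 box centered at (50, 40) has corners (40, 35) and (60, 45)):
>>> import torch
>>> from micromind.utils.yolo import xywh2xyxy
>>> xywh = torch.tensor([[50.0, 40.0, 20.0, 10.0]])
>>> xywh2xyxy(xywh)  # expected: [[40., 35., 60., 45.]]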