micromind.networks package

Submodules

micromind.networks.phinet module

Code for PhiNets (https://doi.org/10.1145/3510832).

Authors:
  • Francesco Paissan, 2023

  • Alberto Ancilotto, 2023

  • Matteo Beltrami, 2023

  • Matteo Tremonti, 2023

class micromind.networks.phinet.DepthwiseConv2d(in_channels, depth_multiplier=1, kernel_size=3, stride=1, padding=0, dilation=1, bias=False, padding_mode='zeros')[source]

Bases: Conv2d

Depthwise 2D convolution layer.

Parameters:
  • in_channels (int) – Number of input channels.

  • depth_multiplier (int, optional) – The channel multiplier for the output channels (default is 1).

  • kernel_size (int or tuple, optional) – Size of the convolution kernel (default is 3).

  • stride (int or tuple, optional) – Stride of the convolution (default is 1).

  • padding (int or tuple, optional) – Zero-padding added to both sides of the input (default is 0).

  • dilation (int or tuple, optional) – Spacing between kernel elements (default is 1).

  • bias (bool, optional) – If True, adds a learnable bias to the output (default is False).

  • padding_mode (str, optional) – ‘zeros’ or ‘circular’. Padding mode for convolution (default is ‘zeros’).
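
Example

A minimal usage sketch (assuming, per the parameter description above, that the output channel count equals in_channels * depth_multiplier):

>>> import torch
>>> from micromind.networks.phinet import DepthwiseConv2d
>>> layer = DepthwiseConv2d(in_channels=32, depth_multiplier=2, kernel_size=3, padding=1)
>>> layer(torch.randn(1, 32, 64, 64)).shape
torch.Size([1, 64, 64, 64])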

class micromind.networks.phinet.PhiNet(input_shape: List[int], num_layers: int = 7, alpha: float = 0.2, beta: float = 1.0, t_zero: float = 6, include_top: bool = False, num_classes: int = 10, compatibility: bool = False, downsampling_layers: List[int] = [5, 7], conv5_percent: float = 0.0, first_conv_stride: int = 2, residuals: bool = True, conv2d_input: bool = False, pool: bool = False, h_swish: bool = True, squeeze_excite: bool = True, divisor: int = 1, return_layers=None)[source]

Bases: Module

This class implements the PhiNet architecture.

Parameters:
  • input_shape (tuple) – Input resolution as (C, H, W).

  • num_layers (int) – Number of convolutional blocks.

  • alpha (float) – Width multiplier for PhiNet architecture.

  • beta (float) – Shape factor of PhiNet.

  • t_zero (float) – Base expansion factor for PhiNet.

  • include_top (bool) – Whether to include classification head or not.

  • num_classes (int) – Number of classes for the classification head.

  • compatibility (bool) – True to maximise compatibility among embedded platforms (changes network).

forward(x)[source]

Executes the PhiNet network.

Parameters:

x (torch.Tensor) – Network input.

Returns:

Logits if include_top=True, otherwise embeddings.

Return type:

torch.Tensor
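
Example

A forward-pass sketch (the logits shape assumes include_top=True with num_classes=10):

>>> import torch
>>> from micromind.networks import PhiNet
>>> model = PhiNet((3, 224, 224), include_top=True, num_classes=10)
>>> model(torch.randn(1, 3, 224, 224)).shape
torch.Size([1, 10])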

get_MAC()[source]

Returns number of MACs for this architecture.

Returns:

Number of MACs for this network.

Return type:

int

Example

>>> from micromind.networks import PhiNet
>>> model = PhiNet((3, 224, 224))
>>> model.get_MAC()
9817670
get_complexity()[source]

Returns MAC and number of parameters of initialized architecture.

Returns:

Dictionary with complexity characterization of the network.

Return type:

dict

Example

>>> from micromind.networks import PhiNet
>>> model = PhiNet((3, 224, 224))
>>> model.get_complexity()
{'MAC': 9817670, 'params': 30917}
get_params()[source]

Returns number of params for this architecture.

Returns:

Number of parameters for this network.

Return type:

int

Example

>>> from micromind.networks import PhiNet
>>> model = PhiNet((3, 224, 224))
>>> model.get_params()
30917
class micromind.networks.phinet.PhiNetConvBlock(in_shape, expansion, stride, filters, has_se, block_id=None, res=True, h_swish=True, k_size=3, dp_rate=0.05, divisor=1)[source]

Bases: Module

Implements PhiNet’s convolutional block.

Parameters:
  • in_shape (tuple) – Input shape of the conv block.

  • expansion (float) – Expansion coefficient for this convolutional block.

  • stride (int) – Stride for the conv block.

  • filters (int) – Output channels of the convolutional block.

  • block_id (int) – ID of the convolutional block.

  • has_se (bool) – Whether to use Squeeze-and-Excite or not.

  • res (bool) – Whether to use the residual connection or not.

  • h_swish (bool) – Whether to use HSwish or not.

  • k_size (int) – Kernel size for the depthwise convolution.

forward(x)[source]

Executes the PhiNet convolutional block.

Parameters:

x (torch.Tensor) – Input to the convolutional block.

Returns:

Output of the convolutional block.

Return type:

torch.Tensor
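
Example

A usage sketch, assuming in_shape follows the (C, H, W) convention used by PhiNet; with stride 1 and filters equal to the input channels, the residual path keeps the shape unchanged:

>>> import torch
>>> from micromind.networks.phinet import PhiNetConvBlock
>>> block = PhiNetConvBlock((32, 56, 56), expansion=4, stride=1, filters=32, has_se=True, block_id=1)
>>> block(torch.randn(1, 32, 56, 56)).shape
torch.Size([1, 32, 56, 56])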

class micromind.networks.phinet.ReLUMax(max)[source]

Bases: Module

Implements ReLUMax.

Parameters:

max (float) – The maximum value for the clamp operation.

forward(x)[source]

Forward pass of ReLUMax.

Parameters:

x (torch.Tensor) – Input tensor.

Returns:

Output tensor after applying ReLU with max value.

Return type:

torch.Tensor
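
Example

A minimal sketch: ReLUMax clamps activations to [0, max]:

>>> import torch
>>> from micromind.networks.phinet import ReLUMax
>>> act = ReLUMax(6.0)
>>> act(torch.tensor([-2.0, 3.0, 8.0]))
tensor([0., 3., 6.])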

class micromind.networks.phinet.SEBlock(in_channels, out_channels, h_swish=True)[source]

Bases: Module

Implements squeeze-and-excitation block.

Parameters:
  • in_channels (int) – Input number of channels.

  • out_channels (int) – Output number of channels.

  • h_swish (bool, optional) – Whether to use the h-swish activation (default is True).

forward(x)[source]

Executes the squeeze-and-excitation block.

Parameters:

x (torch.Tensor) – Input tensor.

Returns:

Output of the squeeze-and-excitation block.

Return type:

torch.Tensor
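
Example

A usage sketch, assuming out_channels acts as the squeeze (bottleneck) width and the block rescales its input, so the output keeps the input's shape:

>>> import torch
>>> from micromind.networks.phinet import SEBlock
>>> se = SEBlock(in_channels=64, out_channels=16)
>>> se(torch.randn(1, 64, 32, 32)).shape
torch.Size([1, 64, 32, 32])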

class micromind.networks.phinet.SeparableConv2d(in_channels, out_channels, activation=<function relu>, kernel_size=3, stride=1, padding=0, dilation=1, bias=True, padding_mode='zeros', depth_multiplier=1)[source]

Bases: Module

Implements SeparableConv2d.

Parameters:
  • in_channels (int) – Input number of channels.

  • out_channels (int) – Output number of channels.

  • activation (function, optional) – Activation function to apply (default is torch.nn.functional.relu).

  • kernel_size (int, optional) – Kernel size (default is 3).

  • stride (int, optional) – Stride for convolution (default is 1).

  • padding (int, optional) – Padding for convolution (default is 0).

  • dilation (int, optional) – Dilation factor for convolution (default is 1).

  • bias (bool, optional) – If True, adds a learnable bias to the output (default is True).

  • padding_mode (str, optional) – Padding mode for convolution (default is ‘zeros’).

  • depth_multiplier (int, optional) – Depth multiplier (default is 1).

forward(x)[source]

Executes the SeparableConv2d block.

Parameters:

x (torch.Tensor) – Input tensor.

Returns:

Output of the convolution.

Return type:

torch.Tensor
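
Example

A minimal usage sketch (with kernel_size=3 and padding=1, spatial resolution is preserved):

>>> import torch
>>> from micromind.networks.phinet import SeparableConv2d
>>> conv = SeparableConv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1)
>>> conv(torch.randn(1, 32, 64, 64)).shape
torch.Size([1, 64, 64, 64])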

micromind.networks.phinet.correct_pad(input_shape, kernel_size)[source]

Returns a tuple for zero-padding for 2D convolution with downsampling.

Parameters:
  • input_shape (tuple or list) – Shape of the input tensor (height, width).

  • kernel_size (int or tuple) – Size of the convolution kernel.

Returns:

A tuple representing the zero-padding in the format (left, right, top, bottom).

Return type:

tuple
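
Example

A sample call; the exact tuple depends on input parity (the output shown assumes the Keras-style convention of asymmetric padding for even inputs):

>>> from micromind.networks.phinet import correct_pad
>>> correct_pad((224, 224), 3)
(0, 1, 0, 1)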

micromind.networks.phinet.get_xpansion_factor(t_zero, beta, block_id, num_blocks)[source]

Compute the expansion factor based on the formula from the paper.

Parameters:
  • t_zero (float) – The base expansion factor.

  • beta (float) – The shape factor.

  • block_id (int) – The identifier of the current block.

  • num_blocks (int) – The total number of blocks.

Returns:

The computed expansion factor.

Return type:

float
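
Example

A sanity-check sketch: with beta = 1 the shape factor has no effect, so every block keeps the base expansion t_zero (an assumption based on the interpolation described in the paper):

>>> from micromind.networks.phinet import get_xpansion_factor
>>> get_xpansion_factor(6, 1.0, 3, 7)
6.0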

micromind.networks.phinet.preprocess_input(x, **kwargs)[source]

Normalizes input channels to the range [-1, 1].

Parameters:

x (torch.Tensor) – Input tensor to be preprocessed.

Returns:

Normalized tensor with values in [-1, 1].

Return type:

torch.Tensor
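
Example

A usage sketch, assuming an 8-bit image tensor as input:

>>> import torch
>>> from micromind.networks.phinet import preprocess_input
>>> img = torch.randint(0, 256, (1, 3, 224, 224)).float()
>>> x = preprocess_input(img)  # values mapped into [-1, 1]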

micromind.networks.xinet module

Code for XiNet (https://shorturl.at/mtHT0)

Authors:
  • Francesco Paissan, 2023

  • Alberto Ancilotto, 2023

class micromind.networks.xinet.XiConv(c_in: int, c_out: int, kernel_size: int | Tuple = 3, stride: int | Tuple = 1, padding: int | Tuple | None = None, groups: int | None = 1, act: bool | None = True, gamma: float | None = 4, attention: bool | None = True, skip_tensor_in: bool | None = True, skip_res: List | None = None, skip_channels: int | None = 1, pool: bool | None = None, attention_k: int | None = 3, attention_lite: bool | None = True, batchnorm: bool | None = True, dropout_rate: int | None = 0, skip_k: int | None = 1)[source]

Bases: Module

Implements XiNet’s convolutional block as presented in the original paper.

Parameters:
  • c_in (int) – Number of input channels.

  • c_out (int) – Number of output channels.

  • kernel_size (Union[int, Tuple]) – Kernel size for the main convolution.

  • stride (Union[int, Tuple]) – Stride for the main convolution.

  • padding (Optional[Union[int, Tuple]]) – Padding that is applied in the main convolution.

  • groups (Optional[int]) – Number of groups for the main convolution.

  • act (Optional[bool]) – When True, uses SiLU activation function after the main convolution.

  • gamma (Optional[float]) – Compression factor for the convolutional block.

  • attention (Optional[bool]) – When True, uses attention.

  • skip_tensor_in (Optional[bool]) – When True, defines broadcasting skip connection block.

  • skip_res (Optional[List]) – Spatial resolution of the skip connection, such that average pooling is statically defined.

  • skip_channels (Optional[int]) – Number of channels for the input block.

  • pool (Optional[bool]) – When True, applies pooling after the main convolution.

  • attention_k (Optional[int]) – Kernel for the attention module.

  • attention_lite (Optional[bool]) – When True, uses efficient attention implementation.

  • batchnorm (Optional[bool]) – When True, uses batch normalization inside the ConvBlock.

  • dropout_rate (Optional[int]) – Dropout probability.

  • skip_k (Optional[int]) – Kernel for the broadcast skip connection.

forward(x: Tensor)[source]

Computes the forward step of XiNet’s convolutional block.

Parameters:

x (torch.Tensor) – Input tensor.

Returns:

ConvBlock output.

Return type:

torch.Tensor
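
Example

A minimal usage sketch with the broadcasting skip connection disabled (skip_tensor_in=False), and assuming the default padding mimics “same” via autopad:

>>> import torch
>>> from micromind.networks.xinet import XiConv
>>> conv = XiConv(64, 128, kernel_size=3, stride=1, skip_tensor_in=False)
>>> conv(torch.randn(1, 64, 32, 32)).shape
torch.Size([1, 128, 32, 32])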

class micromind.networks.xinet.XiNet(input_shape: List, alpha: float = 1.0, gamma: float = 4.0, num_layers: int = 5, num_classes=1000, include_top=False, base_filters: int = 16, return_layers: List | None = None)[source]

Bases: Module

Defines a XiNet.

Parameters:
  • input_shape (List) – Shape of the input tensor.

  • alpha (float) – Width multiplier.

  • gamma (float) – Compression factor.

  • num_layers (int) – Number of convolutional blocks (default is 5).

  • num_classes (int) – Number of classes. It is used only when include_top is True.

  • include_top (Optional[bool]) – When True, defines an MLP for classification.

  • base_filters (int) – Number of base filters for the ConvBlock.

  • return_layers (Optional[List]) – IDs of the layers to be returned after the forward step.

Example

>>> from micromind.networks import XiNet
>>> model = XiNet((3, 224, 224))
forward(x)[source]

Computes the forward step of the XiNet.

Parameters:

x (torch.Tensor) – Input tensor.

Returns:

Output of the network, as defined by return_layers at init.

Return type:

Union[torch.Tensor, Tuple]
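
Example

A forward-pass sketch (the logits shape assumes include_top=True):

>>> import torch
>>> from micromind.networks import XiNet
>>> model = XiNet((3, 224, 224), include_top=True, num_classes=1000)
>>> model(torch.randn(1, 3, 224, 224)).shape
torch.Size([1, 1000])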

micromind.networks.xinet.autopad(k: int, p: int | None = None)[source]

Implements padding to mimic “same” behaviour.

Parameters:
  • k (int) – Kernel size for the convolution.

  • p (Optional[int]) – Padding value to be applied.
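
Example

A sketch of the expected behaviour, assuming the usual “same”-padding rule p = k // 2 for odd kernels when p is not given:

>>> from micromind.networks.xinet import autopad
>>> autopad(3)
1
>>> autopad(5)
2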

micromind.networks.yolo module

YOLOv8 building blocks.

Authors:
  • Matteo Beltrami, 2023

  • Francesco Paissan, 2023

This file contains the definition of the building blocks of the YOLOv8 network. The model architecture has been taken from https://github.com/ultralytics/ultralytics/issues/189

class micromind.networks.yolo.Bottleneck(c1, c2, shortcut: bool, groups=1, kernels: list = (3, 3), channel_factor=0.5)[source]

Bases: Module

Implements YOLOv8’s bottleneck block.

Parameters:
  • c1 (int) – Input channels of the bottleneck block.

  • c2 (int) – Output channels of the bottleneck block.

  • shortcut (bool) – Decides whether to perform a shortcut in the bottleneck block.

  • groups (int) – Groups for the bottleneck block.

  • kernels (list) – Kernel size for the bottleneck block.

  • channel_factor (float) – Decides the number of channels of the intermediate result between the two convolutional blocks.

forward(x)[source]

Executes YOLOv8 bottleneck block.

Parameters:

x (torch.Tensor) – Input to the bottleneck block.

Returns:

Output of the bottleneck block.

Return type:

torch.Tensor
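
Example

A usage sketch; with shortcut=True the residual path requires c1 == c2, so the output shape matches the input:

>>> import torch
>>> from micromind.networks.yolo import Bottleneck
>>> block = Bottleneck(64, 64, shortcut=True)
>>> block(torch.randn(1, 64, 40, 40)).shape
torch.Size([1, 64, 40, 40])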

class micromind.networks.yolo.C2f(c1, c2, n=1, shortcut=False, groups=1, e=0.5)[source]

Bases: Module

Implements YOLOv8’s C2f block.

Parameters:
  • c1 (int) – Input channels of the C2f block.

  • c2 (int) – Output channels of the C2f block.

  • n (int) – Number of bottleneck blocks executed in the C2f block.

  • shortcut (bool) – Decides whether to perform a shortcut in the bottleneck blocks.

  • groups (int) – Groups for the C2f block.

  • e (float) – Factor for concatenating intermediate results.

forward(x)[source]

Executes YOLOv8 C2f block.

Parameters:

x (torch.Tensor) – Input to the C2f block.

Returns:

Output of the C2f block.

Return type:

torch.Tensor
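
Example

A usage sketch (spatial resolution is preserved; only the channel count changes from c1 to c2):

>>> import torch
>>> from micromind.networks.yolo import C2f
>>> c2f = C2f(64, 128, n=2, shortcut=True)
>>> c2f(torch.randn(1, 64, 40, 40)).shape
torch.Size([1, 128, 40, 40])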

class micromind.networks.yolo.Conv(c1, c2, kernel_size=1, stride=1, padding=None, dilation=1, groups=1)[source]

Bases: Module

Implements YOLOv8’s convolutional block.

Parameters:
  • c1 (int) – Input channels of the convolutional block.

  • c2 (int) – Output channels of the convolutional block.

  • kernel_size (int) – Kernel size for the convolutional block.

  • stride (int) – Stride for the convolutional block.

  • padding (int) – Padding for the convolutional block.

  • dilation (int) – Dilation for the convolutional block.

  • groups (int) – Groups for the convolutional block.

forward(x)[source]

Executes YOLOv8 convolutional block.

Parameters:

x (torch.Tensor) – Input to the convolutional block.

Returns:

Output of the convolutional block.

Return type:

torch.Tensor
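
Example

A usage sketch, assuming the default padding mimics “same” so that stride 2 halves the spatial resolution:

>>> import torch
>>> from micromind.networks.yolo import Conv
>>> conv = Conv(3, 16, kernel_size=3, stride=2)
>>> conv(torch.randn(1, 3, 640, 640)).shape
torch.Size([1, 16, 320, 320])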

class micromind.networks.yolo.DFL(c1=16)[source]

Bases: Module

Implements YOLOv8’s DFL block.

Parameters:

c1 (int) – Input channels of the DFL block.

forward(x)[source]

Executes YOLOv8 DFL block.

Parameters:

x (torch.Tensor) – Input to the DFL block.

Returns:

Output of the DFL block.

Return type:

torch.Tensor

class micromind.networks.yolo.Darknet(w, r, d)[source]

Bases: Module

Implements YOLOv8’s convolutional backbone.

Parameters:
  • w (float) – Width multiple of the Darknet.

  • r (float) – Ratio multiple of the Darknet.

  • d (float) – Depth multiple of the Darknet.

forward(x)[source]

Executes YOLOv8 convolutional backbone.

Parameters:

x (torch.Tensor) – Input to the Darknet.

Returns:

Three intermediate representations with different resolutions

Return type:

tuple
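
Example

A usage sketch using the width/ratio/depth multiples commonly quoted for the YOLOv8-nano variant (an assumption; adjust to the variant you need):

>>> import torch
>>> from micromind.networks.yolo import Darknet
>>> backbone = Darknet(w=0.25, r=2.0, d=0.33)
>>> p3, p4, p5 = backbone(torch.randn(1, 3, 640, 640))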

class micromind.networks.yolo.DetectionHead(nc=80, filters=())[source]

Bases: Module

Implements YOLOv8’s detection head.

Parameters:
  • nc (int) – Number of classes to predict.

  • filters (tuple) – Number of channels of the three inputs of the detection head.

forward(x)[source]

Executes YOLOv8 detection head.

Parameters:

x (list) – Input to the detection head.

Returns:

Output of the detection head

Return type:

torch.Tensor
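
Example

A usage sketch; the filters tuple and feature-map sizes below are hypothetical values matching a nano-scale backbone and neck:

>>> import torch
>>> from micromind.networks.yolo import DetectionHead
>>> head = DetectionHead(nc=80, filters=(64, 128, 256))
>>> feats = [torch.randn(1, c, s, s) for c, s in zip((64, 128, 256), (80, 40, 20))]
>>> out = head(feats)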

class micromind.networks.yolo.SPPF(c1, c2, k=5)[source]

Bases: Module

Implements YOLOv8’s SPPF block.

Parameters:
  • c1 (int) – Input channels of the SPPF block.

  • c2 (int) – Output channels of the SPPF block.

  • k (int) – Kernel size for the SPPF block’s max-pooling operations.

forward(x)[source]

Executes YOLOv8 SPPF block.

Parameters:

x (torch.Tensor) – Input to the SPPF block.

Returns:

Output of the SPPF block.

Return type:

torch.Tensor
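
Example

A usage sketch (the max-pooling operations are padded, so spatial resolution is preserved):

>>> import torch
>>> from micromind.networks.yolo import SPPF
>>> sppf = SPPF(256, 256, k=5)
>>> sppf(torch.randn(1, 256, 20, 20)).shape
torch.Size([1, 256, 20, 20])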

class micromind.networks.yolo.Upsample(scale_factor, mode='nearest')[source]

Bases: object

class micromind.networks.yolo.YOLOv8(w, r, d, num_classes=80)[source]

Bases: Module

Implements YOLOv8 network.

Parameters:
  • w (float) – Width multiple of the Darknet.

  • r (float) – Ratio multiple of the Darknet.

  • d (float) – Depth multiple of the Darknet.

  • num_classes (int) – Number of classes to predict.

forward(x)[source]

Executes YOLOv8 network.

Parameters:

x (torch.Tensor) – Input to the YOLOv8 network.

Returns:

Output of the YOLOv8 network

Return type:

torch.Tensor
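
Example

An end-to-end sketch, again assuming YOLOv8-nano-style multiples:

>>> import torch
>>> from micromind.networks.yolo import YOLOv8
>>> model = YOLOv8(w=0.25, r=2.0, d=0.33, num_classes=80)
>>> out = model(torch.randn(1, 3, 640, 640))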

class micromind.networks.yolo.Yolov8Neck(filters=[256, 512, 768], up=[2, 2], d=1)[source]

Bases: Module

Implements YOLOv8’s neck.

Parameters:
  • filters (list) – Number of channels of the three inputs of the neck.

  • up (list) – Upsampling factors for the two upsampling stages.

  • d (float) – Depth multiple, scaling the number of bottleneck blocks.

forward(p3, p4, p5)[source]

Executes YOLOv8 neck.

Parameters:

p3, p4, p5 (torch.Tensor) – The three feature maps produced by the backbone.

Returns:

Three intermediate representations with different resolutions

Return type:

list
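
Example

A usage sketch chaining backbone and neck; the filters list assumes YOLOv8-nano-style multiples so the channel counts line up with the Darknet outputs:

>>> import torch
>>> from micromind.networks.yolo import Darknet, Yolov8Neck
>>> backbone = Darknet(w=0.25, r=2.0, d=0.33)
>>> neck = Yolov8Neck(filters=[64, 128, 256], up=[2, 2], d=0.33)
>>> p3, p4, p5 = backbone(torch.randn(1, 3, 640, 640))
>>> feats = neck(p3, p4, p5)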