micromind.networks package
Submodules
micromind.networks.phinet module
Code for PhiNets (https://doi.org/10.1145/3510832).
- Authors:
Francesco Paissan, 2023
Alberto Ancilotto, 2023
Matteo Beltrami, 2023
Matteo Tremonti, 2023
- class micromind.networks.phinet.DepthwiseConv2d(in_channels, depth_multiplier=1, kernel_size=3, stride=1, padding=0, dilation=1, bias=False, padding_mode='zeros')[source]
Bases:
Conv2d
Depthwise 2D convolution layer.
- Parameters:
in_channels (int) – Number of input channels.
depth_multiplier (int, optional) – The channel multiplier for the output channels (default is 1).
kernel_size (int or tuple, optional) – Size of the convolution kernel (default is 3).
stride (int or tuple, optional) – Stride of the convolution (default is 1).
padding (int or tuple, optional) – Zero-padding added to both sides of the input (default is 0).
dilation (int or tuple, optional) – Spacing between kernel elements (default is 1).
bias (bool, optional) – If True, adds a learnable bias to the output (default is False).
padding_mode (str, optional) – ‘zeros’ or ‘circular’. Padding mode for convolution (default is ‘zeros’).
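Example
A minimal usage sketch (illustrative values; assumes the usual depthwise semantics, i.e. out_channels = in_channels * depth_multiplier):
>>> import torch
>>> from micromind.networks.phinet import DepthwiseConv2d
>>> conv = DepthwiseConv2d(in_channels=8, depth_multiplier=2, kernel_size=3, padding=1)
>>> out = conv(torch.randn(1, 8, 32, 32))  # expected shape: (1, 16, 32, 32)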
- class micromind.networks.phinet.PhiNet(input_shape: List[int], num_layers: int = 7, alpha: float = 0.2, beta: float = 1.0, t_zero: float = 6, include_top: bool = False, num_classes: int = 10, compatibility: bool = False, downsampling_layers: List[int] = [5, 7], conv5_percent: float = 0.0, first_conv_stride: int = 2, residuals: bool = True, conv2d_input: bool = False, pool: bool = False, h_swish: bool = True, squeeze_excite: bool = True, divisor: int = 1, return_layers=None)[source]
Bases:
Module
This class implements the PhiNet architecture.
- Parameters:
input_shape (tuple) – Input resolution as (C, H, W).
num_layers (int) – Number of convolutional blocks.
alpha (float) – Width multiplier for PhiNet architecture.
beta (float) – Shape factor of PhiNet.
t_zero (float) – Base expansion factor for PhiNet.
include_top (bool) – Whether to include classification head or not.
num_classes (int) – Number of classes for the classification head.
compatibility (bool) – True to maximise compatibility among embedded platforms (changes network).
- forward(x)[source]
Executes the PhiNet network.
- Parameters:
x (torch.Tensor) – Network input.
- Returns:
Logits if `include_top=True`, otherwise embeddings.
- Return type:
torch.Tensor
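Example
An illustrative forward pass (hypothetical values; with include_top=True the output is a logits tensor of shape (batch, num_classes)):
>>> import torch
>>> from micromind.networks import PhiNet
>>> model = PhiNet((3, 224, 224), include_top=True, num_classes=10)
>>> logits = model(torch.randn(1, 3, 224, 224))  # expected shape: (1, 10)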
- get_MAC()[source]
Returns number of MACs for this architecture.
- Returns:
Number of MACs for this network.
- Return type:
int
Example
>>> from micromind.networks import PhiNet
>>> model = PhiNet((3, 224, 224))
>>> model.get_MAC()
9817670
- get_complexity()[source]
Returns MAC and number of parameters of initialized architecture.
- Returns:
Dictionary with complexity characterization of the network.
- Return type:
dict
Example
>>> from micromind.networks import PhiNet
>>> model = PhiNet((3, 224, 224))
>>> model.get_complexity()
{'MAC': 9817670, 'params': 30917}
- class micromind.networks.phinet.PhiNetConvBlock(in_shape, expansion, stride, filters, has_se, block_id=None, res=True, h_swish=True, k_size=3, dp_rate=0.05, divisor=1)[source]
Bases:
Module
Implements PhiNet’s convolutional block.
- Parameters:
in_shape (tuple) – Input shape of the conv block.
expansion (float) – Expansion coefficient for this convolutional block.
stride (int) – Stride for the conv block.
filters (int) – Output channels of the convolutional block.
block_id (int) – ID of the convolutional block.
has_se (bool) – Whether to use Squeeze-and-Excite or not.
res (bool) – Whether to use the residual connection or not.
h_swish (bool) – Whether to use HSwish or not.
k_size (int) – Kernel size for the depthwise convolution.
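Example
A minimal construction sketch (the shapes and coefficients below are illustrative, not taken from a specific PhiNet configuration):
>>> import torch
>>> from micromind.networks.phinet import PhiNetConvBlock
>>> block = PhiNetConvBlock(in_shape=(48, 28, 28), expansion=4.0, stride=1, filters=48, has_se=True, block_id=2)
>>> out = block(torch.randn(1, 48, 28, 28))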
- class micromind.networks.phinet.ReLUMax(max)[source]
Bases:
Module
Implements ReLUMax.
- Parameters:
max (float) – The maximum value for the clamp operation.
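Example
An illustrative sketch, assuming the module clamps activations to the range [0, max]:
>>> import torch
>>> from micromind.networks.phinet import ReLUMax
>>> act = ReLUMax(6.0)
>>> out = act(torch.tensor([-1.0, 3.0, 9.0]))  # expected: tensor([0., 3., 6.])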
- class micromind.networks.phinet.SEBlock(in_channels, out_channels, h_swish=True)[source]
Bases:
Module
Implements squeeze-and-excitation block.
- Parameters:
in_channels (int) – Input number of channels.
out_channels (int) – Output number of channels.
h_swish (bool, optional) – Whether to use h-swish activation (default is True).
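Example
A minimal usage sketch (channel counts are illustrative; a squeeze-and-excitation block rescales channels, so the output keeps the input shape):
>>> import torch
>>> from micromind.networks.phinet import SEBlock
>>> se = SEBlock(in_channels=64, out_channels=16)
>>> out = se(torch.randn(1, 64, 32, 32))  # expected shape: (1, 64, 32, 32)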
- class micromind.networks.phinet.SeparableConv2d(in_channels, out_channels, activation=<function relu>, kernel_size=3, stride=1, padding=0, dilation=1, bias=True, padding_mode='zeros', depth_multiplier=1)[source]
Bases:
Module
Implements SeparableConv2d.
- Parameters:
in_channels (int) – Input number of channels.
out_channels (int) – Output number of channels.
activation (function, optional) – Activation function to apply (default is torch.nn.functional.relu).
kernel_size (int, optional) – Kernel size (default is 3).
stride (int, optional) – Stride for convolution (default is 1).
padding (int, optional) – Padding for convolution (default is 0).
dilation (int, optional) – Dilation factor for convolution (default is 1).
bias (bool, optional) – If True, adds a learnable bias to the output (default is True).
padding_mode (str, optional) – Padding mode for convolution (default is ‘zeros’).
depth_multiplier (int, optional) – Depth multiplier (default is 1).
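Example
A minimal usage sketch (illustrative channel counts):
>>> import torch
>>> from micromind.networks.phinet import SeparableConv2d
>>> conv = SeparableConv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1)
>>> out = conv(torch.randn(1, 16, 32, 32))  # expected shape: (1, 32, 32, 32)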
- micromind.networks.phinet.correct_pad(input_shape, kernel_size)[source]
Returns the zero-padding tuple for a 2D convolution with downsampling.
- Parameters:
input_shape (tuple or list) – Shape of the input tensor (height, width).
kernel_size (int or tuple) – Size of the convolution kernel.
- Returns:
A tuple representing the zero-padding in the format (left, right, top, bottom).
- Return type:
tuple
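Example
An illustrative call (the returned values depend on the kernel size and the parity of the input resolution, so they are not asserted here):
>>> from micromind.networks.phinet import correct_pad
>>> pad = correct_pad((224, 224), kernel_size=3)  # (left, right, top, bottom)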
- micromind.networks.phinet.get_xpansion_factor(t_zero, beta, block_id, num_blocks)[source]
Compute the expansion factor based on the formula from the paper.
- Parameters:
t_zero (float) – The base expansion factor.
beta (float) – The shape factor.
block_id (int) – The identifier of the current block.
num_blocks (int) – The total number of blocks.
- Returns:
The computed expansion factor.
- Return type:
float
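Example
An illustrative call using the package defaults t_zero=6 and beta=1.0 (the exact value follows the interpolation formula of the paper and is not asserted here):
>>> from micromind.networks.phinet import get_xpansion_factor
>>> t = get_xpansion_factor(t_zero=6, beta=1.0, block_id=2, num_blocks=7)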
micromind.networks.xinet module
Code for XiNet (https://shorturl.at/mtHT0).
- Authors:
Francesco Paissan, 2023
Alberto Ancilotto, 2023
- class micromind.networks.xinet.XiConv(c_in: int, c_out: int, kernel_size: int | Tuple = 3, stride: int | Tuple = 1, padding: int | Tuple | None = None, groups: int | None = 1, act: bool | None = True, gamma: float | None = 4, attention: bool | None = True, skip_tensor_in: bool | None = True, skip_res: List | None = None, skip_channels: int | None = 1, pool: bool | None = None, attention_k: int | None = 3, attention_lite: bool | None = True, batchnorm: bool | None = True, dropout_rate: int | None = 0, skip_k: int | None = 1)[source]
Bases:
Module
Implements XiNet’s convolutional block as presented in the original paper.
- Parameters:
c_in (int) – Number of input channels.
c_out (int) – Number of output channels.
kernel_size (Union[int, Tuple]) – Kernel size for the main convolution.
stride (Union[int, Tuple]) – Stride for the main convolution.
padding (Optional[Union[int, Tuple]]) – Padding that is applied in the main convolution.
groups (Optional[int]) – Number of groups for the main convolution.
act (Optional[bool]) – When True, uses SiLU activation function after the main convolution.
gamma (Optional[float]) – Compression factor for the convolutional block.
attention (Optional[bool]) – When True, uses attention.
skip_tensor_in (Optional[bool]) – When True, defines broadcasting skip connection block.
skip_res (Optional[List]) – Spatial resolution of the skip connection, such that average pooling is statically defined.
skip_channels (Optional[int]) – Number of channels for the input block.
pool (Optional[bool]) – When True, applies pooling after the main convolution.
attention_k (Optional[int]) – Kernel for the attention module.
attention_lite (Optional[bool]) – When True, uses efficient attention implementation.
batchnorm (Optional[bool]) – When True, uses batch normalization inside the ConvBlock.
dropout_rate (Optional[int]) – Dropout probability.
skip_k (Optional[int]) – Kernel for the broadcast skip connection.
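Example
A minimal usage sketch (channel counts are illustrative; skip_tensor_in=False so no skip tensor has to be provided):
>>> import torch
>>> from micromind.networks.xinet import XiConv
>>> conv = XiConv(c_in=32, c_out=64, kernel_size=3, stride=1, skip_tensor_in=False)
>>> out = conv(torch.randn(1, 32, 56, 56))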
- class micromind.networks.xinet.XiNet(input_shape: List, alpha: float = 1.0, gamma: float = 4.0, num_layers: int = 5, num_classes=1000, include_top=False, base_filters: int = 16, return_layers: List | None = None)[source]
Bases:
Module
Defines a XiNet.
- Parameters:
input_shape (List) – Shape of the input tensor.
alpha (float) – Width multiplier.
gamma (float) – Compression factor.
num_layers (int = 5) – Number of convolutional blocks.
num_classes (int) – Number of classes. It is used only when include_top is True.
include_top (Optional[bool]) – When True, defines an MLP for classification.
base_filters (int) – Number of base filters for the ConvBlock.
return_layers (Optional[List]) – IDs of the layers to be returned after processing the forward step.
Example
>>> from micromind.networks import XiNet
>>> model = XiNet((3, 224, 224))
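A forward pass can then be sketched as follows (illustrative; with the default include_top=False the output is a feature tensor rather than logits):
>>> import torch
>>> out = model(torch.randn(1, 3, 224, 224))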
micromind.networks.yolo module
YOLOv8 building blocks.
- Authors:
Matteo Beltrami, 2023
Francesco Paissan, 2023
This file contains the definitions of the building blocks of the YOLOv8 network. The model architecture has been taken from https://github.com/ultralytics/ultralytics/issues/189.
- class micromind.networks.yolo.Bottleneck(c1, c2, shortcut: bool, groups=1, kernels: list = (3, 3), channel_factor=0.5)[source]
Bases:
Module
Implements YOLOv8’s bottleneck block.
- Parameters:
c1 (int) – Input channels of the bottleneck block.
c2 (int) – Output channels of the bottleneck block.
shortcut (bool) – Decides whether to perform a shortcut in the bottleneck block.
groups (int) – Groups for the bottleneck block.
kernels (list) – Kernel size for the bottleneck block.
channel_factor (float) – Decides the number of channels of the intermediate result between the two convolutional blocks.
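Example
A minimal usage sketch (illustrative channel counts; with shortcut=True and c1 == c2 the residual connection is active):
>>> import torch
>>> from micromind.networks.yolo import Bottleneck
>>> block = Bottleneck(64, 64, shortcut=True)
>>> out = block(torch.randn(1, 64, 32, 32))  # expected shape: (1, 64, 32, 32)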
- class micromind.networks.yolo.C2f(c1, c2, n=1, shortcut=False, groups=1, e=0.5)[source]
Bases:
Module
Implements YOLOv8’s C2f block.
- Parameters:
c1 (int) – Input channels of the C2f block.
c2 (int) – Output channels of the C2f block.
n (int) – Number of bottleneck blocks executed in the C2f block.
shortcut (bool) – Decides whether to perform a shortcut in the bottleneck blocks.
groups (int) – Groups for the C2f block.
e (float) – Factor for concatenating intermediate results.
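Example
A minimal usage sketch (illustrative values):
>>> import torch
>>> from micromind.networks.yolo import C2f
>>> block = C2f(64, 64, n=2, shortcut=True)
>>> out = block(torch.randn(1, 64, 32, 32))  # expected shape: (1, 64, 32, 32)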
- class micromind.networks.yolo.Conv(c1, c2, kernel_size=1, stride=1, padding=None, dilation=1, groups=1)[source]
Bases:
Module
Implements YOLOv8’s convolutional block.
- Parameters:
c1 (int) – Input channels of the convolutional block.
c2 (int) – Output channels of the convolutional block.
kernel_size (int) – Kernel size for the convolutional block.
stride (int) – Stride for the convolutional block.
padding (int) – Padding for the convolutional block.
dilation (int) – Dilation for the convolutional block.
groups (int) – Groups for the convolutional block.
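Example
A minimal usage sketch (illustrative; with padding=None the block is expected to auto-pad, so stride=2 halves the spatial resolution):
>>> import torch
>>> from micromind.networks.yolo import Conv
>>> conv = Conv(3, 16, kernel_size=3, stride=2)
>>> out = conv(torch.randn(1, 3, 64, 64))  # expected shape: (1, 16, 32, 32)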
- class micromind.networks.yolo.DFL(c1=16)[source]
Bases:
Module
Implements YOLOv8’s DFL block.
- Parameters:
c1 (int) – Input channels of the DFL block.
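Example
An illustrative sketch, assuming the standard YOLOv8 DFL layout where the input carries 4 * c1 channels per anchor and the output collapses them to 4 box coordinates:
>>> import torch
>>> from micromind.networks.yolo import DFL
>>> dfl = DFL(c1=16)
>>> out = dfl(torch.randn(1, 64, 8400))  # expected shape: (1, 4, 8400)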
- class micromind.networks.yolo.Darknet(w, r, d)[source]
Bases:
Module
Implements YOLOv8’s convolutional backbone.
- Parameters:
w (float) – Width multiple of the Darknet.
r (float) – Ratio multiple of the Darknet.
d (float) – Depth multiple of the Darknet.
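Example
A minimal construction sketch (the multipliers below are illustrative, YOLOv8n-style values; the backbone returns intermediate feature maps):
>>> import torch
>>> from micromind.networks.yolo import Darknet
>>> backbone = Darknet(w=0.25, r=2.0, d=0.33)
>>> feats = backbone(torch.randn(1, 3, 640, 640))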
- class micromind.networks.yolo.DetectionHead(nc=80, filters=())[source]
Bases:
Module
Implements YOLOv8’s detection head.
- Parameters:
nc (int) – Number of classes to predict.
filters (tuple) – Number of channels of the three inputs of the detection head.
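Example
A minimal construction sketch (the filter counts are illustrative and must match the channels of the three feature maps fed to the head):
>>> from micromind.networks.yolo import DetectionHead
>>> head = DetectionHead(nc=80, filters=(64, 128, 256))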
- class micromind.networks.yolo.SPPF(c1, c2, k=5)[source]
Bases:
Module
Implements YOLOv8’s SPPF block.
- Parameters:
c1 (int) – Input channels of the SPPF block.
c2 (int) – Output channels of the SPPF block.
k (int) – Kernel size for the SPPF block’s max-pooling operations.
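Example
A minimal usage sketch (illustrative values):
>>> import torch
>>> from micromind.networks.yolo import SPPF
>>> sppf = SPPF(c1=256, c2=256, k=5)
>>> out = sppf(torch.randn(1, 256, 20, 20))  # expected shape: (1, 256, 20, 20)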
- class micromind.networks.yolo.YOLOv8(w, r, d, num_classes=80)[source]
Bases:
Module
Implements YOLOv8 network.
- Parameters:
w (float) – Width multiple of the Darknet.
r (float) – Ratio multiple of the Darknet.
d (float) – Depth multiple of the Darknet.
num_classes (int) – Number of classes to predict.
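Example
A minimal usage sketch (the multipliers are illustrative, YOLOv8n-style values):
>>> import torch
>>> from micromind.networks.yolo import YOLOv8
>>> model = YOLOv8(w=0.25, r=2.0, d=0.33, num_classes=80)
>>> preds = model(torch.randn(1, 3, 640, 640))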