Advanced convert usage - Parameters explained by example

This tutorial provides a parameter-by-parameter explanation of the convert function. It explains what each parameter does, when it matters, and how to use it correctly, using small, practical examples.

The goal is to help users reason about quantization, conversion and device mapping instead of treating convert like a black box.

The tutorial is organized into thematic sections:

  1. Inputs and shapes

  2. Quantization-related parameters

  3. Device and mapping parameters

  4. Common pitfalls and best practices

For this tutorial we use the akidanet_imagenet_224_alpha_1 model from the MetaTF model zoo, exported to ONNX format.

1. Preliminary steps

The convert function performs several distinct stages:

  1. Input shape fixing and model sanitization

  2. Compatibility checks

  3. Quantization

  4. Conversion to a hybrid model

  5. Optional device-aware mapping

The tutorial will go through the parameters that are listed below.

from onnx2akida import convert
import inspect

print(inspect.signature(convert))
(model, input_shape=None, input_dtype='uint8', samples=None, num_samples=1, device=None, enable_hwpr=False, sram_size=None, minimal_memory=False)
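To read the defaults at a glance, the signature can be turned into a small parameter table. The sketch below uses a stand-in function, convert_stub, that only copies the signature printed above so that it runs without onnx2akida installed. The names convert_stub and parameter_defaults are hypothetical helpers, not part of the package; with onnx2akida available, the real convert could be passed instead.

```python
import inspect


def convert_stub(model, input_shape=None, input_dtype='uint8', samples=None,
                 num_samples=1, device=None, enable_hwpr=False, sram_size=None,
                 minimal_memory=False):
    """Stand-in that only mirrors the signature printed above (not the real convert)."""


def parameter_defaults(func):
    """Return {name: default} for every parameter that has a default value."""
    params = inspect.signature(func).parameters
    return {name: p.default for name, p in params.items()
            if p.default is not inspect.Parameter.empty}


defaults = parameter_defaults(convert_stub)
for name, value in defaults.items():
    print(f"{name:<15} {value!r}")
```

Only model has no default, which is why the minimal invocation needs nothing else.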

Load the model

import onnx
import os
import urllib.request

model_filename = "akidanet_imagenet_224_alpha_1.onnx"
model_url = "https://data.brainchip.com/models/AkidaV2/onnx_support/akidanet_imagenet_224_alpha_1.onnx"

_ = urllib.request.urlretrieve(model_url, model_filename)
model = onnx.load(os.path.abspath(model_filename))
print(f"Loaded model with {len(model.graph.node)} nodes")
Loaded model with 84 nodes

The minimal invocation of convert only requires the model.

from onnx2akida import print_report

hybrid_model, compatibility_info = convert(model)
Applied 28 of general pattern rewrite rules.
Applied 1 of general pattern rewrite rules.
Applied 1 of general pattern rewrite rules.

Calibrating with 1/1.0 samples

Quantizing:   0%|          | 0/1 [00:00<?, ?it/s]
Quantizing: 100%|██████████| 1/1 [00:00<00:00,  9.97it/s]
Quantizing: 100%|██████████| 1/1 [00:00<00:00,  9.95it/s]

Converting:   0%|          | 0/1 [00:00<?, ?it/s]
Converting: 100%|██████████| 1/1 [00:00<00:00,  1.58it/s]
Converting: 100%|██████████| 1/1 [00:00<00:00,  1.58it/s]
print_report(hybrid_model, compatibility_info)
[INFO]: Percentage of nodes compatible with akida: 100.0000 %

[INFO]: Number of mappable sequences on akida: 1

List of backends exchanges:
 • CPU -> Akida at layer node_Conv_1: 147.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/classifier/MatMul/MatMulAddFusion: 3.906 KB
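The sizes listed under the backend exchanges can be checked by hand. The two figures above are consistent with one byte per element for the 224 x 224 x 3 input and four bytes per element for the 1000 output values. These element widths are inferred from the numbers and are not stated by the report. A minimal sketch of the arithmetic, with tensor_kb as a hypothetical helper:

```python
from math import prod


def tensor_kb(shape, bytes_per_element=1):
    """Size of a dense tensor in KB (1 KB = 1024 bytes)."""
    return prod(shape) * bytes_per_element / 1024


# Input image sent to Akida: 224 x 224 x 3 values, assumed 1 byte each (uint8).
input_kb = tensor_kb((224, 224, 3))
# Classifier output sent back to the CPU: 1000 values, assumed 4 bytes each.
output_kb = tensor_kb((1000,), bytes_per_element=4)

print(f"{input_kb:.3f} KB")   # 147.000 KB
print(f"{output_kb:.3f} KB")  # 3.906 KB
```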

3. Device and mapping parameters

Providing a device changes the conversion strategy itself, not just the final mapping. When a device is provided, conversion becomes device-aware and avoids generating Akida subgraphs that cannot fit. In other words, convert checks that each part converted to Akida can also be mapped on the given device; any part that cannot is left in ONNX format.

import akida

device = akida.devices()[0]

hybrid_model, compatibility_info = convert(model, device=device)

print_report(hybrid_model, compatibility_info)
Applied 28 of general pattern rewrite rules.
Applied 1 of general pattern rewrite rules.
Applied 1 of general pattern rewrite rules.

Calibrating with 1/1.0 samples

Quantizing:   0%|          | 0/1 [00:00<?, ?it/s]
Quantizing: 100%|██████████| 1/1 [00:00<00:00, 10.28it/s]

Converting:   0%|          | 0/1 [00:00<?, ?it/s]
Converting: 100%|██████████| 1/1 [00:00<00:00,  3.04it/s]
Converting: 100%|██████████| 1/1 [00:00<00:00,  3.04it/s]

Set of incompatible op_types: ['Clip', 'Conv']
List of incompatibilities:
 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/conv_2/Conv2D(op_type=Conv), node_Clip_5(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/conv_2/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/conv_2/Conv2D'. Not enough hardware components of type CNP1 available. 30 are needed but 24 are available.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/conv_3/Conv2D(op_type=Conv), node_Clip_7(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/conv_3/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/conv_3/Conv2D'. Not enough hardware components of type CNP1 available. 32 are needed but 24 are available.

[INFO]: Percentage of nodes compatible with akida: 90.9091 %

[INFO]: Number of mappable sequences on akida: 2

List of backends exchanges:
 • CPU -> Akida at layer node_Conv_1: 147.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/conv_1/Conv2D: 784.000 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_4/depthwise: 392.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_13/Conv2D: 1.000 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/classifier/MatMul/MatMulAddFusion: 1.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/classifier/MatMul/MatMulAddFusion: 3.906 KB

As shown in the report, only 90.9091% of the nodes could be mapped on the device. The remaining nodes were rejected at the mapping stage because the provided device lacks the required resources, for example: “Reason: Cannot map layer ‘StatefulPartitionedCall/akidanet_1.00_160_1000/conv_2/Conv2D’. Not enough hardware components of type CNP1 available. 30 are needed but 24 are available.”
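The percentage can be reproduced from the incompatibility list. Two node sequences of two nodes each (a Conv followed by a Clip) are rejected, so four nodes stay on the CPU. The total of 44 nodes used below is an inference that makes the arithmetic match the report; the report itself does not print this total. The helper name compatible_percentage is hypothetical.

```python
def compatible_percentage(total_nodes, rejected_sequences):
    """Share of nodes kept on Akida once rejected node sequences are excluded."""
    rejected = sum(len(seq) for seq in rejected_sequences)
    return 100.0 * (total_nodes - rejected) / total_nodes


# Rejected sequences from the report above, written with shortened names.
rejected_sequences = [
    ["conv_2/Conv2D", "node_Clip_5"],
    ["conv_3/Conv2D", "node_Clip_7"],
]
TOTAL_NODES = 44  # assumption: inferred from the reported percentage

pct = compatible_percentage(TOTAL_NODES, rejected_sequences)
print(f"{pct:.4f} %")  # 90.9091 %
```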

Note that the two convolutions that could not be mapped follow one another in the network, so they run consecutively on the host processor. The two Akida models are therefore formed by the nodes before and after this unmapped pair. If the two nodes were not consecutive, three Akida models would be needed to cover the whole network.
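The effect of consecutive versus non-consecutive unmapped nodes can be illustrated on a toy linear chain. This is a simplified model with hypothetical layer names and a hypothetical helper, akida_segments; it is not how convert partitions the graph internally.

```python
from itertools import groupby


def akida_segments(layers, unmapped):
    """Group consecutive mappable layers into segments, one per Akida model."""
    segments = []
    for on_akida, run in groupby(layers, key=lambda name: name not in unmapped):
        if on_akida:
            segments.append(list(run))
    return segments


# Toy linear network with hypothetical layer names (not the real graph).
layers = ["conv_0", "conv_1", "conv_2", "conv_3", "dw_4", "pw_4"]

# Two consecutive unmapped layers split the chain into two Akida segments.
print(len(akida_segments(layers, {"conv_2", "conv_3"})))  # 2
# Two non-consecutive unmapped layers split it into three.
print(len(akida_segments(layers, {"conv_1", "conv_3"})))  # 3
```

Each extra segment adds a CPU -> Akida and an Akida -> CPU exchange, which is why fragmented mappings show longer exchange lists in the report.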

It is possible to print the hybrid model summary to visualize the different Akida models created during conversion and the unmapped nodes, which form ONNX sub-models.

hybrid_model.summary()
                                                                                                     HybridModel Summary: HybridModel
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
Layer (type)                                                                                           Output shape    Inbounds                                                                             Data movement

========================================================================================================= --- ONNX Sub-model 0 --- ========================================================================================================

680264bd-f75a-4018-be4d-84937b5336db/quantizer_0 (InputQuantizer)                                      [3, 224, 224]   input                                                                                N/A

======================================================================================================== --- Akida Sub-model 1 --- ========================================================================================================

node_Conv_1 (InputConv2D)                                                                              [112, 112, 32]  680264bd-f75a-4018-be4d-84937b5336db/quantize_0                                      147.00 KB (CPU -> Akida)
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/conv_1/Conv2D (Conv2D)                                  [112, 112, 64]  StatefulPartitionedCall/akidanet_1.00_160_1000/conv_0/relu/clip_by_value:0           784.00 KB (Akida -> CPU)

========================================================================================================= --- ONNX Sub-model 2 --- ========================================================================================================

StatefulPartitionedCall/akidanet_1.00_160_1000/conv_2/Conv2D (QuantizedConv2DBiasedReLUClippedScaled)  [128, 56, 56]   StatefulPartitionedCall/akidanet_1.00_160_1000/conv_1/relu/clip_by_value:0           N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/conv_3/Conv2D (QuantizedConv2DBiasedReLUClippedScaled)  [128, 56, 56]   StatefulPartitionedCall/akidanet_1.00_160_1000/conv_2/relu/clip_by_value:0           N/A

======================================================================================================== --- Akida Sub-model 3 --- ========================================================================================================

StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_4/depthwise (DepthwiseConv2D)              [28, 28, 128]   StatefulPartitionedCall/akidanet_1.00_160_1000/conv_3/relu/clip_by_value:0           392.00 KB (CPU -> Akida)
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_4/Conv2D (Conv2D)                          [28, 28, 256]   StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_4/depthwise:0            N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_5/depthwise (DepthwiseConv2D)              [28, 28, 256]   StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_4/relu/clip_by_value:0   N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_5/Conv2D (Conv2D)                          [28, 28, 256]   StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_5/depthwise:0            N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_6/depthwise (DepthwiseConv2D)              [14, 14, 256]   StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_5/relu/clip_by_value:0   N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_6/Conv2D (Conv2D)                          [14, 14, 512]   StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_6/depthwise:0            N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_7/depthwise (DepthwiseConv2D)              [14, 14, 512]   StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_6/relu/clip_by_value:0   N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_7/Conv2D (Conv2D)                          [14, 14, 512]   StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_7/depthwise:0            N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_8/depthwise (DepthwiseConv2D)              [14, 14, 512]   StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_7/relu/clip_by_value:0   N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_8/Conv2D (Conv2D)                          [14, 14, 512]   StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_8/depthwise:0            N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_9/depthwise (DepthwiseConv2D)              [14, 14, 512]   StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_8/relu/clip_by_value:0   N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_9/Conv2D (Conv2D)                          [14, 14, 512]   StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_9/depthwise:0            N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_10/depthwise (DepthwiseConv2D)             [14, 14, 512]   StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_9/relu/clip_by_value:0   N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_10/Conv2D (Conv2D)                         [14, 14, 512]   StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_10/depthwise:0           N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_11/depthwise (DepthwiseConv2D)             [14, 14, 512]   StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_10/relu/clip_by_value:0  N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_11/Conv2D (Conv2D)                         [14, 14, 512]   StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_11/depthwise:0           N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_12/depthwise (DepthwiseConv2D)             [7, 7, 512]     StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_11/relu/clip_by_value:0  N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_12/Conv2D (Conv2D)                         [7, 7, 1024]    StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_12/depthwise:0           N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_13/depthwise (DepthwiseConv2D)             [7, 7, 1024]    StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_12/relu/clip_by_value:0  N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_13/Conv2D (Conv2D)                         [1, 1, 1024]    StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_13/depthwise:0           1.00 KB (Akida -> CPU)
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/classifier/MatMul/MatMulAddFusion (Dense1D)             [1, 1, 1000]    StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_13/global_avg/Mean:0     4.91 KB (CPU -> Akida -> CPU)

========================================================================================================= --- ONNX Sub-model 4 --- ========================================================================================================

5f6d5b8f-a95d-4215-a755-5010c2e49173/dequantizer_0 (Dequantizer)                                       [1000]          classifier/to_dequantize                                                             N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________

It is also possible to use a virtual device, which is a simulated version of a hardware device. The akida package provides two of them: akida.TwoNodesIPv2() and akida.SixNodesIPv2(). The latter is the software representation of the physical Akida v2 device used when converting the model above.

Let’s try the two-node virtual device in convert.

hybrid_model, compatibility_info = convert(model, device=akida.TwoNodesIPv2())

print_report(hybrid_model, compatibility_info)
Applied 28 of general pattern rewrite rules.
Applied 1 of general pattern rewrite rules.
Applied 1 of general pattern rewrite rules.

Calibrating with 1/1.0 samples

Quantizing:   0%|          | 0/1 [00:00<?, ?it/s]
Quantizing: 100%|██████████| 1/1 [00:00<00:00,  8.49it/s]
Quantizing: 100%|██████████| 1/1 [00:00<00:00,  8.48it/s]

Converting:   0%|          | 0/1 [00:00<?, ?it/s]
Converting: 100%|██████████| 1/1 [00:00<00:00,  2.94it/s]
Converting: 100%|██████████| 1/1 [00:00<00:00,  2.94it/s]

Set of incompatible op_types: ['Clip', 'Conv', 'GlobalAveragePool']
List of incompatibilities:
 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/conv_2/Conv2D(op_type=Conv), node_Clip_5(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/conv_2/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/conv_2/Conv2D'. Not enough hardware components of type CNP1 available. 30 are needed but 8 are available.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/conv_3/Conv2D(op_type=Conv), node_Clip_7(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/conv_3/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/conv_3/Conv2D'. Not enough hardware components of type CNP1 available. 32 are needed but 8 are available.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_7/Conv2D(op_type=Conv), node_Clip_15(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_7/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_7/Conv2D'. Not enough hardware components of type CNP1 available. 12 are needed but 8 are available.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_8/Conv2D(op_type=Conv), node_Clip_17(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_8/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_8/Conv2D'. Not enough hardware components of type CNP1 available. 12 are needed but 8 are available.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_9/Conv2D(op_type=Conv), node_Clip_19(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_9/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_9/Conv2D'. Not enough hardware components of type CNP1 available. 12 are needed but 8 are available.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_10/Conv2D(op_type=Conv), node_Clip_21(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_10/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_10/Conv2D'. Not enough hardware components of type CNP1 available. 12 are needed but 8 are available.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_11/Conv2D(op_type=Conv), node_Clip_23(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_11/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_11/Conv2D'. Not enough hardware components of type CNP1 available. 12 are needed but 8 are available.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_12/Conv2D(op_type=Conv), node_Clip_25(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_12/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_12/Conv2D'. Not enough hardware components of type CNP1 available. 11 are needed but 8 are available.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_13/Conv2D(op_type=Conv), node_Clip_27(op_type=Clip), StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_13/global_avg/Mean(op_type=GlobalAveragePool)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_13/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_13/Conv2D'. Not enough hardware components of type CNP1 available. 22 are needed but 8 are available.

[INFO]: Percentage of nodes compatible with akida: 56.8182 %

[INFO]: Number of mappable sequences on akida: 9

List of backends exchanges:
 • CPU -> Akida at layer node_Conv_1: 147.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/conv_1/Conv2D: 784.000 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_4/depthwise: 392.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_4/depthwise: 98.000 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_4/Conv2D: 98.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_5/depthwise: 196.000 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_5/Conv2D: 196.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_5/Conv2D: 196.000 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_6/depthwise: 196.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_7/depthwise: 98.000 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_8/depthwise: 98.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_8/depthwise: 98.000 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_9/depthwise: 98.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_9/depthwise: 98.000 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_10/depthwise: 98.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_10/depthwise: 98.000 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_11/depthwise: 98.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_11/depthwise: 98.000 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_12/depthwise: 98.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_12/depthwise: 24.500 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_13/depthwise: 49.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_13/depthwise: 49.000 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/classifier/MatMul/MatMulAddFusion: 1.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/classifier/MatMul/MatMulAddFusion: 3.906 KB
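The two reports can be compared by checking each layer's CNP1 requirement against what each device offers. The counts below are copied from the mapping errors above; only layers that appear in an error message are listed, since the requirements of the other layers are not reported. The device labels and the unmappable helper are hypothetical, and a per-layer comparison is a simplification of the real mapping.

```python
# CNP1 components needed per layer, copied from the two reports above.
CNP1_NEEDED = {
    "conv_2": 30, "conv_3": 32,
    "pw_separable_7": 12, "pw_separable_8": 12, "pw_separable_9": 12,
    "pw_separable_10": 12, "pw_separable_11": 12,
    "pw_separable_12": 11, "pw_separable_13": 22,
}
# CNP1 components available, as stated in the mapping errors.
# The keys are illustrative labels, not akida identifiers.
CNP1_AVAILABLE = {"six_nodes": 24, "two_nodes": 8}


def unmappable(needed, available):
    """Layers whose CNP1 requirement exceeds what the device offers."""
    return [name for name, count in needed.items() if count > available]


for device_name, available in CNP1_AVAILABLE.items():
    failed = unmappable(CNP1_NEEDED, available)
    print(f"{device_name}: {len(failed)} layers do not fit -> {failed}")
```

With 24 components only conv_2 and conv_3 exceed the budget, while with 8 components all nine listed layers do, which matches the two incompatibility lists.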
hybrid_model.summary()
                                                                                                                 HybridModel Summary: HybridModel
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
Layer (type)                                                                                                                 Output shape    Inbounds                                                                             Data movement

===================================================================================================================== --- ONNX Sub-model 0 --- ====================================================================================================================

5561163d-5f83-41c9-9841-631fe40e5e1b/quantizer_0 (InputQuantizer)                                                            [3, 224, 224]   input                                                                                N/A

==================================================================================================================== --- Akida Sub-model 1 --- ====================================================================================================================

node_Conv_1 (InputConv2D)                                                                                                    [112, 112, 32]  5561163d-5f83-41c9-9841-631fe40e5e1b/quantize_0                                      147.00 KB (CPU -> Akida)
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/conv_1/Conv2D (Conv2D)                                                        [112, 112, 64]  StatefulPartitionedCall/akidanet_1.00_160_1000/conv_0/relu/clip_by_value:0           784.00 KB (Akida -> CPU)

===================================================================================================================== --- ONNX Sub-model 2 --- ====================================================================================================================

StatefulPartitionedCall/akidanet_1.00_160_1000/conv_2/Conv2D (QuantizedConv2DBiasedReLUClippedScaled)                        [128, 56, 56]   StatefulPartitionedCall/akidanet_1.00_160_1000/conv_1/relu/clip_by_value:0           N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/conv_3/Conv2D (QuantizedConv2DBiasedReLUClippedScaled)                        [128, 56, 56]   StatefulPartitionedCall/akidanet_1.00_160_1000/conv_2/relu/clip_by_value:0           N/A

==================================================================================================================== --- Akida Sub-model 3 --- ====================================================================================================================

StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_4/depthwise (DepthwiseConv2D)                                    [28, 28, 128]   StatefulPartitionedCall/akidanet_1.00_160_1000/conv_3/relu/clip_by_value:0           490.00 KB (CPU -> Akida -> CPU)
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_4/Conv2D (Conv2D)                                                [28, 28, 256]   StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_4/depthwise:0            98.00 KB (CPU -> Akida)
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_5/depthwise (DepthwiseConv2D)                                    [28, 28, 256]   StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_4/relu/clip_by_value:0   196.00 KB (Akida -> CPU)
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_5/Conv2D (Conv2D)                                                [28, 28, 256]   StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_5/depthwise:0            392.00 KB (CPU -> Akida -> CPU)
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_6/depthwise (DepthwiseConv2D)                                    [14, 14, 256]   StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_5/relu/clip_by_value:0   196.00 KB (CPU -> Akida)
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_6/Conv2D (Conv2D)                                                [14, 14, 512]   StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_6/depthwise:0            N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_7/depthwise (DepthwiseConv2D)                                    [14, 14, 512]   StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_6/relu/clip_by_value:0   98.00 KB (Akida -> CPU)

===================================================================================================================== --- ONNX Sub-model 4 --- ====================================================================================================================

StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_7/Conv2D (QuantizedConv2DBiasedReLUClippedScaled)                [512, 14, 14]   StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_7/depthwise:0            N/A

==================================================================================================================== --- Akida Sub-model 5 --- ====================================================================================================================

StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_8/depthwise (DepthwiseConv2D)                                    [14, 14, 512]   StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_7/relu/clip_by_value:0   196.00 KB (CPU -> Akida -> CPU)

===================================================================================================================== --- ONNX Sub-model 6 --- ====================================================================================================================

StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_8/Conv2D (QuantizedConv2DBiasedReLUClippedScaled)                [512, 14, 14]   StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_8/depthwise:0            N/A

==================================================================================================================== --- Akida Sub-model 7 --- ====================================================================================================================

StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_9/depthwise (DepthwiseConv2D)                                    [14, 14, 512]   StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_8/relu/clip_by_value:0   196.00 KB (CPU -> Akida -> CPU)

===================================================================================================================== --- ONNX Sub-model 8 --- ====================================================================================================================

StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_9/Conv2D (QuantizedConv2DBiasedReLUClippedScaled)                [512, 14, 14]   StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_9/depthwise:0            N/A

==================================================================================================================== --- Akida Sub-model 9 --- ====================================================================================================================

StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_10/depthwise (DepthwiseConv2D)                                   [14, 14, 512]   StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_9/relu/clip_by_value:0   196.00 KB (CPU -> Akida -> CPU)

==================================================================================================================== --- ONNX Sub-model 10 --- ====================================================================================================================

StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_10/Conv2D (QuantizedConv2DBiasedReLUClippedScaled)               [512, 14, 14]   StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_10/depthwise:0           N/A

==================================================================================================================== --- Akida Sub-model 11 --- ===================================================================================================================

StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_11/depthwise (DepthwiseConv2D)                                   [14, 14, 512]   StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_10/relu/clip_by_value:0  196.00 KB (CPU -> Akida -> CPU)

==================================================================================================================== --- ONNX Sub-model 12 --- ====================================================================================================================

StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_11/Conv2D (QuantizedConv2DBiasedReLUClippedScaled)               [512, 14, 14]   StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_11/depthwise:0           N/A

==================================================================================================================== --- Akida Sub-model 13 --- ===================================================================================================================

StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_12/depthwise (DepthwiseConv2D)                                   [7, 7, 512]     StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_11/relu/clip_by_value:0  122.50 KB (CPU -> Akida -> CPU)

==================================================================================================================== --- ONNX Sub-model 14 --- ====================================================================================================================

StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_12/Conv2D (QuantizedConv2DBiasedReLUClippedScaled)               [1024, 7, 7]    StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_12/depthwise:0           N/A

==================================================================================================================== --- Akida Sub-model 15 --- ===================================================================================================================

StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_13/depthwise (DepthwiseConv2D)                                   [7, 7, 1024]    StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_12/relu/clip_by_value:0  98.00 KB (CPU -> Akida -> CPU)

==================================================================================================================== --- ONNX Sub-model 16 --- ====================================================================================================================

StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_13/Conv2D (QuantizedConv2DBiasedGlobalAvgPoolReLUClippedScaled)  [1024, 1, 1]    StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_13/depthwise:0           N/A

==================================================================================================================== --- Akida Sub-model 17 --- ===================================================================================================================

StatefulPartitionedCall/akidanet_1.00_160_1000/classifier/MatMul/MatMulAddFusion (Dense1D)                                   [1, 1, 1000]    StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_13/global_avg/Mean:0     4.91 KB (CPU -> Akida -> CPU)

==================================================================================================================== --- ONNX Sub-model 18 --- ====================================================================================================================

1c5f979e-a660-4389-8107-f4dc4130e84d/dequantizer_0 (Dequantizer)                                                             [1000]          classifier/to_dequantize                                                             N/A
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________

With the TwoNodes device, as shown in the report and the summary, 56.8182% of the nodes could be mapped and 9 akida models are needed to cover the whole network. The TwoNodes device has far fewer resources than the SixNodes device, so fewer nodes can be mapped.

DO:
  • Use device in convert when targeting real hardware

  • Check the compatibility report to understand resource bottlenecks

  • Consider that consecutive unmapped nodes will form a single CPU-based segment

DON’T:
  • Expect 100% node mapping on small virtual devices

  • Ignore incompatibility reasons - they tell you exactly what resources are missing

4. Virtual device optimization computation

The following parameters (enable_hwpr, sram_size, minimal_memory) are advanced features intended for fine-tuning the virtual device computation when no physical device is provided. Most users should rely on the default settings, which work well for typical use cases. Only adjust these if you need to optimize for specific memory or resource constraints.

4.1 enable_hwpr

Let us first introduce the concept of Nodes and NPs in Akida devices:
  • An Akida Node contains a fixed number of 4 NPs (Neural Processors).

  • An Akida NP can be of type CNP (for convolutional layers), TNP_B (for spatiotemporal TENNs) or FNP (for dense layers).

  • The total number of NPs in a device is equal to the number of Nodes multiplied by the number of NPs per Node (4 NPs per Node).

  • The term “Node” is also used to refer to a layer in an ONNX graph, which can be confusing. In a device context, “Node” refers to the hardware unit, while in a graph/model context, it refers to a layer.
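The Node/NP arithmetic above comes up every time a device size is read, so here is a minimal sketch of the conversion. The helper name nodes_for_nps is hypothetical (it is not part of akida or onnx2akida), and it assumes, as stated above, 4 NPs per Node.

```python
import math

NPS_PER_NODE = 4  # from the text above: one Akida Node holds 4 NPs


def nodes_for_nps(num_nps: int) -> int:
    """Return the smallest number of Nodes that provides `num_nps` NPs.

    Hypothetical helper for illustration; not part of akida or onnx2akida.
    """
    if num_nps < 0:
        raise ValueError("num_nps must be non-negative")
    return math.ceil(num_nps / NPS_PER_NODE)


print(nodes_for_nps(8))   # 8 NPs fit exactly in 2 Nodes
print(nodes_for_nps(10))  # 10 NPs need a third, partially used Node
```

Rounding up reflects that a Node is the hardware unit: a device cannot contain a fraction of one.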

When no device is provided, convert will return a hybrid model that is mapped on a virtual device created during conversion. If enable_hwpr is set to False (default), the virtual device assumes that the final device must have enough resources to fit each submodel at once, that is, in a single pass. Let’s inspect the obtained device.

hybrid_model, compatibility_info = convert(model)

print_report(hybrid_model, compatibility_info)
Applied 28 of general pattern rewrite rules.
Applied 1 of general pattern rewrite rules.
Applied 1 of general pattern rewrite rules.

Calibrating with 1/1.0 samples

Quantizing:   0%|          | 0/1 [00:00<?, ?it/s]
Quantizing: 100%|██████████| 1/1 [00:00<00:00,  9.12it/s]
Quantizing: 100%|██████████| 1/1 [00:00<00:00,  9.10it/s]

Converting:   0%|          | 0/1 [00:00<?, ?it/s]
Converting: 100%|██████████| 1/1 [00:00<00:00,  1.58it/s]
Converting: 100%|██████████| 1/1 [00:00<00:00,  1.58it/s]
[INFO]: Percentage of nodes compatible with akida: 100.0000 %

[INFO]: Number of mappable sequences on akida: 1

List of backends exchanges:
 • CPU -> Akida at layer node_Conv_1: 147.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/classifier/MatMul/MatMulAddFusion: 3.906 KB
def analyse_hybrid_model(hybrid_model):
    """Print the device size and the sequence/pass layout of each akida model."""
    # The device is read from the first akida model of the hybrid model.
    print(f"Number of NPs in final device: {len(hybrid_model.akida_models[0].device.mesh.nps)}")
    print(f"Number of Akida models in the hybrid model: {len(hybrid_model.akida_models)}")
    for i, akida_model in enumerate(hybrid_model.akida_models):
        print(f"\nAkida model {i}:")
        print(f"Number of layers in the Akida model: {len(akida_model.layers)}")
        print(f"Number of sequences in the Akida model: {len(akida_model.sequences)}")
        # Each sequence is split into one or more passes, each holding some layers.
        for j, sequence in enumerate(akida_model.sequences):
            print(f"  - sequence {j}:")
            print(f"    - number of passes: {len(sequence.passes)}")
            for p, passage in enumerate(sequence.passes):
                print(f"      - pass {p} --> {len(passage.layers)} layers")


analyse_hybrid_model(hybrid_model)
Number of NPs in final device: 208
Number of Akida models in the hybrid model: 1

Akida model 0:
Number of layers in the Akida model: 25
Number of sequences in the Akida model: 1
  - sequence 0:
    - number of passes: 1
      - pass 0 --> 25 layers

The summary above shows that with the default settings, we end up with one akida model containing one pass. The device should have at least 208 NPs (52 nodes) to be able to fit the whole submodel in one pass.

However, if we enable partial hardware reconfiguration (HWPR) by setting enable_hwpr to True, the virtual device created during conversion assumes that the device will be partially reconfigured between passes as much as possible. This generally allows fitting larger submodels on smaller devices because resources can be reused across passes.

hybrid_model_hwpr, compatibility_info_hwpr = convert(model, enable_hwpr=True)

print_report(hybrid_model_hwpr, compatibility_info_hwpr)
Applied 28 of general pattern rewrite rules.
Applied 1 of general pattern rewrite rules.
Applied 1 of general pattern rewrite rules.

Calibrating with 1/1.0 samples

Quantizing:   0%|          | 0/1 [00:00<?, ?it/s]
Quantizing: 100%|██████████| 1/1 [00:00<00:00,  8.35it/s]
Quantizing: 100%|██████████| 1/1 [00:00<00:00,  8.33it/s]

Converting:   0%|          | 0/1 [00:00<?, ?it/s]
Converting: 100%|██████████| 1/1 [00:00<00:00,  1.52it/s]
Converting: 100%|██████████| 1/1 [00:00<00:00,  1.52it/s]
[INFO]: Percentage of nodes compatible with akida: 100.0000 %

[INFO]: Number of mappable sequences on akida: 1

List of backends exchanges:
 • CPU -> Akida at layer node_Conv_1: 147.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/classifier/MatMul/MatMulAddFusion: 3.906 KB
analyse_hybrid_model(hybrid_model_hwpr)
Number of NPs in final device: 64
Number of Akida models in the hybrid model: 1

Akida model 0:
Number of layers in the Akida model: 25
Number of sequences in the Akida model: 1
  - sequence 0:
    - number of passes: 4
      - pass 0 --> 3 layers
      - pass 1 --> 8 layers
      - pass 2 --> 8 layers
      - pass 3 --> 6 layers

As we can see, with enable_hwpr set to True, we end up with the same percentage of mapped nodes (100%) and the same number of akida models, but the device is much smaller (64 NPs, that is 16 Nodes, instead of 208 NPs). However, the akida model now contains 4 passes. The cost (clock count) of HWPR is negligible in the overall process, so this parameter only affects the virtual device that is created: small, with components reused across passes, or as large as necessary to fit the whole model in one pass.

DO:
  • Use enable_hwpr=True to allow reusing HW resources and build a smaller device

  • Expect the final device to be auto-resized to fit each akida model

  • Remember: 1 Node = 4 NPs

DON’T:
  • Forget that HWPR allows resource reuse but doesn’t eliminate all splits
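To compare two conversions side by side rather than reading two printouts, the walk done by analyse_hybrid_model can be condensed into a small summary. This is only a sketch: summarize_mapping is a hypothetical helper that reads the same attributes as analyse_hybrid_model (akida_models, device.mesh.nps, sequences, passes, layers), and the SimpleNamespace objects are stand-ins that mimic the figures printed above so that the snippet runs without Akida installed.

```python
from types import SimpleNamespace


def summarize_mapping(hybrid_model):
    """Return NP count, Node count and layers per pass for each akida model.

    Hypothetical helper: it only reads the attributes already used by
    analyse_hybrid_model and assumes 4 NPs per Node.
    """
    num_nps = len(hybrid_model.akida_models[0].device.mesh.nps)
    layers_per_pass = [
        [len(p.layers) for seq in m.sequences for p in seq.passes]
        for m in hybrid_model.akida_models
    ]
    return {
        "nps": num_nps,
        "nodes": -(-num_nps // 4),  # ceiling division
        "layers_per_pass": layers_per_pass,
    }


def fake_hybrid_model(num_nps, layers_per_pass):
    """Build a stand-in with the same attribute layout (illustration only)."""
    device = SimpleNamespace(mesh=SimpleNamespace(nps=[None] * num_nps))
    passes = [SimpleNamespace(layers=[None] * n) for n in layers_per_pass]
    model = SimpleNamespace(device=device,
                            sequences=[SimpleNamespace(passes=passes)])
    return SimpleNamespace(akida_models=[model])


# Stand-ins reproducing the figures printed above.
default = summarize_mapping(fake_hybrid_model(208, [25]))
hwpr = summarize_mapping(fake_hybrid_model(64, [3, 8, 8, 6]))
print(default)
print(hwpr)
```

With real results, summarize_mapping(hybrid_model) and summarize_mapping(hybrid_model_hwpr) can be called directly; the stand-ins are only there to keep the example self-contained.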

4.2 sram_size

sram_size is used only when device is None and minimal_memory is False. It is a tuple that contains the size of the shared SRAM available inside the mesh for the input buffer memory and for the weight memory. The input buffer memory is the SRAM needed to store input activations (intermediate feature maps) for each Neural Processor (NP): when data flows through the network, each layer receives inputs from the previous layer, and this memory holds those inputs before processing. The weight memory is the SRAM needed to store the weights (parameters) of the layers assigned to each NP. If sram_size is provided, the virtual device created during conversion uses this value to determine whether a submodel can fit in the available memory. Let’s see how to use this parameter.

import akida

hybrid_model, compatibility_info = convert(model, sram_size=akida.NP.SramSize(1024, 1024))

print_report(hybrid_model, compatibility_info)
Applied 28 of general pattern rewrite rules.
Applied 1 of general pattern rewrite rules.
Applied 1 of general pattern rewrite rules.

Calibrating with 1/1.0 samples

Quantizing:   0%|          | 0/1 [00:00<?, ?it/s]
Quantizing: 100%|██████████| 1/1 [00:00<00:00,  8.55it/s]
Quantizing: 100%|██████████| 1/1 [00:00<00:00,  8.53it/s]

Converting:   0%|          | 0/1 [00:00<?, ?it/s]
Converting: 100%|██████████| 1/1 [00:01<00:00,  1.03s/it]
Converting: 100%|██████████| 1/1 [00:01<00:00,  1.03s/it]

Set of incompatible op_types: ['Clip', 'Conv', 'GlobalAveragePool']
List of incompatibilities:
 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/conv_1/Conv2D(op_type=Conv), node_Clip_3(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/conv_1/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/conv_1/Conv2D'. Filter size is too big to fit in a CNP: try decreasing the input channels or the bitwidth.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/conv_2/Conv2D(op_type=Conv), node_Clip_5(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/conv_2/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/conv_2/Conv2D'. Filter size is too big to fit in a CNP: try decreasing the input channels or the bitwidth.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/conv_3/Conv2D(op_type=Conv), node_Clip_7(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/conv_3/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/conv_3/Conv2D'. Filter size is too big to fit in a CNP: try decreasing the input channels or the bitwidth.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_4/depthwise(op_type=Conv)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_4/depthwise
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_4/depthwise'. Error: Attempted to write value 512 into a 8-bit field, which will cause an overflow..

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_4/Conv2D(op_type=Conv), node_Clip_9(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_4/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_4/Conv2D'. Filter size is too big to fit in a CNP: try decreasing the input channels or the bitwidth.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_5/depthwise(op_type=Conv)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_5/depthwise
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_5/depthwise'. Error: Attempted to write value 256 into a 8-bit field, which will cause an overflow..

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_5/Conv2D(op_type=Conv), node_Clip_11(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_5/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_5/Conv2D'. Filter size is too big to fit in a CNP: try decreasing the input channels or the bitwidth.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_6/depthwise(op_type=Conv)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_6/depthwise
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_6/depthwise'. Error: Attempted to write value 256 into a 8-bit field, which will cause an overflow..

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_6/Conv2D(op_type=Conv), node_Clip_13(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_6/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_6/Conv2D'. Filter size is too big to fit in a CNP: try decreasing the input channels or the bitwidth.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_7/Conv2D(op_type=Conv), node_Clip_15(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_7/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_7/Conv2D'. Filter size is too big to fit in a CNP: try decreasing the input channels or the bitwidth.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_8/Conv2D(op_type=Conv), node_Clip_17(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_8/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_8/Conv2D'. Filter size is too big to fit in a CNP: try decreasing the input channels or the bitwidth.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_9/Conv2D(op_type=Conv), node_Clip_19(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_9/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_9/Conv2D'. Filter size is too big to fit in a CNP: try decreasing the input channels or the bitwidth.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_10/Conv2D(op_type=Conv), node_Clip_21(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_10/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_10/Conv2D'. Filter size is too big to fit in a CNP: try decreasing the input channels or the bitwidth.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_11/Conv2D(op_type=Conv), node_Clip_23(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_11/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_11/Conv2D'. Filter size is too big to fit in a CNP: try decreasing the input channels or the bitwidth.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_12/Conv2D(op_type=Conv), node_Clip_25(op_type=Clip)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_12/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_12/Conv2D'. Filter size is too big to fit in a CNP: try decreasing the input channels or the bitwidth.

 ❌ Node sequence: [StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_13/Conv2D(op_type=Conv), node_Clip_27(op_type=Clip), StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_13/global_avg/Mean(op_type=GlobalAveragePool)]
     • Stage: Mapping
     • Faulty node: StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_13/Conv2D
     • Reason: Cannot map layer 'StatefulPartitionedCall/akidanet_1.00_160_1000/pw_separable_13/Conv2D'. When using Global Average Pooling, spatial split is impossible.

[INFO]: Percentage of nodes compatible with akida: 31.8182 %

[INFO]: Number of mappable sequences on akida: 9

List of backends exchanges:
 • CPU -> Akida at layer node_Conv_1: 147.000 KB
 • Akida -> CPU at layer node_Conv_1: 392.000 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_7/depthwise: 98.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_7/depthwise: 98.000 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_8/depthwise: 98.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_8/depthwise: 98.000 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_9/depthwise: 98.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_9/depthwise: 98.000 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_10/depthwise: 98.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_10/depthwise: 98.000 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_11/depthwise: 98.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_11/depthwise: 98.000 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_12/depthwise: 98.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_12/depthwise: 24.500 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_13/depthwise: 49.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/dw_separable_13/depthwise: 49.000 KB
 • CPU -> Akida at layer StatefulPartitionedCall/akidanet_1.00_160_1000/classifier/MatMul/MatMulAddFusion: 1.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/classifier/MatMul/MatMulAddFusion: 3.906 KB

With such a small SRAM budget (1024 for the input buffer and 1024 for the weights), the mapping degrades sharply: only 31.8182% of the nodes are compatible with Akida and the network is split into 9 mappable sequences. The report gives the reason for each rejected node. Most convolutions fail at the mapping stage because their filters no longer fit in a CNP, and the data exchanged between CPU and Akida grows accordingly.
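A fragmented mapping also means more traffic between CPU and Akida. One rough way to quantify it is to sum the "List of backends exchanges" lines of the report. The sketch below parses the text layout shown in this tutorial; total_exchange_kb is a hypothetical helper, and the regular expression assumes the line format printed here, which may differ in other versions.

```python
import re

# Matches lines such as " • CPU -> Akida at layer node_Conv_1: 147.000 KB".
EXCHANGE_RE = re.compile(
    r"(CPU -> Akida|Akida -> CPU) at layer .+: ([\d.]+) KB\s*$"
)


def total_exchange_kb(report_text: str) -> dict:
    """Sum the exchanged data per direction, in KB (hypothetical helper)."""
    totals = {"CPU -> Akida": 0.0, "Akida -> CPU": 0.0}
    for line in report_text.splitlines():
        match = EXCHANGE_RE.search(line)
        if match:
            totals[match.group(1)] += float(match.group(2))
    return totals


# Two lines copied from the default conversion report in section 4.1.
sample = """
 • CPU -> Akida at layer node_Conv_1: 147.000 KB
 • Akida -> CPU at layer StatefulPartitionedCall/akidanet_1.00_160_1000/classifier/MatMul/MatMulAddFusion: 3.906 KB
"""
print(total_exchange_kb(sample))
```

Feeding the longer exchange list printed above into the same function gives a single figure per direction, which makes the cost of the reduced SRAM easier to compare with the default conversion.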

DO:
  • Set sram_size to match your target hardware specifications

  • Understand that smaller SRAM forces more layer splits and akida models

  • Remember: input_bytes stores activations, weight_bytes stores parameters

DON’T:
  • Use sram_size when device is provided (it will be ignored)

  • Set arbitrary values - base them on real hardware capabilities

  • Forget that 64 KB input + 50 KB weight is the v2 default

  • Use sram_size with minimal_memory=True (minimal_memory overrides it)
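Hardware figures are often quoted in KB, while the SRAM attributes used in this tutorial are named input_bytes and weight_bytes, which suggests that sram_size takes bytes. A minimal sketch of the conversion, assuming 1 KB = 1024 bytes (consistent with the 65536 and 51200 byte values reported for the 64 KB / 50 KB default in section 4.3); kb_to_bytes is a hypothetical helper.

```python
def kb_to_bytes(kilobytes: float) -> int:
    """Convert a size in KB (1 KB = 1024 bytes) to whole bytes."""
    return int(kilobytes * 1024)


input_bytes = kb_to_bytes(64)
weight_bytes = kb_to_bytes(50)
print(input_bytes, weight_bytes)

# With akida installed, these values could then be passed on, assuming the
# (input, weight) argument order used in the call above:
# convert(model, sram_size=akida.NP.SramSize(input_bytes, weight_bytes))
```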

4.3 minimal_memory

minimal_memory computes the minimal input buffer and weight memory footprint required by the model and uses it in place of the default SRAM size. That default, which applies when sram_size is not explicitly provided, is akida.NP.SramSize_v2 (64 KB for input buffer memory and 50 KB for weight memory). When minimal_memory is enabled, the virtual device created during conversion assumes that the NPs have just enough memory to fit the biggest submodel assigned to them. Let’s see how this works.

hybrid_model, compatibility_info = convert(model)
Applied 28 of general pattern rewrite rules.
Applied 1 of general pattern rewrite rules.
Applied 1 of general pattern rewrite rules.

Calibrating with 1/1.0 samples

Quantizing:   0%|          | 0/1 [00:00<?, ?it/s]
Quantizing: 100%|██████████| 1/1 [00:00<00:00,  9.40it/s]
Quantizing: 100%|██████████| 1/1 [00:00<00:00,  9.38it/s]

Converting:   0%|          | 0/1 [00:00<?, ?it/s]
Converting: 100%|██████████| 1/1 [00:00<00:00,  1.57it/s]
Converting: 100%|██████████| 1/1 [00:00<00:00,  1.57it/s]
print("device input buffer memory: "
      f"{hybrid_model.akida_models[0].device.mesh.np_sram_size.input_bytes}")

print("device weight memory: "
      f"{hybrid_model.akida_models[0].device.mesh.np_sram_size.weight_bytes}")

hybrid_model, compatibility_info = convert(model, minimal_memory=True)
device input buffer memory: 65536
device weight memory: 51200
Applied 28 of general pattern rewrite rules.
Applied 1 of general pattern rewrite rules.
Applied 1 of general pattern rewrite rules.

Calibrating with 1/1.0 samples

Quantizing:   0%|          | 0/1 [00:00<?, ?it/s]
Quantizing: 100%|██████████| 1/1 [00:00<00:00,  7.72it/s]
Quantizing: 100%|██████████| 1/1 [00:00<00:00,  7.71it/s]

Converting:   0%|          | 0/1 [00:00<?, ?it/s]
Converting: 100%|██████████| 1/1 [00:00<00:00,  1.54it/s]
Converting: 100%|██████████| 1/1 [00:00<00:00,  1.54it/s]
print("device input buffer memory: "
      f"{hybrid_model.akida_models[0].device.mesh.np_sram_size.input_bytes}")

print("device weight memory: "
      f"{hybrid_model.akida_models[0].device.mesh.np_sram_size.weight_bytes}")
device input buffer memory: 64000
device weight memory: 51200

The resulting hybrid models have, as expected, the same mapping results. However, the device created with minimal_memory enabled has a smaller input buffer memory (64000 bytes instead of 65536) and the same weight memory. The biggest submodel assigned to an NP requires less input buffer memory than the default 64 KB, whereas the weight memory is already used at its maximum (50 KB, that is 51200 bytes), so it remains unchanged.

DO:
  • Use minimal_memory=True to optimize device size for your specific model

  • Understand that it computes the maximum memory across all NPs

  • Use it when you want the smallest possible device that fits your model

  • Check the resulting np_sram_size to understand your model’s memory footprint

DON’T:
  • Combine minimal_memory=True with explicit sram_size (minimal_memory wins)

  • Expect minimal_memory to reduce the number of akida models

  • Use it if you need to match specific hardware SRAM specifications

  • Forget that it’s computing per-NP requirements, not total device memory
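To put the two printouts above in perspective, the per-NP saving can be computed directly. The sketch uses the byte values printed above as plain integers; sram_saving is a hypothetical helper, and with real results the inputs would be read from device.mesh.np_sram_size.

```python
def sram_saving(default_bytes: int, minimal_bytes: int) -> tuple:
    """Return (bytes saved, percentage saved) per NP."""
    saved = default_bytes - minimal_bytes
    return saved, 100.0 * saved / default_bytes


# Values printed above: default device vs. minimal_memory=True device.
input_saved, input_pct = sram_saving(65536, 64000)
weight_saved, weight_pct = sram_saving(51200, 51200)

print(f"input buffer: {input_saved} bytes saved per NP ({input_pct:.2f} %)")
print(f"weights: {weight_saved} bytes saved per NP ({weight_pct:.2f} %)")
```

For this model the gain is small because the default sizes are already close to what the largest submodel needs; other models may show a larger gap.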

Total running time of the script: (1 minutes 1.142 seconds)
