maix.nn

maix.nn module

You can use maix.nn to access this module with MaixPy
This module is generated from MaixPy and MaixCDK

Module

module	brief
F	maix.nn.F module

Enum

Layout

layer layout types

item	describe
values	NCHW: [N, C, H, W] NHWC: [N, H, W, C] UNKNOWN:

C++ definition code:

enum class Layout
    {
        NCHW = 0,    // [N, C, H, W]
        NHWC = 1,    // [N, H, W, C]
        UNKNOWN = 4
    }
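To make the two layouts concrete, the sketch below computes where one element lands in a flattened buffer under each ordering. The helper names are hypothetical and are not part of maix.nn; they only illustrate the [N, C, H, W] and [N, H, W, C] orderings listed above.

```python
def nchw_index(n, c, h, w, C, H, W):
    """Flat offset of element (n, c, h, w) in an [N, C, H, W] buffer."""
    return ((n * C + c) * H + h) * W + w


def nhwc_index(n, c, h, w, C, H, W):
    """Flat offset of the same element in an [N, H, W, C] buffer."""
    return ((n * H + h) * W + w) * C + c


# Channel 1 of the first pixel in a 3-channel 2x2 image:
# NCHW stores whole channel planes one after another,
# NHWC stores the channels of one pixel next to each other.
print(nchw_index(0, 1, 0, 0, C=3, H=2, W=2))
print(nhwc_index(0, 1, 0, 0, C=3, H=2, W=2))
```

In NCHW the value sits one full plane (H * W elements) into the buffer, while in NHWC it sits right after channel 0 of the same pixel.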

SpeechDevice

speech device

item	describe
values	DEVICE_NONE: DEVICE_PCM: DEVICE_MIC: DEVICE_WAV:

C++ definition code:

enum class SpeechDevice {
    DEVICE_NONE = -1,
    DEVICE_PCM,
    DEVICE_MIC,
    DEVICE_WAV,
}

SpeechDecoder

speech decoder type

item	describe
values	DECODER_RAW: DECODER_DIG: DECODER_LVCSR: DECODER_KWS: DECODER_ALL:

C++ definition code:

enum class SpeechDecoder {
    DECODER_RAW   = 1,
    DECODER_DIG   = 2,
    DECODER_LVCSR = 4,
    DECODER_KWS   = 8,
    DECODER_ALL   = 65535,
}
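The decoder values are powers of two, with DECODER_ALL covering all 16 low bits, which suggests they are meant to be used as bit flags. The following sketch mirrors the values as plain integers and shows how a mask can be decomposed. Whether a given API accepts OR-ed combinations is an assumption, and `decoder_names` is a hypothetical helper.

```python
# Values copied from the SpeechDecoder enum above.
DECODER_RAW = 1
DECODER_DIG = 2
DECODER_LVCSR = 4
DECODER_KWS = 8
DECODER_ALL = 65535

_SINGLE_FLAGS = {
    "DECODER_RAW": DECODER_RAW,
    "DECODER_DIG": DECODER_DIG,
    "DECODER_LVCSR": DECODER_LVCSR,
    "DECODER_KWS": DECODER_KWS,
}


def decoder_names(mask):
    """Return the names of the single-bit decoder flags set in mask."""
    return [name for name, bit in _SINGLE_FLAGS.items() if mask & bit]


print(decoder_names(DECODER_DIG | DECODER_KWS))
print(decoder_names(DECODER_ALL))
```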

Variable

Function

Class

Object

Object for detect result

C++ defination code:
class Object

init

def __init__(self, x: int = 0, y: int = 0, w: int = 0, h: int = 0, class_id: int = 0, score: float = 0, points: list[int] = [], angle: float = -9999) -> None

Constructor of Object for detect result

item	description
type	func
param	x: left top x; y: left top y; w: width; h: height; class_id: class id; score: score
static	False

C++ defination code:

Object(int x = 0, int y = 0, int w = 0, int h = 0, int class_id = 0, float score = 0, std::vector<int> points = std::vector<int>(), float angle = -9999)

str

def __str__(self) -> str

Object info to string

item	description
type	func
return	Object info string
static	False

C++ defination code:
std::string to_str()

get_obb_points

def get_obb_points(self) -> list[int]

Get OBB (oriented bounding box) points, automatically calculated from x, y, w, h and angle.

item	description
type	func
static	False

C++ defination code:
std::vector<int> get_obb_points()

x

Object left top coordinate x

item	description
type	var
static	False
readonly	False

C++ defination code:
int x

y

Object left top coordinate y

item	description
type	var
static	False
readonly	False

C++ defination code:
int y

w

Object width

item	description
type	var
static	False
readonly	False

C++ defination code:
int w

h

Object height

item	description
type	var
static	False
readonly	False

C++ defination code:
int h

class_id

Object class id

item	description
type	var
static	False
readonly	False

C++ defination code:
int class_id

score

Object score

item	description
type	var
static	False
readonly	False

C++ defination code:
float score

points

keypoints

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<int> points

angle

Rotation angle; -9999 means not set. The value is a ratio: multiply it by 180 to get the angle in degrees, or by PI to get the angle in radians.

item	description
type	var
static	False
readonly	False

C++ defination code:
float angle
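The conversion described for angle can be sketched as below. The helper names and the `ANGLE_NOT_SET` constant are hypothetical; only the -9999 sentinel and the multiply-by-180 / multiply-by-PI rule come from the description above.

```python
import math

ANGLE_NOT_SET = -9999  # sentinel documented for Object.angle


def angle_to_degrees(angle):
    """Convert the stored ratio to degrees, or None if the angle is not set."""
    if angle == ANGLE_NOT_SET:
        return None
    return angle * 180.0


def angle_to_radians(angle):
    """Convert the stored ratio to radians, or None if the angle is not set."""
    if angle == ANGLE_NOT_SET:
        return None
    return angle * math.pi


print(angle_to_degrees(0.5))
print(angle_to_radians(1.0))
print(angle_to_degrees(ANGLE_NOT_SET))
```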

seg_mask

Segmentation mask, uint8 list type; the shape is h * w but flattened to one dimension, with values from 0 to 255.

item	description
type	var
attention	For efficiency, it's a pointer in C++, use this carefully!
static	False
readonly	False

C++ defination code:
image::Image *seg_mask
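For a mask stored as a flattened h * w sequence, a pixel lookup could look like the sketch below. It assumes row-major order (row after row), which the description does not state explicitly. Both helpers are hypothetical and work on a plain Python list rather than on the maix object itself.

```python
def mask_at(flat_mask, width, x, y):
    """Return the mask value at (x, y), assuming row-major flattening."""
    return flat_mask[y * width + x]


def count_above(flat_mask, threshold=127):
    """Count the mask values strictly greater than threshold."""
    return sum(1 for value in flat_mask if value > threshold)


# A 3x2 mask (width 3, height 2) flattened row by row.
flat = [0, 255, 10,
        200, 128, 127]
print(mask_at(flat, 3, 1, 0))
print(mask_at(flat, 3, 0, 1))
print(count_above(flat))
```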

ObjectFloat

Object for detect result

C++ definition code:
class ObjectFloat

init

def __init__(self, x: float = 0, y: float = 0, w: float = 0, h: float = 0, class_id: float = 0, score: float = 0, points: list[float] = [], angle: float = -1) -> None

Constructor of Object for detect result

item	description
type	func
param	x: left top x; y: left top y; w: width; h: height; class_id: class id; score: score
static	False

C++ defination code:

ObjectFloat(float x = 0, float y = 0, float w = 0, float h = 0, float class_id = 0, float score = 0, std::vector<float> points = std::vector<float>(), float angle = -1)

str

def __str__(self) -> str

Object info to string

item	description
type	func
return	Object info string
static	False

C++ defination code:
std::string to_str()

x

Object left top coordinate x

item	description
type	var
static	False
readonly	False

C++ defination code:
float x

y

Object left top coordinate y

item	description
type	var
static	False
readonly	False

C++ defination code:
float y

w

Object width

item	description
type	var
static	False
readonly	False

C++ defination code:
float w

h

Object height

item	description
type	var
static	False
readonly	False

C++ defination code:
float h

class_id

Object class id

item	description
type	var
static	False
readonly	False

C++ defination code:
float class_id

score

Object score

item	description
type	var
static	False
readonly	False

C++ defination code:
float score

points

keypoints

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> points

angle

Rotate angle

item	description
type	var
static	False
readonly	False

C++ defination code:
float angle

Objects

Objects Class for detect result

C++ definition code:
class Objects

init

def __init__(self) -> None

Constructor of Objects class

item	description
type	func
static	False

C++ defination code:
Objects()

add

def add(self, x: int = 0, y: int = 0, w: int = 0, h: int = 0, class_id: int = 0, score: float = 0, points: list[int] = [], angle: float = -1) -> Object

Add object to objects

item	description
type	func
throw	Throw exception if no memory
static	False

C++ defination code:

nn::Object &add(int x = 0, int y = 0, int w = 0, int h = 0, int class_id = 0, float score = 0, std::vector<int> points = std::vector<int>(), float angle = -1)

remove

def remove(self, idx: int) -> maix.err.Err

Remove object from objects

item	description
type	func
static	False

C++ defination code:
err::Err remove(int idx)

at

def at(self, idx: int) -> Object

Get object item

item	description
type	func
static	False

C++ defination code:
nn::Object &at(int idx)

getitem

def __getitem__(self, idx: int) -> Object

Get object item

item	description
type	func
static	False

C++ defination code:
nn::Object &operator[](int idx)

len

def __len__(self) -> int

Get size

item	description
type	func
static	False

C++ defination code:
size_t size()

iter

def __iter__(self) -> typing.Iterator

Begin

item	description
type	func
static	False

C++ defination code:
std::vector<Object*>::iterator begin()

HandLandmarks

HandLandmarks class

C++ definition code:
class HandLandmarks

init

def __init__(self, model: str = '') -> None

Constructor of HandLandmarks class

item	description
type	func
param	model: model path, default empty, you can load model later by load function.
throw	If model arg is not empty and load failed, will throw err::Exception.
static	False

C++ defination code:
HandLandmarks(const string &model = "")

load

def load(self, model: str) -> maix.err.Err

Load model from file

item	description
type	func
param	model: Model path want to load
return	err::Err
static	False

C++ defination code:
err::Err load(const string &model)

detect

def detect(self, img: maix.image.Image, conf_th: float = 0.7, iou_th: float = 0.45, conf_th2: float = 0.8, landmarks_rel: bool = False) -> Objects

Detect objects from image

item	description
type	func
param	img: image to detect; if its size does not match the model input size, it is automatically resized with the fit method. conf_th: hand detect confidence threshold, default 0.7. iou_th: IoU threshold, default 0.45. conf_th2: threshold for the second hand detect confidence check, default 0.8. landmarks_rel: if true, also output the coordinates of the 21 points relative to the top-left vertex of the hand box; in obj.points, the last 21x2 values are arranged as x0, y0, x1, y1, ..., x20, y20, with values from 0 to obj.w.
throw	If the image format does not match the model input format, throws err::Exception.
return	Object list. In C++, you should delete it after use. Object's points value format: box_topleft_x, box_topleft_y, box_topright_x, box_topright_y, box_bottomright_x, box_bottomright_y, box_bottomleft_x, box_bottomleft_y, x0, y0, z0, x1, y1, z1, ..., x20, y20, z20. If landmarks_rel is True, the format is box_topleft_x, box_topleft_y, ..., x20, y20, z20, x0_rel, y0_rel, ..., x20_rel, y20_rel. Z is depth: the larger the value, the farther from the palm, and a positive value means closer to the camera.
static	False

C++ defination code:

nn::Objects *detect(image::Image &img, float conf_th = 0.7, float iou_th = 0.45, float conf_th2 = 0.8, bool landmarks_rel = false)
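The points layout returned by detect can be unpacked as in the sketch below: 8 box-corner values, then 21 landmarks of 3 values each, then (only when landmarks_rel is set) 21 relative landmarks of 2 values each. `split_hand_points` is a hypothetical helper that operates on a plain list such as obj.points. The layout follows the return description above, reading its z indices as running from 0 to 20.

```python
NUM_LANDMARKS = 21  # hand landmarks per detection, as documented above


def split_hand_points(points, landmarks_rel=False):
    """Split a flat points list into box corners, landmarks and relative landmarks."""
    # Four box corners: top-left, top-right, bottom-right, bottom-left.
    box = [(points[i], points[i + 1]) for i in range(0, 8, 2)]
    start = 8
    end = start + NUM_LANDMARKS * 3
    # 21 landmarks, each stored as (x, y, z).
    landmarks = [tuple(points[i:i + 3]) for i in range(start, end, 3)]
    rel = []
    if landmarks_rel:
        # 21 relative landmarks, each stored as (x_rel, y_rel).
        rel_end = end + NUM_LANDMARKS * 2
        rel = [(points[i], points[i + 1]) for i in range(end, rel_end, 2)]
    return box, landmarks, rel


# Synthetic data: the values are just their own indices.
fake_points = list(range(8 + 21 * 3 + 21 * 2))
box, landmarks, rel = split_hand_points(fake_points, landmarks_rel=True)
print(box[0], landmarks[0], rel[0])
```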

input_size

def input_size(self, detect: bool = True) -> maix.image.Size

Get model input size

item	description
type	func
param	detect: detect or landmarks model, default true.
return	model input size
static	False

C++ defination code:
image::Size input_size(bool detect = true)

input_width

def input_width(self, detect: bool = True) -> int

Get model input width

item	description
type	func
param	detect: detect or landmarks model, default true.
return	model input size of width
static	False

C++ defination code:
int input_width(bool detect = true)

input_height

def input_height(self, detect: bool = True) -> int

Get model input height

item	description
type	func
param	detect: detect or landmarks model, default true.
return	model input size of height
static	False

C++ defination code:
int input_height(bool detect = true)

input_format

def input_format(self) -> maix.image.Format

Get input image format

item	description
type	func
return	input image format, image::Format type.
static	False

C++ defination code:
image::Format input_format()

draw_hand

def draw_hand(self, img: maix.image.Image, leftright: int, points: list[int], r_min: int = 4, r_max: int = 10, box: bool = True, box_thickness: int = 1, box_color_l: maix.image.Color = ..., box_color_r: maix.image.Color = ...) -> None

Draw hand and landmarks on image

item	description
type	func
param	img: image object, maix.image.Image type. leftright: 0 means left, 1 means right. points: points result from the detect method: box_topleft_x, box_topleft_y, box_topright_x, box_topright_y, box_bottomright_x, box_bottomright_y, box_bottomleft_x, box_bottomleft_y, x0, y0, z0, x1, y1, z1, ..., x20, y20, z20. r_min: min radius of points. r_max: max radius of points. box: draw box or not, default true. box_thickness: thickness of the box line, default 1. box_color_l: color of the box for a left hand, default image::COLOR_RED. box_color_r: color of the box for a right hand, default image::COLOR_GREEN.
static	False

C++ defination code:

void draw_hand(image::Image &img, int leftright, const std::vector<int> &points, int r_min = 4, int r_max = 10, bool box = true, int box_thickness = 1, image::Color box_color_l = image::COLOR_RED, image::Color box_color_r = image::COLOR_GREEN)

labels

Labels list

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<string> labels

label_path

Label file path

item	description
type	var
static	False
readonly	False

C++ defination code:
std::string label_path

mean

Get mean value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> mean

scale

Get scale value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> scale

FaceLandmarksObject

FaceLandmarksObject class

C++ definition code:
class FaceLandmarksObject

init

def __init__(self) -> None

Constructor of FaceLandmarksObject class

item	description
type	func
static	False

C++ defination code:
FaceLandmarksObject()

valid

Valid or not (score > conf_th when detecting).

item	description
type	var
static	False
readonly	False

C++ defination code:
bool valid

score

Score of whether a face is in the image, value from 0 to 1.0.

item	description
type	var
static	False
readonly	False

C++ defination code:
float score

points

landmarks points, format: x0, y0, ..., xn-1, yn-1.

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<int> points

points_z

Z values of the landmark points, format: z0, z1, ..., zn-1.

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<int> points_z
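Since points holds interleaved x, y pairs and points_z holds one z per landmark, the two lists can be zipped into (x, y, z) tuples. The sketch below uses a hypothetical helper on plain lists and does not depend on the maix package.

```python
def face_points_3d(points, points_z):
    """Combine [x0, y0, x1, y1, ...] and [z0, z1, ...] into (x, y, z) tuples."""
    if len(points) != 2 * len(points_z):
        raise ValueError("points must hold exactly two values per z value")
    return [(points[2 * i], points[2 * i + 1], z) for i, z in enumerate(points_z)]


print(face_points_3d([10, 20, 30, 40], [1, 2]))
```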

FaceLandmarks

FaceLandmarks class

C++ definition code:
class FaceLandmarks

init

def __init__(self, model: str = '') -> None

Constructor of FaceLandmarks class

item	description
type	func
param	model: model path, default empty, you can load model later by load function.
throw	If model arg is not empty and load failed, will throw err::Exception.
static	False

C++ defination code:
FaceLandmarks(const string &model = "")

load

def load(self, model: str) -> maix.err.Err

Load model from file

item	description
type	func
param	model: Model path want to load
return	err::Err
static	False

C++ defination code:
err::Err load(const string &model)

detect

def detect(self, img: maix.image.Image, conf_th: float = 0.5, landmarks_abs: bool = True, landmarks_rel: bool = False) -> FaceLandmarksObject

Detect face landmarks from image

item	description
type	func
param	img: image to detect; if its size does not match the model input size, it is automatically resized with the fit method. conf_th: face detect confidence threshold, default 0.5. landmarks_abs: whether to output absolute landmark coordinates, default true. landmarks_rel: whether to output relative landmark coordinates, default false.
throw	If the image format does not match the model input format, throws err::Exception.
return	FaceLandmarksObject. In C++, you should delete it after use. Its points are formatted as x0, y0, ..., xn-1, yn-1 and its points_z as z0, z1, ..., zn-1; valid tells whether score > conf_th.
static	False

C++ defination code:

nn::FaceLandmarksObject *detect(image::Image &img, float conf_th = 0.5, bool landmarks_abs = true, bool landmarks_rel = false)

crop_image

def crop_image(self, img: maix.image.Image, x: int, y: int, w: int, h: int, points: list[int], new_width: int = -1, new_height: int = -1, scale: float = 1.2) -> maix.image.Image

Crop image from source image by 2 points (2 eyes)

item	description
type	func
param	img: source image. x, y, w, h: face rectangle; x, y is the left-top point. points: 2 points: eye_left_x, eye_left_y, eye_right_x, eye_right_y. scale: crop size scale relative to the rectangle's longer side (w or h); the final value is `scale * max(w, h)`, default 1.2.
static	False

C++ defination code:

maix::image::Image *crop_image(maix::image::Image &img, int x, int y, int w, int h, std::vector<int> points, int new_width = -1, int new_height = -1, float scale = 1.2)

input_size

def input_size(self, detect: bool = True) -> maix.image.Size

Get model input size

item	description
type	func
param	detect: detect or landmarks model, default true.
return	model input size
static	False

C++ defination code:
image::Size input_size(bool detect = true)

input_width

def input_width(self, detect: bool = True) -> int

Get model input width

item	description
type	func
param	detect: detect or landmarks model, default true.
return	model input size of width
static	False

C++ defination code:
int input_width(bool detect = true)

input_height

def input_height(self, detect: bool = True) -> int

Get model input height

item	description
type	func
param	detect: detect or landmarks model, default true.
return	model input size of height
static	False

C++ defination code:
int input_height(bool detect = true)

input_format

def input_format(self) -> maix.image.Format

Get input image format

item	description
type	func
return	input image format, image::Format type.
static	False

C++ defination code:
image::Format input_format()

draw_face

def draw_face(self, img: maix.image.Image, points: list[int], num: int, points_z: list[int] = [], r_min: int = 2, r_max: int = 4) -> None

Draw face landmarks on image

item	description
type	func
param	img: image object, maix.image.Image type. points: points result from the detect method: x0, y0, x1, y1, ..., xn-1, yn-1. num: number of landmarks. points_z: points result from the detect method: z0, z1, ..., zn-1. r_min: min radius of points. r_max: max radius of points.
static	False

C++ defination code:

void draw_face(image::Image &img, const std::vector<int> &points, int num, const std::vector<int> &points_z=std::vector<int>(), int r_min = 2, int r_max = 4)

mean

Get mean value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> mean

scale

Get scale value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> scale

landmarks_num

landmarks number.

item	description
type	var
static	False
readonly	False

C++ defination code:
int landmarks_num

DepthAnything

C++ definition code:
class DepthAnything

init

def __init__(self, model: str = '', dual_buff: bool = True) -> None

Construct a new DepthAnything object

item	description
type	func
param	model: MUD model path. If empty, no model is loaded and you can call load() later; if not empty, the model is loaded and err::Exception is raised if loading fails. dual_buff: prepare dual input/output buffers to accelerate forward; that is, while the NPU is running forward we do not wait, and prepare the next input buffer instead. If you need every forward call to output the result of its own input, set this argument to false. Default true, for speed.
static	False

C++ defination code:

DepthAnything(const string &model = "", bool dual_buff = true)

load

def load(self, model: str) -> maix.err.Err

Load model from file. The model format is .mud, and the MUD file should contain an [extra] section with these key-values:
- model_type: depth_anything_v2
- input_type: rgb or bgr
- mean: 123.675, 116.28, 103.53
- scale: 0.017124753831663668, 0.01750700280112045, 0.017429193899782137
- labels: imagenet_classes.txt

item	description
type	func
param	model: MUD model path
return	error code, if load failed, return error code
static	False

C++ defination code:
err::Err load(const string &model)
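To show what the [extra] key-values look like in practice, the sketch below parses such a section with Python's standard configparser. It assumes the MUD file uses INI-style syntax, which this section does not spell out. The sample text and the `read_extra` helper are hypothetical, and only the keys and values come from the description above.

```python
import configparser

# Hypothetical MUD excerpt; only the [extra] section is shown.
MUD_TEXT = """
[extra]
model_type: depth_anything_v2
input_type: rgb
mean: 123.675, 116.28, 103.53
scale: 0.017124753831663668, 0.01750700280112045, 0.017429193899782137
"""


def read_extra(mud_text):
    """Parse the [extra] section and convert mean/scale to float lists."""
    parser = configparser.ConfigParser()
    parser.read_string(mud_text)
    extra = dict(parser["extra"])
    for key in ("mean", "scale"):
        extra[key] = [float(value) for value in extra[key].split(",")]
    return extra


extra = read_extra(MUD_TEXT)
print(extra["model_type"], extra["input_type"], extra["mean"])
```

Checking these keys by hand before calling load() can help narrow down a load failure to a malformed MUD file.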

get_depth

def get_depth(self, img: maix.image.Image, fit: maix.image.Fit = ...) -> maix.tensor.Tensor

Forward model and get raw image depth estimation data.

item	description
type	func
param	img: image; its format should match the model input_type, otherwise err.Exception is raised. fit: image resize fit mode. If the input image size differs from the model's input size, the image is automatically resized to the model's input size for inference, and the result is recovered to the input image size. Default Fit.FIT_CONTAIN, see image.Fit.
throw	If error occurred, will raise err::Exception, you can find reason in log, mostly caused by args error or hardware error.
return	result, a tensor.Tensor object. If in dual_buff mode, value can be None(in Python) or nullptr(in C++) when not ready. In C++, you need to delete it after use.
static	False

C++ defination code:

tensor::Tensor *get_depth(image::Image &img, image::Fit fit = image::FIT_CONTAIN)

get_depth_image

def get_depth_image(self, img: maix.image.Image, fit: maix.image.Fit = ..., cmap: maix.image.CMap = ...) -> maix.image.Image

Run the model forward and get the image depth estimation data, normalized to [0, 255], as an image.Image object.

item	description
type	func
param	img: image; its format should match the model input_type, otherwise err.Exception is raised. fit: image resize fit mode. If the input image size differs from the model's input size, the image is automatically resized to the model's input size for inference, and the result is recovered to the input image size. Default Fit.FIT_CONTAIN, see image.Fit. cmap: color map used to convert the grayscale distance estimation image to an RGB image. Different color maps change the appearance of the final image. Default image.CMap.INFERNO.
throw	If error occurred, will raise err::Exception, you can find reason in log, mostly caused by args error or hardware error.
return	result, a image::Image object. If in dual_buff mode, value can be None(in Python) or nullptr(in C++) when not ready. In C++, you need to delete it after use.
static	False

C++ defination code:

image::Image *get_depth_image(image::Image &img, image::Fit fit = image::FIT_CONTAIN, image::CMap cmap = image::CMap::INFERNO)
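The description says the depth data is normalized to [0, 255] but does not say how. One common choice is min-max scaling, sketched below on a plain list of floats. This is an assumption for illustration, not necessarily what get_depth_image does internally, and `normalize_to_u8` is a hypothetical helper.

```python
def normalize_to_u8(depth):
    """Min-max scale a list of raw depth values to integers in [0, 255]."""
    lo, hi = min(depth), max(depth)
    if hi == lo:
        # A constant map has no range to stretch; return all zeros.
        return [0] * len(depth)
    return [round((d - lo) / (hi - lo) * 255) for d in depth]


print(normalize_to_u8([0.0, 1.0, 4.0]))
```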

input_size

def input_size(self) -> maix.image.Size

Get model input size, only for image input

item	description
type	func
return	model input size
static	False

C++ defination code:
image::Size input_size()

input_width

def input_width(self) -> int

Get model input width, only for image input

item	description
type	func
return	model input size of width
static	False

C++ defination code:
int input_width()

input_height

def input_height(self) -> int

Get model input height, only for image input

item	description
type	func
return	model input size of height
static	False

C++ defination code:
int input_height()

input_format

def input_format(self) -> maix.image.Format

Get input image format, only for image input

item	description
type	func
return	input image format, image::Format type.
static	False

C++ defination code:
image::Format input_format()

input_shape

def input_shape(self) -> list[int]

Get input shape, if have multiple input, only return first input shape

item	description
type	func
return	input shape, list type
static	False

C++ defination code:
std::vector<int> input_shape()

mean

Get mean value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> mean

scale

Get scale value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> scale
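The mean and scale lists are per-channel preprocessing values. A plausible reading, assumed here rather than stated in this document, is that each channel is normalized as (value - mean) * scale. The example scale values listed for the MUD file are close to 1 / 58.395, 1 / 57.12 and 1 / 57.375, which fits that reading. `normalize_pixel` is a hypothetical helper.

```python
MEAN = [123.675, 116.28, 103.53]
SCALE = [0.017124753831663668, 0.01750700280112045, 0.017429193899782137]


def normalize_pixel(pixel, mean=MEAN, scale=SCALE):
    """Apply (value - mean) * scale per channel (assumed formula)."""
    return [(value - m) * s for value, m, s in zip(pixel, mean, scale)]


print(normalize_pixel((123.675, 116.28, 103.53)))
print(normalize_pixel((255, 255, 255)))
```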

Whisper

Whisper class

C++ definition code:
class Whisper

init

def __init__(self, model: str = '', language: str = 'zh') -> None

Constructor of Whisper class

item	description
type	func
param	model: model path, default empty, you can load model later by load function. language: language code, default "zh", supported language code: "zh", "en"
throw	If model arg is not empty and load failed, will throw err::Exception.
static	False

C++ defination code:

Whisper(const string &model = "", std::string language = "zh")

load

def load(self, model: str) -> maix.err.Err

Load model from file

item	description
type	func
param	model: Model path want to load
return	err::Err
static	False

C++ defination code:
err::Err load(const string &model)

transcribe

def transcribe(self, file: str) -> str

Transcribe audio file to text

item	description
type	func
note	If the wav file has multiple channels, only the first channel will be used.
param	file: Pass in an audio file, supporting files in WAV format.
return	The output result after automatic speech recognition.
static	False

C++ defination code:
std::string transcribe(std::string &file)
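The note about multi-channel WAV files can be pictured with the sketch below, which keeps only the first channel of interleaved 16-bit PCM. It illustrates what "only the first channel" means for interleaved samples. It is not the library's implementation, and `first_channel` is a hypothetical helper.

```python
from array import array


def first_channel(pcm, channels):
    """Keep every channels-th 16-bit sample, starting with the first one."""
    samples = array("h")  # signed 16-bit samples
    samples.frombytes(pcm)
    return samples[::channels].tobytes()


# Stereo data: L0, R0, L1, R1 (each sample is two bytes).
stereo = bytes([1, 0, 2, 0, 3, 0, 4, 0])
print(first_channel(stereo, 2))
```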

transcribe_raw

def transcribe_raw(self, pcm: maix.Bytes(bytes), sample_rate: int = 16000, channels: int = 1, bits_per_frame: int = 16) -> str

Transcribe pcm data to text

item	description
type	func
param	pcm: RAW data
return	The output result after automatic speech recognition.
static	False

C++ defination code:

std::string transcribe_raw(Bytes *pcm, int sample_rate = 16000, int channels = 1, int bits_per_frame = 16)
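When preparing a buffer for transcribe_raw, it can help to check how much audio the raw bytes represent. The sketch below assumes bits_per_frame means bits per sample per channel, which this document does not define. `pcm_duration_seconds` is a hypothetical helper, and its defaults mirror the signature above.

```python
def pcm_duration_seconds(num_bytes, sample_rate=16000, channels=1, bits_per_frame=16):
    """Duration of a raw PCM buffer, assuming bits_per_frame is bits per sample."""
    bytes_per_second = sample_rate * channels * bits_per_frame // 8
    return num_bytes / bytes_per_second


# 32000 bytes of 16 kHz, mono, 16-bit audio.
print(pcm_duration_seconds(32000))
```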

input_pcm_samplerate

def input_pcm_samplerate(self) -> int

Get input pcm samplerate

item	description
type	func
return	input pcm samplerate
static	False

C++ defination code:
int input_pcm_samplerate()

input_pcm_channels

def input_pcm_channels(self) -> int

Get input pcm channels

item	description
type	func
return	input pcm channels
static	False

C++ defination code:
int input_pcm_channels()

input_pcm_bits_per_frame

def input_pcm_bits_per_frame(self) -> int

Get input pcm bits per frame

item	description
type	func
return	input pcm bits per frame
static	False

C++ defination code:
int input_pcm_bits_per_frame()

SelfLearnClassifier

C++ definition code:
class SelfLearnClassifier

init

def __init__(self, model: str = '', dual_buff: bool = True) -> None

Construct a new SelfLearnClassifier object

item	description
type	func
param	model: MUD model path. If empty, no model is loaded and you can call load_model() later; if not empty, the model is loaded and err::Exception is raised if loading fails. dual_buff: prepare dual input/output buffers to accelerate forward; that is, while the NPU is running forward we do not wait, and prepare the next input buffer instead. If you need every forward call to output the result of its own input, set this argument to false. Default true, for speed.
static	False

C++ defination code:

SelfLearnClassifier(const std::string &model = "", bool dual_buff = true)

load_model

def load_model(self, model: str) -> maix.err.Err

Load model from file. The model format is .mud, and the MUD file should contain an [extra] section with these key-values:
- model_type: classifier_no_top
- input_type: rgb or bgr
- mean: 123.675, 116.28, 103.53
- scale: 0.017124753831663668, 0.01750700280112045, 0.017429193899782137

item	description
type	func
param	model: MUD model path
return	error code, if load failed, return error code
static	False

C++ defination code:
err::Err load_model(const string &model)

classify

def classify(self, img: maix.image.Image, fit: maix.image.Fit = ...) -> list[tuple[int, float]]

Classify image

item	description
type	func
param	img: image; its format should match the model input_type, otherwise err.Exception is raised. fit: image resize fit mode, default Fit.FIT_COVER, see image.Fit.
throw	If error occurred, will raise err::Exception, you can find reason in log, mostly caused by args error or hardware error.
return	result, a list of (idx, distance), smaller distance means more similar. In C++, you need to delete it after use.
static	False

C++ defination code:

std::vector<std::pair<int, float>> *classify(image::Image &img, image::Fit fit = image::FIT_COVER)
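Because classify returns (idx, distance) pairs where a smaller distance means more similar, picking the best class is a minimum search over the distance. The sketch below works on a plain list shaped like that return value. `best_match` and the sample labels are hypothetical.

```python
def best_match(results, labels=None):
    """Return (idx, distance, name) for the smallest distance, or None if empty."""
    if not results:
        return None
    idx, distance = min(results, key=lambda item: item[1])
    # Fall back to the index as a name when no usable label is available.
    name = labels[idx] if labels and idx < len(labels) else str(idx)
    return idx, distance, name


results = [(0, 0.82), (1, 0.15), (2, 0.40)]
print(best_match(results, labels=["cup", "pen", "key"]))
```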

add_class

def add_class(self, img: maix.image.Image, fit: maix.image.Fit = ...) -> None

Add a class to recognize

item	description
type	func
param	img: Add a image as a new class fit: image resize fit mode, default Fit.FIT_COVER, see image.Fit.
static	False

C++ defination code:

void add_class(image::Image &img, image::Fit fit = image::FIT_COVER)

class_num

def class_num(self) -> int

Get class number

item	description
type	func
static	False

C++ defination code:
int class_num()

rm_class

def rm_class(self, idx: int) -> maix.err.Err

Remove a class

item	description
type	func
param	idx: index, value in the range [0, class_num()).
static	False

C++ defination code:
err::Err rm_class(int idx)

add_sample

def add_sample(self, img: maix.image.Image, fit: maix.image.Fit = ...) -> None

Add a sample. You should call the learn method after adding some samples to learn classes. A sample image can belong to any of the classes already added.

item	description
type	func
param	img: Add a image as a new sample.
static	False

C++ defination code:

void add_sample(image::Image &img, image::Fit fit = image::FIT_COVER)

rm_sample

def rm_sample(self, idx: int) -> maix.err.Err

Remove a sample

item	description
type	func
param	idx: index, value in the range [0, sample_num()).
static	False

C++ defination code:
err::Err rm_sample(int idx)

sample_num

def sample_num(self) -> int

Get sample number

item	description
type	func
static	False

C++ defination code:
int sample_num()

learn

def learn(self) -> int

Start automatically learning class features from the class images and samples. You should call this method after adding some samples.

item	description
type	func
return	learn epoch(times), 0 means learn nothing.
static	False

C++ defination code:
int learn()

clear

def clear(self) -> None

Clear all class and samples

item	description
type	func
static	False

C++ defination code:
void clear()

input_size

def input_size(self) -> maix.image.Size

Get model input size, only for image input

item	description
type	func
return	model input size
static	False

C++ defination code:
image::Size input_size()

input_width

def input_width(self) -> int

Get model input width, only for image input

item	description
type	func
return	model input size of width
static	False

C++ defination code:
int input_width()

input_height

def input_height(self) -> int

Get model input height, only for image input

item	description
type	func
return	model input size of height
static	False

C++ defination code:
int input_height()

input_format

def input_format(self) -> maix.image.Format

Get input image format, only for image input

item	description
type	func
return	input image format, image::Format type.
static	False

C++ defination code:
image::Format input_format()

input_shape

def input_shape(self) -> list[int]

Get input shape, if have multiple input, only return first input shape

item	description
type	func
return	input shape, list type
static	False

C++ defination code:
std::vector<int> input_shape()

save

def save(self, path: str, labels: list[str] = []) -> maix.err.Err

Save features and labels to a binary file

item	description
type	func
param	path: file path to save, e.g. /root/my_classes.bin. labels: class labels; can be None, otherwise its length must equal the class number, or err::Err is returned.
return	maix.err.Err: an error is returned if labels exist but their length does not equal the class number, if saving the file fails, or if the class number is 0.
static	False

C++ defination code:

err::Err save(const std::string &path, const std::vector<std::string> &labels = std::vector<std::string>())

load

def load(self, path: str) -> list[str]

Load features info from binary file

item	description
type	func
param	path: feature info binary file path, e.g. /root/my_classes.bin
static	False

C++ defination code:

std::vector<std::string> load(const std::string &path)

labels

Labels list

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<string> labels

label_path

Label file path

item	description
type	var
static	False
readonly	False

C++ defination code:
std::string label_path

mean

Get mean value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> mean

scale

Get scale value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> scale

YOLO11

YOLO11 class

C++ definition code:
class YOLO11

init

def __init__(self, model: str = '', dual_buff: bool = True) -> None

Constructor of YOLO11 class

item	description
type	func
param	model: model path, default empty; you can load a model later with the load function. dual_buff: prepare dual input/output buffers to accelerate forward; that is, while the NPU is running forward we do not wait, and prepare the next input buffer instead. If you need every forward call to output the result of its own input, set this argument to false. Default true, for speed.
throw	If the model argument is not empty and loading fails, throws err::Exception.
static	False

C++ defination code:

YOLO11(const string &model = "", bool dual_buff = true)

load

def load(self, model: str) -> maix.err.Err

Load model from file

item	description
type	func
param	model: Model path want to load
return	err::Err
static	False

C++ defination code:
err::Err load(const string &model)

detect

def detect(self, img: maix.image.Image, conf_th: float = 0.5, iou_th: float = 0.45, fit: maix.image.Fit = ..., keypoint_th: float = 0.5, sort: int = 0) -> Objects

Detect objects from image

item	description
type	func
param	img: image to detect; if its size does not match the model input size, it is automatically resized with the fit method. conf_th: confidence threshold, default 0.5. iou_th: IoU threshold, default 0.45. fit: resize method, default image.Fit.FIT_CONTAIN. keypoint_th: keypoint threshold, default 0.5, only for the yolo11-pose model. sort: sort results by object size; default 0 means no sorting, 1 means bigger objects first, -1 means smaller objects first.
throw	If the image format does not match the model input format, throws err::Exception.
return	Object list. In C++, you should delete it after use. If the model is yolo11-pose, each object's points have values, and a point value < 0 means that point is invalid (conf < keypoint_th).
static	False

C++ defination code:

nn::Objects *detect(image::Image &img, float conf_th = 0.5, float iou_th = 0.45, maix::image::Fit fit = maix::image::FIT_CONTAIN, float keypoint_th = 0.5, int sort = 0)
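For pose models, invalid keypoints are reported as negative values, so they are usually filtered out before drawing or further processing. The sketch below does this on a plain [x, y, x, y, ...] list such as obj.points. `valid_keypoints` is a hypothetical helper.

```python
def valid_keypoints(points):
    """Return (index, x, y) for every keypoint whose coordinates are not negative."""
    pairs = zip(points[0::2], points[1::2])
    return [(i, x, y) for i, (x, y) in enumerate(pairs) if x >= 0 and y >= 0]


# The second keypoint is marked invalid (conf < keypoint_th).
print(valid_keypoints([10, 20, -1, -1, 30, 40]))
```

Keeping the original index alongside each point preserves which body joint it refers to after filtering.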

input_size

def input_size(self) -> maix.image.Size

Get model input size

item	description
type	func
return	model input size
static	False

C++ defination code:
image::Size input_size()

input_width

def input_width(self) -> int

Get model input width

item	description
type	func
return	model input size of width
static	False

C++ defination code:
int input_width()

input_height

def input_height(self) -> int

Get model input height

item	description
type	func
return	model input size of height
static	False

C++ defination code:
int input_height()

input_format

def input_format(self) -> maix.image.Format

Get input image format

item	description
type	func
return	input image format, image::Format type.
static	False

C++ defination code:
image::Format input_format()

draw_pose

def draw_pose(self, img: maix.image.Image, points: list[int], radius: int = 4, color: maix.image.Color = ..., colors: list[maix.image.Color] = [], body: bool = True, close: bool = False) -> None

Draw pose keypoints on image

item	description
type	func
param	img: image object, maix.image.Image type. points: keypoints, int list type, [x, y, x, y ...]. radius: radius of points. color: color of points. colors: assign colors for points, list type, each element is an image.Color object. body: if true and the length of points is 17 * 2, lines are drawn as a human body; if false, no lines are drawn. Default true. close: connect all points to close a polygon, default false.
static	False

C++ defination code:

void draw_pose(image::Image &img, std::vector<int> points, int radius = 4, image::Color color = image::COLOR_RED, const std::vector<image::Color> &colors = std::vector<image::Color>(), bool body = true, bool close = false)

draw_seg_mask

def draw_seg_mask(self, img: maix.image.Image, x: int, y: int, seg_mask: maix.image.Image, threshold: int = 127) -> None

Draw segmentation on image

item	description
type	func
param	img: image object, maix.image.Image type. seg_mask: segmentation mask image from the detect method, a grayscale image. threshold: only mask values > threshold are drawn on the image; value range 0 to 255.
static	False

C++ defination code:

void draw_seg_mask(image::Image &img, int x, int y, image::Image &seg_mask, int threshold = 127)
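The threshold rule can be illustrated on plain nested lists. This is only a sketch of the idea, not the library's drawing code: overlay_mask is a hypothetical helper, and treating (x, y) as the position of the mask's top-left corner inside the image is an assumption, since the x and y parameters are not described above.

```python
def overlay_mask(canvas, x, y, mask, threshold=127, value=255):
    """Paint `value` into canvas wherever mask > threshold.

    canvas and mask are lists of rows of ints (0..255); (x, y) is assumed
    to be the position of the mask's top-left corner inside the canvas.
    """
    height = len(canvas)
    width = len(canvas[0]) if canvas else 0
    for row_idx, row in enumerate(mask):
        for col_idx, m in enumerate(row):
            cx, cy = x + col_idx, y + row_idx
            inside = 0 <= cx < width and 0 <= cy < height
            # Strictly greater than the threshold, as documented.
            if inside and m > threshold:
                canvas[cy][cx] = value
    return canvas


canvas = [[0] * 4 for _ in range(3)]
mask = [[200, 100],
        [127, 128]]
for row in overlay_mask(canvas, 1, 1, mask):
    print(row)
```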

labels

Labels list

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<string> labels

label_path

Label file path

item	description
type	var
static	False
readonly	False

C++ defination code:
std::string label_path

mean

Get mean value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> mean

scale

Get scale value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> scale

Retinaface

Retinaface class

C++ defination code:
class Retinaface

init

def __init__(self, model: str = '', dual_buff: bool = True) -> None

Constructor of Retinaface class

item	description
type	func
param	model: model path, default empty; you can load the model later with the load function. dual_buff: direction [in], prepare dual input/output buffers to accelerate forward; that is, while the NPU is forwarding, we do not wait and instead prepare the next input buffer. If you need every forward call to return the result of its own input, set this argument to false. Default true for speed.
throw	If model arg is not empty and load failed, will throw err::Exception.
static	False

C++ defination code:

Retinaface(const string &model = "", bool dual_buff = true)

load

def load(self, model: str) -> maix.err.Err

Load model from file

item	description
type	func
param	model: path of the model to load
return	err::Err
static	False

C++ defination code:
err::Err load(const string &model)

detect

def detect(self, img: maix.image.Image, conf_th: float = 0.4, iou_th: float = 0.45, fit: maix.image.Fit = ...) -> list[Object]

Detect objects from image

item	description
type	func
param	img: Image to detect; if the image's size does not match the model input size, it is resized automatically using the fit method. conf_th: Confidence threshold, default 0.4. iou_th: IoU threshold, default 0.45. fit: Resize method, default image.Fit.FIT_CONTAIN.
throw	If the image format does not match the model input format, err::Exception is thrown.
return	Object list. In C++, you should delete it after use.
static	False

C++ defination code:

std::vector<nn::Object> *detect(image::Image &img, float conf_th = 0.4, float iou_th = 0.45, maix::image::Fit fit = maix::image::FIT_CONTAIN)
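The iou_th parameter refers to intersection over union between candidate boxes. The sketch below shows the conventional IoU formula and a greedy non-maximum suppression pass over (x, y, w, h, score) tuples. It is a generic illustration, not the library's post-processing: iou and nms are hypothetical helpers, and whether the library suppresses at IoU equal to the threshold is not stated in this reference.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_th=0.45):
    """Greedy NMS over (x, y, w, h, score) tuples, highest score first."""
    kept = []
    for det in sorted(detections, key=lambda d: d[4], reverse=True):
        # Keep a box only if it does not overlap a kept box too much.
        if all(iou(det[:4], k[:4]) <= iou_th for k in kept):
            kept.append(det)
    return kept


candidates = [(0, 0, 10, 10, 0.9), (1, 0, 10, 10, 0.8), (50, 50, 10, 10, 0.7)]
print(nms(candidates))
```

A lower iou_th removes more overlapping boxes; a higher one keeps more near-duplicates.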

input_size

def input_size(self) -> maix.image.Size

Get model input size

item	description
type	func
return	model input size
static	False

C++ defination code:
image::Size input_size()

input_width

def input_width(self) -> int

Get model input width

item	description
type	func
return	model input size of width
static	False

C++ defination code:
int input_width()

input_height

def input_height(self) -> int

Get model input height

item	description
type	func
return	model input size of height
static	False

C++ defination code:
int input_height()

input_format

def input_format(self) -> maix.image.Format

Get input image format

item	description
type	func
return	input image format, image::Format type.
static	False

C++ defination code:
image::Format input_format()

mean

Get mean value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> mean

scale

Get scale value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> scale

YOLOWorld

YOLOWorld class

C++ defination code:
class YOLOWorld

init

def __init__(self, model: str = '', text_feature: str = '', labels: str = '', dual_buff: bool = True) -> None

Constructor of YOLOWorld class

item	description
type	func
param	model: model path, default empty; you can load the model later with the load function. text_feature: class text feature path (see the load method for more info), default empty; you can load the class text feature later with the load function. labels: Class labels or labels file path. If a string of class labels: labels separated by commas, e.g. "person, car, cat". If a file path: path of a labels file in which each line is a label. dual_buff: direction [in], prepare dual input/output buffers to accelerate forward; that is, while the NPU is forwarding, we do not wait and instead prepare the next input buffer. If you need every forward call to return the result of its own input, set this argument to false. Default true for speed.
throw	If model arg is not empty and load failed, will throw err::Exception.
static	False

C++ defination code:

YOLOWorld(const string &model = "", const string &text_feature = "", const string &labels = "", bool dual_buff = true)

load

def load(self, model: str, text_feature: str, labels: str) -> maix.err.Err

Load model from file

item	description
type	func
param	model: path of the model to load. text_feature: Class text feature bin file path. labels: Class labels or labels file path. If a string of class labels: labels separated by commas, e.g. "person, car, cat". If a file path: path of a labels file in which each line is a label.
return	err::Err
static	False

C++ defination code:

err::Err load(const string &model, const string &text_feature, const string &labels)
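The two accepted forms of the labels argument can be sketched in plain Python. parse_labels is a hypothetical helper, and deciding between the forms by checking whether the string names an existing file is an assumption; this reference does not say how the library tells them apart.

```python
import os


def parse_labels(labels):
    """Return a list of label strings from a comma list or a labels file."""
    if os.path.isfile(labels):
        # File form: one label per line, blank lines ignored.
        with open(labels, encoding="utf-8") as f:
            return [line.strip() for line in f if line.strip()]
    # String form: labels separated by commas.
    return [name.strip() for name in labels.split(",") if name.strip()]


print(parse_labels("person, car, cat"))
```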

learn_text_feature

def learn_text_feature(model: str, labels: list[str], feature_path: str, labels_path: str) -> maix.err.Err

Set detector class labels dynamically; this generates the class text feature and saves it to the text_feature path set in the load method or constructor.

item	description
type	func
param	labels: class labels you want to recognize, list type. e.g. ["person", "car", "cat"]
return	err::Err
static	True

C++ defination code:

static err::Err learn_text_feature(const std::string &model, std::vector<std::string> labels, const std::string &feature_path, const std::string &labels_path)

detect

def detect(self, img: maix.image.Image, conf_th: float = 0.5, iou_th: float = 0.45, fit: maix.image.Fit = ..., sort: int = 0) -> Objects

Detect objects from image

item	description
type	func
param	img: Image to detect; if the image's size does not match the model input size, it is resized automatically using the fit method. conf_th: Confidence threshold, default 0.5. iou_th: IoU threshold, default 0.45. fit: Resize method, default image.Fit.FIT_CONTAIN. sort: Sort results by object size; default 0 means no sorting, 1 means bigger first, -1 means smaller first.
throw	If the image format does not match the model input format, err::Exception is thrown.
return	Object list. In C++, you should delete it after use.
static	False

C++ defination code:

nn::Objects *detect(image::Image &img, float conf_th = 0.5, float iou_th = 0.45, maix::image::Fit fit = maix::image::FIT_CONTAIN, int sort = 0)
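The three values of sort can be illustrated with plain (x, y, w, h) tuples. sort_by_size is a hypothetical helper, and measuring "object size" as the area w * h is an assumption; the reference only says results are sorted by size.

```python
def sort_by_size(boxes, sort=0):
    """Order (x, y, w, h) boxes by area; size is assumed to mean w * h."""
    if sort not in (-1, 0, 1):
        raise ValueError("sort must be -1, 0 or 1")
    if sort == 0:
        return list(boxes)  # keep detection order
    # sort == 1 puts bigger boxes first, sort == -1 puts smaller ones first.
    return sorted(boxes, key=lambda b: b[2] * b[3], reverse=(sort == 1))


boxes = [(0, 0, 10, 10), (0, 0, 30, 20), (0, 0, 5, 5)]
print(sort_by_size(boxes, sort=1))
print(sort_by_size(boxes, sort=-1))
```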

input_size

def input_size(self) -> maix.image.Size

Get model input size

item	description
type	func
return	model input size
static	False

C++ defination code:
image::Size input_size()

input_width

def input_width(self) -> int

Get model input width

item	description
type	func
return	model input size of width
static	False

C++ defination code:
int input_width()

input_height

def input_height(self) -> int

Get model input height

item	description
type	func
return	model input size of height
static	False

C++ defination code:
int input_height()

input_format

def input_format(self) -> maix.image.Format

Get input image format

item	description
type	func
return	input image format, image::Format type.
static	False

C++ defination code:
image::Format input_format()

labels

Labels list

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<string> labels

mean

Get mean value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> mean

scale

Get scale value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> scale

YOLOv5

YOLOv5 class

C++ defination code:
class YOLOv5

init

def __init__(self, model: str = '', dual_buff: bool = True) -> None

Constructor of YOLOv5 class

item	description
type	func
param	model: model path, default empty; you can load the model later with the load function. dual_buff: direction [in], prepare dual input/output buffers to accelerate forward; that is, while the NPU is forwarding, we do not wait and instead prepare the next input buffer. If you need every forward call to return the result of its own input, set this argument to false. Default true for speed.
throw	If model arg is not empty and load failed, will throw err::Exception.
static	False

C++ defination code:

YOLOv5(const string &model = "", bool dual_buff = true)

load

def load(self, model: str) -> maix.err.Err

Load model from file

item	description
type	func
param	model: path of the model to load
return	err::Err
static	False

C++ defination code:
err::Err load(const string &model)

detect

def detect(self, img: maix.image.Image, conf_th: float = 0.5, iou_th: float = 0.45, fit: maix.image.Fit = ..., sort: int = 0) -> list[Object]

Detect objects from image

item	description
type	func
param	img: Image to detect; if the image's size does not match the model input size, it is resized automatically using the fit method. conf_th: Confidence threshold, default 0.5. iou_th: IoU threshold, default 0.45. fit: Resize method, default image.Fit.FIT_CONTAIN. sort: Sort results by object size; default 0 means no sorting, 1 means bigger first, -1 means smaller first.
throw	If the image format does not match the model input format, err::Exception is thrown.
return	Object list. In C++, you should delete it after use.
static	False

C++ defination code:

std::vector<nn::Object> *detect(image::Image &img, float conf_th = 0.5, float iou_th = 0.45, maix::image::Fit fit = maix::image::FIT_CONTAIN, int sort = 0)

input_size

def input_size(self) -> maix.image.Size

Get model input size

item	description
type	func
return	model input size
static	False

C++ defination code:
image::Size input_size()

input_width

def input_width(self) -> int

Get model input width

item	description
type	func
return	model input size of width
static	False

C++ defination code:
int input_width()

input_height

def input_height(self) -> int

Get model input height

item	description
type	func
return	model input size of height
static	False

C++ defination code:
int input_height()

input_format

def input_format(self) -> maix.image.Format

Get input image format

item	description
type	func
return	input image format, image::Format type.
static	False

C++ defination code:
image::Format input_format()

labels

Labels list

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<string> labels

label_path

Label file path

item	description
type	var
static	False
readonly	False

C++ defination code:
std::string label_path

mean

Get mean value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> mean

scale

Get scale value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> scale

anchors

Get anchors

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> anchors

Classifier

C++ defination code:
class Classifier

init

def __init__(self, model: str = '', dual_buff: bool = True) -> None

Construct a new Classifier object

item	description
type	func
param	model: MUD model path; if empty, no model is loaded and you can call load() later. If not empty, the model is loaded, and err::Exception is raised if loading fails. dual_buff: direction [in], prepare dual input/output buffers to accelerate forward; that is, while the NPU is forwarding, we do not wait and instead prepare the next input buffer. If you need every forward call to return the result of its own input, set this argument to false. Default true for speed.
static	False

C++ defination code:

Classifier(const string &model = "", bool dual_buff = true)

load

def load(self, model: str) -> maix.err.Err

Load model from file; the model format is .mud.
The MUD file should contain an [extra] section with these key-values:
- model_type: classifier
- input_type: rgb or bgr
- mean: 123.675, 116.28, 103.53
- scale: 0.017124753831663668, 0.01750700280112045, 0.017429193899782137
- labels: imagenet_classes.txt

item	description
type	func
param	model: MUD model path
return	error code, if load failed, return error code
static	False

C++ defination code:
err::Err load(const string &model)
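To make the [extra] keys concrete, here is a minimal sketch that reads them with Python's configparser. It assumes the MUD file is INI-style text that configparser can parse; MUD_TEXT is hypothetical content containing only the keys listed above (a real file needs other sections that this reference does not describe), and read_extra is a hypothetical helper.

```python
import configparser

# Hypothetical MUD content; only the [extra] keys come from the description above.
MUD_TEXT = """
[extra]
model_type = classifier
input_type = rgb
mean = 123.675, 116.28, 103.53
scale = 0.017124753831663668, 0.01750700280112045, 0.017429193899782137
labels = imagenet_classes.txt
"""


def read_extra(mud_text):
    """Parse the [extra] section of INI-style MUD text into a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(mud_text)
    extra = dict(parser["extra"])
    # mean and scale are comma-separated per-channel floats.
    for key in ("mean", "scale"):
        extra[key] = [float(v) for v in extra[key].split(",")]
    return extra


extra = read_extra(MUD_TEXT)
print(extra["model_type"], extra["input_type"], extra["mean"])
```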

classify

def classify(self, img: maix.image.Image, softmax: bool = True, fit: maix.image.Fit = ...) -> list[tuple[int, float]]

Forward image to model, get result. Only for image input, use classify_raw for tensor input.

item	description
type	func
param	img: image; its format should match the model's input_type, otherwise err.Exception is raised. softmax: if true, softmax is applied to the result; otherwise raw values are returned. fit: image resize fit mode, default Fit.FIT_COVER, see image.Fit.
throw	If an error occurs, err::Exception is raised; you can find the reason in the log. It is mostly caused by an argument error or a hardware error.
return	result, a list of (label, score). In dual_buff mode, the list may contain a single element with a score of zero when the result is not ready. In C++, you need to delete it after use.
static	False

C++ defination code:

std::vector<std::pair<int, float>> *classify(image::Image &img, bool softmax = true, image::Fit fit = image::FIT_COVER)
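The softmax flag decides whether scores are normalized into probabilities or left as raw values. A minimal sketch of that difference on a plain list of scores follows; softmax and top_k are hypothetical helpers, and the descending order and the value of k are illustrative choices, not documented behavior of classify.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(scores, k=3, apply_softmax=True):
    """Return [(class_index, score), ...] sorted by descending score."""
    values = softmax(scores) if apply_softmax else list(scores)
    ranked = sorted(enumerate(values), key=lambda item: item[1], reverse=True)
    return ranked[:k]


raw = [1.0, 3.0, 2.0]  # made-up raw outputs for three classes
print(top_k(raw, k=2))
print(top_k(raw, k=2, apply_softmax=False))
```

Softmax does not change which class ranks first; it only rescales the scores so that they sum to 1.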

classify_raw

def classify_raw(self, data: maix.tensor.Tensor, softmax: bool = True) -> list[tuple[int, float]]

Forward tensor data to model, get result

item	description
type	func
param	data: tensor data; its format should match the model's input_type, otherwise err.Exception is raised. softmax: if true, softmax is applied to the result; otherwise raw values are returned.
throw	If error occurred, will raise err::Exception, you can find reason in log, mostly caused by args error or hardware error.
return	result, a list of (label, score). In C++, you need to delete it after use.
static	False

C++ defination code:

std::vector<std::pair<int, float>> *classify_raw(tensor::Tensor &data, bool softmax = true)

input_size

def input_size(self) -> maix.image.Size

Get model input size, only for image input

item	description
type	func
return	model input size
static	False

C++ defination code:
image::Size input_size()

input_width

def input_width(self) -> int

Get model input width, only for image input

item	description
type	func
return	model input size of width
static	False

C++ defination code:
int input_width()

input_height

def input_height(self) -> int

Get model input height, only for image input

item	description
type	func
return	model input size of height
static	False

C++ defination code:
int input_height()

input_format

def input_format(self) -> maix.image.Format

Get input image format, only for image input

item	description
type	func
return	input image format, image::Format type.
static	False

C++ defination code:
image::Format input_format()

input_shape

def input_shape(self) -> list[int]

Get input shape, if have multiple input, only return first input shape

item	description
type	func
return	input shape, list type
static	False

C++ defination code:
std::vector<int> input_shape()

labels

Labels list

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<string> labels

label_path

Label file path

item	description
type	var
static	False
readonly	False

C++ defination code:
std::string label_path

mean

Get mean value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> mean

scale

Get scale value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> scale
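The mean and scale lists are per-channel normalization constants. The sketch below uses the example values from the load description and assumes the common convention (value - mean) * scale; this reference does not spell out the formula, so treat it as an assumption. normalize_pixel is a hypothetical helper.

```python
MEAN = [123.675, 116.28, 103.53]
SCALE = [0.017124753831663668, 0.01750700280112045, 0.017429193899782137]


def normalize_pixel(rgb, mean=MEAN, scale=SCALE):
    """Apply (value - mean) * scale to each channel of one pixel."""
    return [(v - m) * s for v, m, s in zip(rgb, mean, scale)]


# Each example scale is the reciprocal of a per-channel standard deviation.
stds = [1.0 / s for s in SCALE]
print([round(s, 3) for s in stds])
print(normalize_pixel([255, 255, 255]))
```

Under this convention, a scale of 1 / std is equivalent to dividing by the standard deviation after subtracting the mean.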

PP_OCR

PP_OCR class

C++ defination code:
class PP_OCR

init

def __init__(self, model: str = '') -> None

Constructor of PP_OCR class

item	description
type	func
param	model: model path, default empty, you can load model later by load function.
throw	If model arg is not empty and load failed, will throw err::Exception.
static	False

C++ defination code:
PP_OCR(const string &model = "")

load

def load(self, model: str) -> maix.err.Err

Load model from file

item	description
type	func
param	model: path of the model to load
return	err::Err
static	False

C++ defination code:
err::Err load(const string &model)

detect

def detect(self, img: maix.image.Image, thresh: float = 0.3, box_thresh: float = 0.6, fit: maix.image.Fit = ..., char_box: bool = False) -> ...

Detect objects from image

item	description
type	func
param	img: Image to detect; if the image's size does not match the model input size, it is resized automatically using the fit method. thresh: Confidence threshold for pixels that contain characters, default 0.3. box_thresh: Box threshold; boxes with a probability higher than this value are valid, default 0.6. fit: Resize method, default image.Fit.FIT_CONTAIN. char_box: Calculate every character's box, default false; if true, you can get each character's box from nn.OCR_Object's char_boxes attribute.
throw	If the image format does not match the model input format, or there is not enough memory, err::Exception is thrown.
return	nn.OCR_Objects type. In C++, you should delete it after use.
static	False

C++ defination code:

nn::OCR_Objects *detect(image::Image &img, float thresh = 0.3, float box_thresh = 0.6, maix::image::Fit fit = maix::image::FIT_CONTAIN, bool char_box = false)

recognize

def recognize(self, img: maix.image.Image, box_points: list[int] = []) -> ...

Only recognize, not detect

item	description
type	func
param	img: image in which to recognize characters; it can be a standard cropped character image. If the cropped image is not standard, you can use box_points to specify where the characters' 4 corners are. box_points: list type, length must be 8 or 0; the default empty list means the image is not transformed to a standard image. Positions of the 4 points, format: [x1, y1, x2, y2, x3, y3, x4, y4], point 1 at the left-top, point 2 at the right-top... char_box: Calculate every character's box, default false; if true, you can get each character's box from nn.OCR_Object's char_boxes attribute.
static	False

C++ defination code:

nn::OCR_Object *recognize(image::Image &img, const std::vector<int> &box_points = std::vector<int>())

draw_seg_mask

def draw_seg_mask(self, img: maix.image.Image, x: int, y: int, seg_mask: maix.image.Image, threshold: int = 127) -> None

Draw segmentation on image

item	description
type	func
param	img: image object, maix.image.Image type. seg_mask: segmentation mask image from the detect method, a grayscale image. threshold: only mask values > threshold are drawn on the image; value range 0 to 255.
static	False

C++ defination code:

void draw_seg_mask(image::Image &img, int x, int y, image::Image &seg_mask, int threshold = 127)

input_size

def input_size(self) -> maix.image.Size

Get model input size

item	description
type	func
return	model input size
static	False

C++ defination code:
image::Size input_size()

input_width

def input_width(self) -> int

Get model input width

item	description
type	func
return	model input size of width
static	False

C++ defination code:
int input_width()

input_height

def input_height(self) -> int

Get model input height

item	description
type	func
return	model input size of height
static	False

C++ defination code:
int input_height()

input_format

def input_format(self) -> maix.image.Format

Get input image format

item	description
type	func
return	input image format, image::Format type.
static	False

C++ defination code:
image::Format input_format()

mean

Get mean value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> mean

scale

Get scale value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> scale

rec_mean

Get mean value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> rec_mean

rec_scale

Get scale value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> rec_scale

labels

Labels (characters)

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<std::string> labels

det

Whether the model includes a detection model

item	description
type	var
static	False
readonly	False

C++ defination code:
bool det

rec

Whether the model includes a recognition model

item	description
type	var
static	False
readonly	False

C++ defination code:
bool rec

YOLOv8

YOLOv8 class

C++ defination code:
class YOLOv8 : public YOLO11

init

def __init__(self, model: str = '', dual_buff: bool = True) -> None

Constructor of YOLOv8 class

item	description
type	func
param	model: model path, default empty; you can load the model later with the load function. dual_buff: direction [in], prepare dual input/output buffers to accelerate forward; that is, while the NPU is forwarding, we do not wait and instead prepare the next input buffer. If you need every forward call to return the result of its own input, set this argument to false. Default true for speed.
throw	If model arg is not empty and load failed, will throw err::Exception.
static	False

C++ defination code:

YOLOv8(const string &model = "", bool dual_buff = true)

load

def load(self, model: str) -> maix.err.Err

Load model from file

item	description
type	func
param	model: path of the model to load
return	err::Err
static	False

C++ defination code:
err::Err load(const string &model)

detect

def detect(self, img: maix.image.Image, conf_th: float = 0.5, iou_th: float = 0.45, fit: maix.image.Fit = ..., keypoint_th: float = 0.5, sort: int = 0) -> Objects

Detect objects from image

item	description
type	func
param	img: Image to detect; if the image's size does not match the model input size, it is resized automatically using the fit method. conf_th: Confidence threshold, default 0.5. iou_th: IoU threshold, default 0.45. fit: Resize method, default image.Fit.FIT_CONTAIN. keypoint_th: Keypoint threshold, default 0.5, only for yolov8-pose models. sort: Sort results by object size; default 0 means no sorting, 1 means bigger first, -1 means smaller first.
throw	If the image format does not match the model input format, err::Exception is thrown.
return	Object list. In C++, you should delete it after use. If the model is yolov8-pose, each object's points are filled in; a point value < 0 means that point is invalid (conf < keypoint_th).
static	False

C++ defination code:

nn::Objects *detect(image::Image &img, float conf_th = 0.5, float iou_th = 0.45, maix::image::Fit fit = maix::image::FIT_CONTAIN, float keypoint_th = 0.5, int sort = 0)

input_size

def input_size(self) -> maix.image.Size

Get model input size

item	description
type	func
return	model input size
static	False

C++ defination code:
image::Size input_size()

input_width

def input_width(self) -> int

Get model input width

item	description
type	func
return	model input size of width
static	False

C++ defination code:
int input_width()

input_height

def input_height(self) -> int

Get model input height

item	description
type	func
return	model input size of height
static	False

C++ defination code:
int input_height()

input_format

def input_format(self) -> maix.image.Format

Get input image format

item	description
type	func
return	input image format, image::Format type.
static	False

C++ defination code:
image::Format input_format()

draw_pose

def draw_pose(self, img: maix.image.Image, points: list[int], radius: int = 4, color: maix.image.Color = ..., colors: list[maix.image.Color] = [], body: bool = True, close: bool = False) -> None

Draw pose keypoints on image

item	description
type	func
param	img: image object, maix.image.Image type. points: keypoints, int list type, [x, y, x, y ...]. radius: radius of points. color: color of points. colors: assign colors for points, list type, each element is an image.Color object. body: if true and the length of points is 17 * 2, lines are drawn as a human body; if false, no lines are drawn. Default true. close: connect all points to close a polygon, default false.
static	False

C++ defination code:

void draw_pose(image::Image &img, std::vector<int> points, int radius = 4, image::Color color = image::COLOR_RED, const std::vector<image::Color> &colors = std::vector<image::Color>(), bool body = true, bool close = false)

draw_seg_mask

def draw_seg_mask(self, img: maix.image.Image, x: int, y: int, seg_mask: maix.image.Image, threshold: int = 127) -> None

Draw segmentation on image

item	description
type	func
param	img: image object, maix.image.Image type. seg_mask: segmentation mask image from the detect method, a grayscale image. threshold: only mask values > threshold are drawn on the image; value range 0 to 255.
static	False

C++ defination code:

void draw_seg_mask(image::Image &img, int x, int y, image::Image &seg_mask, int threshold = 127)

labels

Labels list

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<string> labels

label_path

Label file path

item	description
type	var
static	False
readonly	False

C++ defination code:
std::string label_path

mean

Get mean value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> mean

scale

Get scale value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> scale

MeloTTS

MeloTTS class

C++ defination code:
class MeloTTS

init

def __init__(self, model: str = '', language: str = 'zh', speed: float = 0.800000011920929, noise_scale: float = 0.30000001192092896, noise_scale_w: float = 0.6000000238418579, sdp_ratio: float = 0.20000000298023224) -> None

Constructor of MeloTTS class

item	description
type	func
param	model: model path, default empty; you can load the model later with the load function. language: language code, default "zh"; supported language code: "zh". speed: controls the speech rate of the audio; lower values result in slower reading speed. Default is 0.8. noise_scale: controls the randomness in speech; increasing the value results in more varied and less deterministic speech output. Default is 0.3. noise_scale_w: controls the randomness in speech alignment; a higher value can enhance naturalness, but overly high values may introduce instability or distortion in the audio. Default is 0.6. sdp_ratio: the alignment weight; the higher it is, the more natural the speech sounds, but excessive values may result in instability. Default is 0.2.
throw	If model arg is not empty and load failed, will throw err::Exception.
static	False

C++ defination code:

MeloTTS(const string &model = "", std::string language = "zh", double speed = 0.8f, double noise_scale = 0.3f, double noise_scale_w = 0.6f, double sdp_ratio = 0.2f)

load

def load(self, model: str) -> maix.err.Err

Load model from file

item	description
type	func
param	model: path of the model to load
return	err::Err
static	False

C++ defination code:
err::Err load(const string &model)

forward

def forward(*args, **kwargs)

Text to speech

item	description
type	func
param	text: input text. path: output path of the voice file; the default sampling rate is 44100, the number of channels is 1, and the sample depth is 16 bits. Default is empty. output_pcm: enable or disable the output of raw PCM data; the default output sampling rate is 44100, the number of channels is 1, and the sample depth is 16 bits. Default is false.
return	raw PCM data
static	False

C++ defination code:

Bytes *forward(std::string text, std::string path = "", bool output_pcm = false)
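The PCM format stated above (44100 Hz, 1 channel, 16 bits) is enough to compute a clip's duration or wrap the data in a WAV container with Python's standard wave module. The sketch below works on plain bytes; pcm_duration and pcm_to_wav_bytes are hypothetical helpers, and converting the returned Bytes object into Python bytes is assumed to have been done already.

```python
import io
import wave

SAMPLE_RATE = 44100   # Hz, from the description above
CHANNELS = 1
SAMPLE_WIDTH = 2      # bytes per sample (16 bits)


def pcm_duration(pcm):
    """Length in seconds of raw PCM bytes with the format above."""
    return len(pcm) / (SAMPLE_RATE * CHANNELS * SAMPLE_WIDTH)


def pcm_to_wav_bytes(pcm):
    """Wrap raw PCM bytes in a WAV container and return the file bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)
    return buf.getvalue()


# One second of silence stands in for PCM data produced by forward().
silence = bytes(SAMPLE_RATE * CHANNELS * SAMPLE_WIDTH)
print(pcm_duration(silence), len(pcm_to_wav_bytes(silence)))
```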

samplerate

def samplerate(self) -> int

Get pcm samplerate

item	description
type	func
return	pcm samplerate
static	False

C++ defination code:
int samplerate()

speed

def speed(self) -> float

Get the speed of the text

item	description
type	func
return	text speed
static	False

C++ defination code:
double speed()

OCR_Box

Object for OCR detect box

C++ defination code:
class OCR_Box

init

def __init__(self, x1: int = 0, y1: int = 0, x2: int = 0, y2: int = 0, x3: int = 0, y3: int = 0, x4: int = 0, y4: int = 0) -> None

OCR_Box constructor

item	description
type	func
static	False

C++ defination code:

OCR_Box(int x1 = 0, int y1 = 0, int x2 = 0, int y2 = 0, int x3 = 0, int y3 = 0, int x4 = 0, int y4 = 0)

x1

X coordinate of the left-top point of the box

item	description
type	var
static	False
readonly	False

C++ defination code:
int x1

y1

Y coordinate of the left-top point of the box

item	description
type	var
static	False
readonly	False

C++ defination code:
int y1

x2

X coordinate of the right-top point of the box

item	description
type	var
static	False
readonly	False

C++ defination code:
int x2

y2

Y coordinate of the right-top point of the box

item	description
type	var
static	False
readonly	False

C++ defination code:
int y2

x3

X coordinate of the right-bottom point of the box

item	description
type	var
static	False
readonly	False

C++ defination code:
int x3

y3

Y coordinate of the right-bottom point of the box

item	description
type	var
static	False
readonly	False

C++ defination code:
int y3

x4

X coordinate of the left-bottom point of the box

item	description
type	var
static	False
readonly	False

C++ defination code:
int x4

y4

Y coordinate of the left-bottom point of the box

item	description
type	var
static	False
readonly	False

C++ defination code:
int y4

to_list

def to_list(self) -> list[int]

Convert the box points to a list.

item	description
type	func
return	list type, element is int type, value [x1, y1, x2, y2, x3, y3, x4, y4].
static	False

C++ defination code:
std::vector<int> to_list()
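The ordering returned by to_list can be mirrored with a small stand-in class. Box below is a hypothetical dataclass with the same eight fields as OCR_Box, not the real class; bounding_rect is an extra illustrative method that the real class is not documented to have.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Stand-in with the same eight fields as OCR_Box (not the real class)."""
    x1: int = 0
    y1: int = 0
    x2: int = 0
    y2: int = 0
    x3: int = 0
    y3: int = 0
    x4: int = 0
    y4: int = 0

    def to_list(self):
        # Documented order: left-top, right-top, right-bottom, left-bottom.
        return [self.x1, self.y1, self.x2, self.y2,
                self.x3, self.y3, self.x4, self.y4]

    def bounding_rect(self):
        """Axis-aligned (x, y, w, h) rectangle enclosing the four corners."""
        values = self.to_list()
        xs, ys = values[0::2], values[1::2]
        return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)


box = Box(10, 5, 50, 8, 48, 30, 8, 27)
print(box.to_list())
print(box.bounding_rect())
```

Because the four corners need not form an axis-aligned rectangle, an enclosing rectangle like this is handy when a drawing or cropping call expects x, y, w, h.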

OCR_Object

Object for OCR detect result

C++ defination code:
class OCR_Object

init

def __init__(self, box: OCR_Box, idx_list: list[int], char_list: list[str], score: float = 0, char_pos: list[int] = []) -> None

Constructor of Object for OCR detect result

item	description
type	func
param	score: score
static	False

C++ defination code:

OCR_Object(const nn::OCR_Box &box, const std::vector<int> &idx_list, const std::vector<std::string> &char_list, float score = 0, const std::vector<int> &char_pos = std::vector<int>())

box

OCR_Object box, 4 points box, first point at the left-top, clock-wise.

item	description
type	var
static	False
readonly	False

C++ defination code:
nn::OCR_Box box

score

Object score

item	description
type	var
static	False
readonly	False

C++ defination code:
float score

idx_list

chars' idx list, element is int type.

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<int> idx_list

char_pos

Chars' position relative to left

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<int> char_pos

char_str

def char_str(self) -> str

Get the OCR_Object's characters as a string.

item	description
type	func
return	All characters as a string.
static	False

C++ defination code:
const std::string &char_str()

char_list

def char_list(self) -> list[str]

Get the OCR_Object's characters as a list.

item	description
type	func
return	All characters as a list.
static	False

C++ defination code:
const std::vector<std::string> &char_list()

update_chars

def update_chars(self, char_list: list[str]) -> None

Set the OCR_Object's characters.

item	description
type	func
param	char_list: All characters as a list.
static	False

C++ defination code:

void update_chars(const std::vector<std::string> &char_list)

str

def __str__(self) -> str

OCR_Object info to string

item	description
type	func
return	OCR_Object info string
static	False

C++ defination code:
std::string to_str()

OCR_Objects

OCR_Objects Class for detect result

C++ defination code:
class OCR_Objects

init

def __init__(self) -> None

Constructor of OCR_Objects class

item	description
type	func
static	False

C++ defination code:
OCR_Objects()

add

def add(self, box: OCR_Box, idx_list: list[int], char_list: list[str], score: float = 0, char_pos: list[int] = []) -> OCR_Object

Add object to objects

item	description
type	func
throw	Throw exception if no memory
static	False

C++ defination code:

nn::OCR_Object &add(const nn::OCR_Box &box, const std::vector<int> &idx_list, const std::vector<std::string> &char_list, float score = 0, const std::vector<int> &char_pos = std::vector<int>())

remove

def remove(self, idx: int) -> maix.err.Err

Remove object from objects

item	description
type	func
static	False

C++ defination code:
err::Err remove(int idx)

at

def at(self, idx: int) -> OCR_Object

Get object item

item	description
type	func
static	False

C++ defination code:
nn::OCR_Object &at(int idx)

getitem

def __getitem__(self, idx: int) -> OCR_Object

Get object item

item	description
type	func
static	False

C++ defination code:
nn::OCR_Object &operator[](int idx)

len

def __len__(self) -> int

Get size

item	description
type	func
static	False

C++ defination code:
size_t size()

iter

def __iter__(self) -> typing.Iterator

Begin

item	description
type	func
static	False

C++ defination code:
std::vector<OCR_Object*>::iterator begin()
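
Because OCR_Objects supports iteration and each OCR_Object exposes score and char_str(), the recognized text can be collected in a short loop. This is a sketch only: the helper name collect_text and the 0.5 cut-off are assumptions for illustration, and the function accepts any iterable of objects with those two members.

```python
def collect_text(ocr_objects, min_score=0.5):
    """Return the text of every OCR_Object whose score is at least min_score."""
    lines = []
    for obj in ocr_objects:           # relies on OCR_Objects.__iter__
        if obj.score >= min_score:    # documented `score` attribute
            lines.append(obj.char_str())
    return lines
```

On a device, the argument would be the OCR_Objects value produced by an OCR model.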

MUD

MUD(model universal describe file) class

C++ defination code:
class MUD

init

def __init__(self, model_path: str = '') -> None

MUD constructor

item	description
type	func
param	model_path: direction [in], model file path; the model format can be a MUD (model universal describe) file. If model_path is set, the model is loaded from that file, and a failed load raises err.Exception. If model_path is not set, you can load the model later with the load function.
static	False

C++ defination code:
MUD(const std::string &model_path = "")

load

def load(self, model_path: str) -> maix.err.Err

Load model from file

item	description
type	func
param	model_path: direction [in], model file path, model format can be MUD(model universal describe file) file.
return	error code, if load success, return err::ERR_NONE
static	False

C++ defination code:

err::Err load(const std::string &model_path)

type

Model type, string type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::string type

items

Model config items, different model type has different config items

item	description
type	var
static	False
readonly	False

C++ defination code:

std::map<std::string, std::map<std::string, std::string>> items

model_path

Model path

item	description
type	var
static	False
readonly	False

C++ defination code:
std::string model_path

parse_labels

def parse_labels(self, key: str = 'labels') -> list[str]

Call load() first. Parses the labels in items["extra"]["labels"]: if items["extra"]["labels"] is a file path, the file is parsed and every line is one label; if it is a string, the labels are split by comma (","). Executing this method replaces items["extra"]["labels"].

item	description
type	func
param	key: parse from items[key], default "labels".
return	parsed labels list.
static	False

C++ defination code:

std::vector<std::string> parse_labels(const std::string key = "labels")
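
The two parsing rules described for parse_labels can be illustrated in plain Python. This is not the library's implementation; it is a sketch of the documented behavior, and the helper name parse_labels_value as well as the whitespace stripping and skipping of empty entries are assumptions.

```python
import os


def parse_labels_value(value):
    """A file path yields one label per line; any other string is split on commas."""
    if os.path.isfile(value):
        with open(value, encoding="utf-8") as f:
            return [line.strip() for line in f if line.strip()]
    return [label.strip() for label in value.split(",") if label.strip()]


print(parse_labels_value("cat, dog, bird"))  # ['cat', 'dog', 'bird']
```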

LayerInfo

NN model layer info

C++ defination code:
class LayerInfo

init

def __init__(self, name: str = '', dtype: maix.tensor.DType = ..., shape: list[int] = []) -> None

LayerInfo constructor

item	description
type	func
param	name: direction [in], layer name dtype: direction [in], layer data type shape: direction [in], layer shape
static	False

C++ defination code:

LayerInfo(const std::string &name =  "", tensor::DType dtype = tensor::DType::FLOAT32, std::vector<int> shape = std::vector<int>())

name

Layer name

item	description
type	var
static	False
readonly	False

C++ defination code:
std::string   name

dtype

Layer data type

item	description
type	var
attention	If the model is quantized, this is the real quantized data type, such as int8 or float16. In most scenarios the inputs and outputs actually used in APIs like forward are float32.
static	False
readonly	False

C++ defination code:
tensor::DType dtype

shape

Layer shape

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<int> shape

layout

Layer layout. If the layout is CHW, the shape is typically [N, C, H, W]; if HWC, it is [N, H, W, C].

item	description
type	var
static	False
readonly	False

C++ defination code:
nn::Layout layout = nn::Layout::UNKNOWN
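
To make the relation between shape and layout concrete, the sketch below labels each dimension of a 4-D shape. It uses plain strings "NCHW" and "NHWC" in place of the nn.Layout enum so that it runs without the maix package; the helper name name_dims is hypothetical.

```python
def name_dims(shape, layout):
    """Map each axis letter of a layout string to the matching dimension of `shape`."""
    if layout not in ("NCHW", "NHWC"):
        raise ValueError("layout must be 'NCHW' or 'NHWC'")
    if len(shape) != 4:
        raise ValueError("expected a 4-D shape")
    return dict(zip(layout, shape))


print(name_dims([1, 3, 224, 224], "NCHW"))  # {'N': 1, 'C': 3, 'H': 224, 'W': 224}
print(name_dims([1, 224, 224, 3], "NHWC"))  # {'N': 1, 'H': 224, 'W': 224, 'C': 3}
```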

shape_int

def shape_int(self) -> int

Shape as one int type, multiply all dims of shape

item	description
type	func
static	False

C++ defination code:
int shape_int()
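
The value described for shape_int is the product of all dimensions, which for a tensor is its element count. A plain-Python equivalent, with the hypothetical helper name shape_product, looks like this:

```python
import math


def shape_product(shape):
    """Multiply all dimensions of a shape list."""
    return math.prod(shape)


print(shape_product([1, 3, 224, 224]))  # 150528
```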

to_str

def to_str(self) -> str

To string

item	description
type	func
static	False

C++ defination code:
std::string to_str()

str

def __str__(self) -> str

To string

item	description
type	func
static	False

C++ defination code:
std::string __str__()

NN

Neural network class

C++ defination code:
class NN

init

def __init__(self, model: str = '', dual_buff: bool = False) -> None

Neural network constructor

item	description
type	func
param	model: direction [in], model file path; the model format can be a MUD (model universal describe) file. If model is set, the model is loaded from that file, and a failed load raises err.Exception. If model is not set, you can load the model later with the load function. dual_buff: direction [in], prepare dual input/output buffers to accelerate forward; that is, while the NPU is forwarding we do not wait and instead prepare the next input buffer. If you want every forward call to output the result of its own input, set this arg to false. Default false for ease of use.
static	False

C++ defination code:

NN(const std::string &model = "", bool dual_buff = false)

load

def load(self, model: str) -> maix.err.Err

Load model from file

item	description
type	func
param	model: direction [in], model file path, model format can be MUD(model universal describe file) file.
return	error code, if load success, return err::ERR_NONE
static	False

C++ defination code:
err::Err load(const std::string &model)

loaded

def loaded(self) -> bool

Is model loaded

item	description
type	func
return	true if model loaded, else false
static	False

C++ defination code:
bool loaded()

set_dual_buff

def set_dual_buff(self, enable: bool) -> None

Enable dual buff or disable dual buff

item	description
type	func
param	enable: true to enable, false to disable
static	False

C++ defination code:
void set_dual_buff(bool enable)

inputs_info

def inputs_info(self) -> list[LayerInfo]

Get model input layer info

item	description
type	func
return	input layer info
static	False

C++ defination code:
std::vector<nn::LayerInfo> inputs_info()

outputs_info

def outputs_info(self) -> list[LayerInfo]

Get model output layer info

item	description
type	func
return	output layer info
static	False

C++ defination code:
std::vector<nn::LayerInfo> outputs_info()

extra_info

def extra_info(self) -> dict[str, str]

Get model extra info define in MUD file

item	description
type	func
return	extra info, dict type, key-value object, attention: key and value are all string type.
static	False

C++ defination code:

std::map<std::string, std::string> extra_info()

extra_info_labels

def extra_info_labels(self) -> list[str]

Get model parsed extra info labels define in MUD file

item	description
type	func
return	labels list in extra info, string list type.
static	False

C++ defination code:

std::vector<std::string> extra_info_labels()

forward

def forward(self, inputs: maix.tensor.Tensors, copy_result: bool = True, dual_buff_wait: bool = False) -> maix.tensor.Tensors

Forward run the model and get its output. This is specially for MaixPy: not efficient, but easy to use in MaixPy.

item	description
type	func
param	inputs: direction [in], input tensors. copy_result: If set to true, the result is copied to a new variable; otherwise internal memory is used, which is only valid until the next forward. Default true to avoid problems; set it to false manually for more speed. dual_buff_wait: bool type, only for dual_buff mode; if true, will run inference on this input and wait for the result, default false.
return	output tensor. In C++, you must manually delete the returned value and the tensors in it. In dual_buff mode, it can be NULL (None in MaixPy), meaning the result is not ready.
throw	If an error occurs, like no memory or an arg error, will raise err.Exception.
static	False

C++ defination code:

tensor::Tensors *forward(tensor::Tensors &inputs, bool copy_result = true, bool dual_buff_wait = false)

forward_image

def forward_image(self, img: maix.image.Image, mean: list[float] = [], scale: list[float] = [], fit: maix.image.Fit = ..., copy_result: bool = True, dual_buff_wait: bool = False, chw: bool = True) -> maix.tensor.Tensors

forward model, param is image

item	description
type	func
param	img: input image. mean: mean value, a list type, e.g. [0.485, 0.456, 0.406]; the default empty list means no normalization. scale: scale value, a list type, e.g. [1/0.229, 1/0.224, 1/0.225]; the default empty list means no normalization. fit: fit mode; if the input image size does not equal the model's input size, the image is automatically resized with this fit method. Default is image.Fit.FIT_FILL for easy coordinate calculation, but image.Fit.FIT_CONTAIN gives more accurate results. copy_result: If set to true, the result is copied to a new variable; otherwise internal memory is used, which is only valid until the next forward. Default true to avoid problems; set it to false manually for more speed. dual_buff_wait: bool type, only for dual_buff mode; if true, will run inference on this image and wait for the result, default false. chw: !!deprecated!! This arg is ignored; set extra.input_layout in the MUD file instead. Formerly the channel format: if set to false, the model was forwarded with HWC-format image input; default true (CHW).
return	output tensor. In C++, you must manually delete the returned value and the tensors in it. In dual_buff mode, it can be NULL (None in MaixPy), meaning the result is not ready.
throw	If error occurs, like arg error or alloc memory failed, will raise err.Exception.
static	False

C++ defination code:

tensor::Tensors *forward_image(image::Image &img, std::vector<float> mean = std::vector<float>(), std::vector<float> scale = std::vector<float>(), image::Fit fit = image::Fit::FIT_FILL, bool copy_result = true, bool dual_buff_wait = false, bool chw = true)
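
The mean and scale lists hold one value per channel. The text above does not spell out the formula, so the sketch below assumes the common convention (value - mean) * scale, which matches the example values where each scale is the reciprocal of a standard deviation. The helper name normalize_pixel is hypothetical, and skipping normalization when either list is empty is a simplification of this sketch.

```python
def normalize_pixel(pixel, mean, scale):
    """Apply (value - mean) * scale per channel (assumed convention)."""
    if not mean or not scale:
        # Empty lists are documented as "no normalization".
        return list(pixel)
    return [(v - m) * s for v, m, s in zip(pixel, mean, scale)]


mean = [0.485, 0.456, 0.406]
scale = [1 / 0.229, 1 / 0.224, 1 / 0.225]
print(normalize_pixel([0.485, 0.456, 0.406], mean, scale))  # [0.0, 0.0, 0.0]
print(normalize_pixel([0.2, 0.4, 0.6], [], []))             # [0.2, 0.4, 0.6]
```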

FaceDetector

FaceDetector class

C++ defination code:
class FaceDetector

init

def __init__(self, model: str = '', dual_buff: bool = True) -> None

Constructor of FaceDetector class

item	description
type	func
param	model: model path, default empty; you can load the model later with the load function. dual_buff: direction [in], prepare dual input/output buffers to accelerate forward; that is, while the NPU is forwarding we do not wait and instead prepare the next input buffer. If you want every forward call to output the result of its own input, set this arg to false. Default true to ensure speed.
throw	If model arg is not empty and load failed, will throw err::Exception.
static	False

C++ defination code:

FaceDetector(const string &model = "", bool dual_buff = true)

load

def load(self, model: str) -> maix.err.Err

Load model from file

item	description
type	func
param	model: Path of the model to load
return	err::Err
static	False

C++ defination code:
err::Err load(const string &model)

detect

def detect(self, img: maix.image.Image, conf_th: float = 0.5, iou_th: float = 0.45, fit: maix.image.Fit = ...) -> list[Object]

Detect objects from image

item	description
type	func
param	img: Image to detect; if the image's size does not match the model input's, it is automatically resized with the fit method. conf_th: Confidence threshold, default 0.5. iou_th: IoU threshold, default 0.45. fit: Resize method, default image.Fit.FIT_CONTAIN.
throw	If the image format does not match the model input format, will throw err::Exception.
return	Object list. In C++, you should delete it after use.
static	False

C++ defination code:

std::vector<nn::Object> *detect(image::Image &img, float conf_th = 0.5, float iou_th = 0.45, maix::image::Fit fit = maix::image::FIT_CONTAIN)
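
A typical way to consume the returned Object list is to pull out each box and score. The helper below is a sketch: the name face_boxes is hypothetical, and it works with any detector object that exposes the detect signature documented above and returns objects with x, y, w, h and score attributes.

```python
def face_boxes(detector, img, conf_th=0.5, iou_th=0.45):
    """Run detector.detect and return a list of (x, y, w, h, score) tuples."""
    objs = detector.detect(img, conf_th=conf_th, iou_th=iou_th)
    return [(o.x, o.y, o.w, o.h, o.score) for o in objs]
```

On a device, detector would be an nn.FaceDetector loaded with a model and img a maix.image.Image in the model's input format.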

input_size

def input_size(self) -> maix.image.Size

Get model input size

item	description
type	func
return	model input size
static	False

C++ defination code:
image::Size input_size()

input_width

def input_width(self) -> int

Get model input width

item	description
type	func
return	model input size of width
static	False

C++ defination code:
int input_width()

input_height

def input_height(self) -> int

Get model input height

item	description
type	func
return	model input size of height
static	False

C++ defination code:
int input_height()

input_format

def input_format(self) -> maix.image.Format

Get input image format

item	description
type	func
return	input image format, image::Format type.
static	False

C++ defination code:
image::Format input_format()

mean

Get mean value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> mean

scale

Get scale value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> scale

NanoTrack

NanoTrack class

C++ defination code:
class NanoTrack

init

def __init__(self, model: str = '') -> None

Constructor of NanoTrack class

item	description
type	func
param	model: model path, default empty, you can load model later by load function.
throw	If model arg is not empty and load failed, will throw err::Exception.
static	False

C++ defination code:
NanoTrack(const string &model = "")

load

def load(self, model: str) -> maix.err.Err

Load model from file

item	description
type	func
param	model: Path of the model to load
return	err::Err
static	False

C++ defination code:
err::Err load(const string &model)

init

def init(self, img: maix.image.Image, x: int, y: int, w: int, h: int) -> None

Init the tracker: give the tracker the first target image and the target position.

item	description
type	func
param	img: Image to detect; the target should be in this image. x: the target position left top coordinate x. y: the target position left top coordinate y. w: the target width. h: the target height.
throw	If the image format does not match the model input format, will throw err::Exception.
static	False

C++ defination code:

void init(image::Image &img, int x, int y, int w, int h)

track

def track(self, img: maix.image.Image, threshold: float = 0.9) -> Object

Track the object according to the last object position and the target feature learned by the init function.

item	description
type	func
param	img: image to detect the object in and track; can be any resolution. Before detection, an area is cropped according to the target's last position. threshold: If score < threshold, the new detection is considered invalid but is still returned, default 0.9.
return	The tracked object, with position and score; the detect area is stored in the first 4 elements of points (x, y, w, h, center_x, center_y, input_size, target_size).
static	False

C++ defination code:

nn::Object track(image::Image &img, float threshold = 0.9)
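
Since track still returns an object when the score falls below the threshold, the caller has to check the score itself. A minimal sketch follows, assuming a tracker on which init has already been called with the first frame; the helper name track_once is hypothetical.

```python
def track_once(tracker, img, threshold=0.9):
    """Track the target in one frame and report whether the result is trustworthy."""
    obj = tracker.track(img, threshold=threshold)
    # A score below the threshold is documented as an invalid detection
    # that is returned anyway, so the check is repeated here.
    is_valid = obj.score >= threshold
    return obj, is_valid
```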

input_size

def input_size(self) -> maix.image.Size

Get model input size

item	description
type	func
return	model input size
static	False

C++ defination code:
image::Size input_size()

input_width

def input_width(self) -> int

Get model input width

item	description
type	func
return	model input size of width
static	False

C++ defination code:
int input_width()

input_height

def input_height(self) -> int

Get model input height

item	description
type	func
return	model input size of height
static	False

C++ defination code:
int input_height()

input_format

def input_format(self) -> maix.image.Format

Get input image format

item	description
type	func
return	input image format, image::Format type.
static	False

C++ defination code:
image::Format input_format()

mean

Get mean value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> mean

scale

Get scale value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> scale

Speech

C++ defination code:
class Speech

init

def __init__(self, model: str = '') -> None

Construct a new Speech object

item	description
type	func
param	model: model path, default empty, you can load model later by load function.
throw	If model arg is not empty and load failed, will throw err::Exception.
static	False

C++ defination code:
Speech(const string &model = "")

load

def load(self, model: str) -> maix.err.Err

Load model from file

item	description
type	func
param	model: Path of the model to load
return	err::Err
static	False

C++ defination code:
err::Err load(const string &model)

init

def init(self, dev_type: SpeechDevice, device_name: str = '') -> maix.err.Err

Init the ASR library and select the type and name of the audio device.

item	description
type	func
param	dev_type: device type to use; can be WAV, PCM, or MIC. device_name: device name to use; can be a WAV file, a PCM file, or a MIC device name.
throw	1. If am model is not loaded, will throw err::ERR_NOT_IMPL. 2. If device is not supported, will throw err::ERR_NOT_IMPL.
return	err::Err type, if init success, return err::ERR_NONE
static	False

C++ defination code:

err::Err init(nn::SpeechDevice dev_type, const string &device_name = "")

devive

def devive(self, dev_type: SpeechDevice, device_name: str) -> maix.err.Err

Reset the device, usually used for PCM/WAV recognition, such as identifying the next WAV file.

item	description
type	func
param	dev_type: device type to use; can be WAV, PCM, or MIC. device_name: device name to use; can be a WAV file, a PCM file, or a MIC device name.
throw	If device is not supported, will throw err::ERR_NOT_IMPL.
return	err::Err type, if init success, return err::ERR_NONE
static	False

C++ defination code:

err::Err devive(nn::SpeechDevice dev_type, const string &device_name)

dec_deinit

def dec_deinit(self, decoder: SpeechDecoder) -> None

Deinit the decoder.

item	description
type	func
param	decoder: decoder type to deinit; can be DECODER_RAW, DECODER_DIG, DECODER_LVCSR, DECODER_KWS or DECODER_ALL.
throw	If device is not supported, will throw err::ERR_NOT_IMPL.
static	False

C++ defination code:
void dec_deinit(nn::SpeechDecoder decoder)

raw

def raw(self, callback: typing.Callable[[list[tuple[int, float]], int], None]) -> maix.err.Err

Init raw decoder, it will output the prediction results of the original AM.

item	description
type	func
param	callback: raw decoder user callback.
return	err::Err type, if init success, return err::ERR_NONE
static	False

C++ defination code:

err::Err raw(std::function<void(std::vector<std::pair<int, float>>, int)> callback)

digit

def digit(self, blank: int, callback: typing.Callable[[str, int], None]) -> maix.err.Err

Init digit decoder, it will output the Chinese digit recognition results within the last 4 seconds.

item	description
type	func
param	blank: If it exceeds this value, insert a '_' in the output result to indicate idle mute. callback: digit decoder user callback.
return	err::Err type, if init success, return err::ERR_NONE
static	False

C++ defination code:

err::Err digit(int blank, std::function<void(char*, int)> callback)

kws

def kws(self, kw_tbl: list[str], kw_gate: list[float], callback: typing.Callable[[list[float], int], None], auto_similar: bool = True) -> maix.err.Err

Init kws decoder; it outputs a probability list of all registered keywords in the latest frame. Users can set their own thresholds for wake-up.

item	description
type	func
param	kw_tbl: Keyword list; each keyword is written as pinyin separated by spaces, for example: xiao3 ai4 tong2 xue2. kw_gate: keyword probability gate table; its length should be the same as kw_tbl. callback: kws decoder user callback. auto_similar: Whether to perform automatic homophone processing; setting it to true automatically calculates the probability by treating pinyin with different tones as homophones.
return	err::Err type, if init success, return err::ERR_NONE
static	False

C++ defination code:

err::Err kws(std::vector<string> kw_tbl, std::vector<float> kw_gate, std::function<void(std::vector<float>, int)> callback, bool auto_similar = true)
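
The kws callback receives one probability per registered keyword, so wake-up logic reduces to comparing each probability with its gate. The sketch below runs without a device: the second keyword, the gate values, and the helper names are made up for illustration, and the meaning of the callback's int argument is not specified above, so it is ignored here.

```python
def triggered_keywords(probs, kw_tbl, kw_gate):
    """Return the keywords whose probability reaches their own gate."""
    return [kw for kw, p, gate in zip(kw_tbl, probs, kw_gate) if p >= gate]


# Hypothetical keyword table and gates, for illustration only.
KW_TBL = ["xiao3 ai4 tong2 xue2", "ni3 hao3"]
KW_GATE = [0.1, 0.1]
hits = []


def on_kws(probs, _n):
    """Callback with the documented shape (list[float], int)."""
    hits.extend(triggered_keywords(probs, KW_TBL, KW_GATE))


# On a device this would be registered with speech.kws(KW_TBL, KW_GATE, on_kws, True);
# here the callback is called by hand with made-up probabilities.
on_kws([0.05, 0.6], 2)
print(hits)  # ['ni3 hao3']
```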

lvcsr

def lvcsr(self, sfst_name: str, sym_name: str, phones_txt: str, words_txt: str, callback: typing.Callable[[tuple[str, str], int], None], beam: float = 8, bg_prob: float = 10, scale: float = 0.5, mmap: bool = False) -> maix.err.Err

Init lvcsr decoder, it will output continuous speech recognition results (less than 1024 Chinese characters).

item	description
type	func
param	sfst_name: Sfst file path. sym_name: Sym file path (output symbol table). phones_txt: Path to phones.bin (pinyin table). words_txt: Path to words.bin (dictionary table). callback: lvcsr decoder user callback. beam: The beam size for WFST search, 8 by default; a value between 3 and 9 is recommended. The larger the beam, the larger the search space, and the more accurate but slower the search. bg_prob: The absolute value of the natural logarithm of the default probability for background pinyin outside of BEAM-CNT, 10 by default. scale: acoustics_cost = log(pny_prob) * scale. mmap: use mmap to load the WFST decoding image; if set to true, the beam should be less than 5.
return	err::Err type, if init success, return err::ERR_NONE
static	False

C++ defination code:

err::Err lvcsr(const string &sfst_name, const string &sym_name,
                       const string &phones_txt, const string &words_txt, 
                       std::function<void(std::pair<char*, char*>, int)> callback,
                       float beam = 8, float bg_prob = 10, float scale = 0.5, bool mmap = false)

run

def run(self, frame: int) -> int

Run speech recognition. The user can run 1 frame at a time and do other processing after each run, or run continuously within a thread and stop it from an external thread.

item	description
type	func
param	frame: The number of frames per run.
return	int type, return actual number of frames in the run.
static	False

C++ defination code:
int run(int frame)
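
The frame-at-a-time pattern can be written as a small loop. This sketch assumes that a return value below 1 means no frame could be processed (for example, a WAV or PCM source has been consumed); the helper name run_until_exhausted and the max_steps safety limit are hypothetical.

```python
def run_until_exhausted(speech, max_steps=1000):
    """Run one frame per iteration until run() reports no frame or max_steps is hit."""
    steps = 0
    while steps < max_steps:
        frames = speech.run(1)   # documented: returns the actual number of frames run
        if frames < 1:
            break                # assumed to mean the input is exhausted
        steps += 1
        # Other per-frame processing can go here.
    return steps
```

Results are delivered through the decoder callbacks registered earlier (raw, digit, kws or lvcsr), not through the return value of run.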

clear

def clear(self) -> None

Reset internal cache operation

item	description
type	func
static	False

C++ defination code:
void clear()

frame_time

def frame_time(self) -> int

Get the time of one frame.

item	description
type	func
return	int type, return the time of one frame.
static	False

C++ defination code:
int frame_time()

similar

def similar(self, pny: str, similar_pnys: list[str]) -> maix.err.Err

Manually register mute words; each pinyin can register up to 10 homophones. Note that registering homophones through this interface overwrites the homophone table automatically generated by the "automatic homophone processing" feature.

item	description
type	func
param	pny: the pinyin to register homophones for. similar_pnys: list of homophone pinyin for pny, up to 10.
return	err::Err type, if init success, return err::ERR_NONE
static	False

C++ defination code:

err::Err similar(const string &pny, std::vector<std::string> similar_pnys)

skip_frames

def skip_frames(self, num: int) -> None

Run some frames and drop them; this can be used to avoid incorrect recognition results when switching decoders.

item	description
type	func
param	num: number of frames to run and drop
static	False

C++ defination code:
void skip_frames(int num)

mean

Get mean value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> mean

scale

Get scale value, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> scale

dev_type

def dev_type(self) -> SpeechDevice

get device type

item	description
type	func
return	nn::SpeechDevice type, see SpeechDevice of this module
static	False

C++ defination code:
nn::SpeechDevice dev_type()

FaceObject

Face object

C++ defination code:
class FaceObject

init

def __init__(self, x: int = 0, y: int = 0, w: int = 0, h: int = 0, class_id: int = 0, score: float = 0, points: list[int] = [], feature: list[float] = [], face: maix.image.Image = ...) -> None

Constructor

item	description
type	func
static	False

C++ defination code:

FaceObject(int x = 0, int y = 0, int w = 0, int h = 0, int class_id = 0, float score = 0, std::vector<int> points = std::vector<int>(), std::vector<float> feature = std::vector<float>(), image::Image face = image::Image())

str

def __str__(self) -> str

FaceObject info to string

item	description
type	func
return	FaceObject info string
static	False

C++ defination code:
std::string to_str()

x

FaceObject left top coordinate x

item	description
type	var
static	False
readonly	False

C++ defination code:
int x

y

FaceObject left top coordinate y

item	description
type	var
static	False
readonly	False

C++ defination code:
int y

w

FaceObject width

item	description
type	var
static	False
readonly	False

C++ defination code:
int w

h

FaceObject height

item	description
type	var
static	False
readonly	False

C++ defination code:
int h

class_id

FaceObject class id

item	description
type	var
static	False
readonly	False

C++ defination code:
int class_id

score

FaceObject score

item	description
type	var
static	False
readonly	False

C++ defination code:
float score

points

keypoints

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<int> points

feature

feature, float list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> feature

face

face image

item	description
type	var
static	False
readonly	False

C++ defination code:
image::Image face

FaceObjects

Objects Class for detect result

C++ defination code:
class FaceObjects

init

def __init__(self) -> None

Constructor of FaceObjects class

item	description
type	func
static	False

C++ defination code:
FaceObjects()

add

def add(self, x: int = 0, y: int = 0, w: int = 0, h: int = 0, class_id: int = 0, score: float = 0, points: list[int] = [], feature: list[float] = [], face: maix.image.Image = ...) -> FaceObject

Add object to FaceObjects

item	description
type	func
throw	Throw exception if no memory
static	False

C++ defination code:

nn::FaceObject &add(int x = 0, int y = 0, int w = 0, int h = 0, int class_id = 0, float score = 0, std::vector<int> points = std::vector<int>(), std::vector<float> feature = std::vector<float>(), image::Image face = image::Image())

remove

def remove(self, idx: int) -> maix.err.Err

Remove object from FaceObjects

item	description
type	func
static	False

C++ defination code:
err::Err remove(int idx)

at

def at(self, idx: int) -> FaceObject

Get object item

item	description
type	func
static	False

C++ defination code:
nn::FaceObject &at(int idx)

getitem

def __getitem__(self, idx: int) -> FaceObject

Get object item

item	description
type	func
static	False

C++ defination code:
nn::FaceObject &operator[](int idx)

len

def __len__(self) -> int

Get size

item	description
type	func
static	False

C++ defination code:
size_t size()

iter

def __iter__(self) -> typing.Iterator

Begin

item	description
type	func
static	False

C++ defination code:
std::vector<FaceObject*>::iterator begin()

FaceRecognizer

FaceRecognizer class

C++ defination code:
class FaceRecognizer

init

def __init__(self, detect_model: str = '', feature_model: str = '', dual_buff: bool = True) -> None

Constructor of FaceRecognizer class

item	description
type	func
param	detect_model: face detect model path, default empty; you can load the model later with the load function. feature_model: feature extract model. dual_buff: direction [in], prepare dual input/output buffers to accelerate forward; that is, while the NPU is forwarding we do not wait and instead prepare the next input buffer. If you want every forward call to output the result of its own input, set this arg to false. Default true to ensure speed.
throw	If model arg is not empty and load failed, will throw err::Exception.
static	False

C++ defination code:

FaceRecognizer(const string &detect_model = "", const string &feature_model = "", bool dual_buff = true)

load

def load(self, detect_model: str, feature_model: str) -> maix.err.Err

Load model from file

item	description
type	func
param	detect_model: face detect model path, default empty, you can load model later by load function. feature_model: feature extract model
return	err::Err
static	False

C++ defination code:

err::Err load(const string &detect_model, const string &feature_model)

recognize

def recognize(self, img: maix.image.Image, conf_th: float = 0.5, iou_th: float = 0.45, compare_th: float = 0.8, get_feature: bool = False, get_face: bool = False, fit: maix.image.Fit = ...) -> FaceObjects

Detect objects from image

item	description
type	func
param	img: Image to detect; if the image's size does not match the model input's, it is automatically resized with the fit method. conf_th: Detect confidence threshold, default 0.5. iou_th: Detect IoU threshold, default 0.45. compare_th: Threshold for the score when comparing two faces, default 0.8; if the score of two faces is < this value, the face is treated as unknown. get_feature: whether to return the feature; if true, features are copied to the result; if false, they are not copied, to save time and memory. get_face: whether to return the face image; if true, the result object's face attribute is valid, otherwise the face attribute is empty. Getting the face image allocates memory and copies the image, so it is slower. fit: Resize method, default image.Fit.FIT_CONTAIN.
throw	If the image format does not match the model input format, will throw err::Exception.
return	FaceObjects object. In C++, you should delete it after use.
static	False

C++ defination code:

nn::FaceObjects *recognize(image::Image &img, float conf_th = 0.5, float iou_th = 0.45, float compare_th = 0.8, bool get_feature = false, bool get_face = false, maix::image::Fit fit = maix::image::FIT_CONTAIN)

add_face

def add_face(self, face: FaceObject, label: str) -> maix.err.Err

Add face to lib

item	description
type	func
param	face: face object, find by recognize label: face label(name)
static	False

C++ defination code:

err::Err add_face(nn::FaceObject *face, const std::string &label)
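
Enrolling a face combines recognize and add_face. The sketch below picks the highest-scoring face in a frame and registers it under a label. The helper name enroll_best_face is hypothetical, and requesting features with get_feature=True rests on the assumption that add_face needs the feature vector of the face it stores.

```python
def enroll_best_face(recognizer, img, label):
    """Add the highest-scoring face found in img to the face library.

    Returns the value from add_face, or None when no face was found.
    """
    faces = recognizer.recognize(img, get_feature=True)
    if len(faces) == 0:
        return None
    best = max(faces, key=lambda f: f.score)   # iterates via FaceObjects.__iter__
    return recognizer.add_face(best, label)
```

After enrolling, save_faces can persist the library to a file so that load_faces restores it on the next start.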

remove_face

def remove_face(self, idx: int = -1, label: str = '') -> maix.err.Err

remove face from lib

item	description
type	func
param	idx: index of the face in lib, value range [0, face_num); default -1 means use label. One of idx and label must be given; idx has higher priority. label: which face to remove; the default empty string means use idx. One of idx and label must be given; idx has higher priority.
static	False

C++ defination code:

err::Err remove_face(int idx = -1, const std::string &label = "")
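
Since idx takes priority when both arguments are given, passing exactly one of them keeps the call unambiguous. A small sketch, with forget_face as a hypothetical helper name and recognizer as an instance of the class documented here:

```python
def forget_face(recognizer, who):
    """Remove one face, addressed either by index (int) or by label (str).

    Passes exactly one of idx/label to remove_face and returns its result.
    """
    if isinstance(who, int) and not isinstance(who, bool) and who >= 0:
        return recognizer.remove_face(idx=who)
    if isinstance(who, str) and who:
        return recognizer.remove_face(label=who)
    raise ValueError("expected a non-negative index or a non-empty label")
```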

save_faces

def save_faces(self, path: str) -> maix.err.Err

Save faces info to a file

item	description
type	func
param	path: where to save, string type.
return	err.Err type
static	False

C++ defination code:

err::Err save_faces(const std::string &path)

load_faces

def load_faces(self, path: str) -> maix.err.Err

Load faces info from a file

item	description
type	func
param	path: from where to load, string type.
return	err::Err type
static	False

C++ defination code:

err::Err load_faces(const std::string &path)
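
A sketch of how add_face, save_faces and load_faces might be combined so that enrolled faces survive a restart. The helper names are hypothetical. Passing get_feature=True assumes that add_face needs the feature vector, and indexing the first result assumes FaceObjects is iterable; this section states neither.

```python
import os

def load_library_if_present(recognizer, path):
    """Call load_faces(path) only when the file exists; return True if called."""
    if not os.path.exists(path):
        return False
    recognizer.load_faces(path)
    return True

def enroll_first_face(recognizer, img, label, path):
    """Add the first detected face to the lib under `label`, then save the lib.

    Returns the value of save_faces(), or None when no face was found.
    """
    # Assumption: add_face needs the feature, so ask recognize() to copy it.
    faces = list(recognizer.recognize(img, get_feature=True))
    if not faces:
        return None
    recognizer.add_face(faces[0], label)
    return recognizer.save_faces(path)
```

The return codes of load_faces and add_face are ignored here for brevity; real code would compare them with err.Err.ERR_NONE.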

input_size

def input_size(self) -> maix.image.Size

Get model input size

item	description
type	func
return	model input size
static	False

C++ defination code:
image::Size input_size()

input_width

def input_width(self) -> int

Get model input width

item	description
type	func
return	model input size of width
static	False

C++ defination code:
int input_width()

input_height

def input_height(self) -> int

Get model input height

item	description
type	func
return	model input size of height
static	False

C++ defination code:
int input_height()

input_format

def input_format(self) -> maix.image.Format

Get input image format

item	description
type	func
return	input image format, image::Format type.
static	False

C++ defination code:
image::Format input_format()

mean_detector

Get mean value of the detector model, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> mean_detector

scale_detector

Get scale value of the detector model, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> scale_detector

mean_feature

Get mean value of the feature model, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> mean_feature

scale_feature

Get scale value of the feature model, list type

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<float> scale_feature
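
This section only says that the mean and scale values are lists of floats. The sketch below shows one common way such a pair is applied to input pixels, per channel as (value - mean) * scale. That formula is an assumption here, and the numbers are illustrative, not read from any model.

```python
def normalize_pixel(rgb, mean, scale):
    """Apply per-channel (value - mean) * scale to one pixel.

    The formula is an assumption about how mean/scale lists are used;
    this section only documents them as lists of floats.
    """
    if not (len(rgb) == len(mean) == len(scale)):
        raise ValueError("rgb, mean and scale must have the same length")
    return [(v - m) * s for v, m, s in zip(rgb, mean, scale)]

# Illustrative values only: maps the 0..255 range to roughly -1..1.
pixel = normalize_pixel([255, 128, 0], mean=[127.5] * 3, scale=[1 / 127.5] * 3)
```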

labels

labels, list type, first is "unknown"

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<std::string> labels

features

item	description
type	var
static	False
readonly	False

C++ defination code:
std::vector<std::vector<float>> features

QwenResp

Qwen model response

C++ defination code:
class QwenResp

msg

Model response full message.

item	description
type	var
static	False
readonly	False

C++ defination code:
std::string msg

msg_new

Model response new message.

item	description
type	var
static	False
readonly	False

C++ defination code:
std::string msg_new

err_code

Model response error code, maix.err.Err type, should be err.Err.ERR_NONE if no error.

item	description
type	var
static	False
readonly	False

C++ defination code:
err::Err err_code

err_msg

Model response error message.

item	description
type	var
static	False
readonly	False

C++ defination code:
std::string err_msg

QwenPostConfig

Qwen model post config

C++ defination code:
class QwenPostConfig

enable_temperature

Enable temperature sampling

item	description
type	var
static	False
readonly	False

C++ defination code:
bool enable_temperature

temperature

Temperature sampling value

item	description
type	var
static	False
readonly	False

C++ defination code:
float temperature

enable_repetition_penalty

Enable repetition penalty

item	description
type	var
static	False
readonly	False

C++ defination code:
bool enable_repetition_penalty

repetition_penalty

Repetition penalty value

item	description
type	var
static	False
readonly	False

C++ defination code:
float repetition_penalty

penalty_window

Repetition penalty window

item	description
type	var
static	False
readonly	False

C++ defination code:
int penalty_window

enable_top_p_sampling

Enable top p sampling

item	description
type	var
static	False
readonly	False

C++ defination code:
bool enable_top_p_sampling

top_p

Top p sampling value

item	description
type	var
static	False
readonly	False

C++ defination code:
float top_p

enable_top_k_sampling

Enable top k sampling

item	description
type	var
static	False
readonly	False

C++ defination code:
bool enable_top_k_sampling

top_k

Top k sampling value

item	description
type	var
static	False
readonly	False

C++ defination code:
int top_k
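
To make the temperature, top_k and top_p fields concrete, here is a generic textbook sketch of how these three settings reshape a next-token distribution. It is not the library's implementation; the function name and the order of the filters are assumptions, and the repetition penalty is left out.

```python
import math

def filter_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_index: probability} after temperature, top-k and top-p.

    Generic illustration only, not the library's implementation.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Temperature: divide logits before softmax; <1 sharpens, >1 flattens.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract max for stability
    total = sum(exps)
    # Sort candidates from most to least likely.
    probs = sorted(((p / total, i) for i, p in enumerate(exps)), reverse=True)
    # Top-k: keep only the k most likely tokens (0 disables the filter).
    if top_k > 0:
        probs = probs[:top_k]
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for p, i in probs:
        kept.append((p, i))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalise the surviving probabilities so they sum to 1.
    norm = sum(p for p, _ in kept)
    return {i: p / norm for p, i in kept}
```

A token would then be drawn at random from the returned distribution. Lower temperature and smaller top_k or top_p all make the output more deterministic.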

Qwen

Qwen model

C++ defination code:
class Qwen

init

def __init__(self, model: str) -> None

Qwen constructor

item	description
type	func
param	model: direction [in], model file path, model format can be MUD(model universal describe file) file. If model is set, will load model from file, load failed will raise err.Exception. If model is not set, you can load model later by load function.
static	False

C++ defination code:
Qwen(const std::string &model)

load

def load(self, model: str) -> maix.err.Err

Load model from file

item	description
type	func
param	model: direction [in], model file path, model format can be MUD(model universal describe file) file.
return	error code, if load success, return err::ERR_NONE
static	False

C++ defination code:
err::Err load(const std::string &model)

unload

def unload(self) -> maix.err.Err

Unload model

item	description
type	func
return	error code, if unload success, return err::ERR_NONE
static	False

C++ defination code:
err::Err unload()

loaded

def loaded(self) -> bool

Is model loaded

item	description
type	func
return	true if model loaded, else false
static	False

C++ defination code:
bool loaded()

set_system_prompt

def set_system_prompt(self, prompt: str) -> None

Set system prompt, will auto call clear_context.

item	description
type	func
param	prompt: system prompt
static	False

C++ defination code:

void set_system_prompt(const std::string &prompt)

get_system_prompt

def get_system_prompt(self) -> str

Get system prompt

item	description
type	func
return	system prompt
static	False

C++ defination code:
std::string get_system_prompt()

set_log_level

def set_log_level(self, level: ..., color: bool) -> None

Set log level

item	description
type	func
param	level: log level, @see maix.log.LogLevel color: true to enable color, false to disable color
static	False

C++ defination code:

void set_log_level(log::LogLevel level, bool color)

set_reply_callback

def set_reply_callback(self, callback: typing.Callable[[Qwen, QwenResp], None] = None) -> None

Set reply callback

item	description
type	func
param	callback: reply callback, called each time a token (word) is generated, so you can get the response message in real time in this callback function. If set to None (nullptr in C++), you get the response only after the whole response message has been generated.
static	False

C++ defination code:

void set_reply_callback(std::function<void(nn::Qwen &, const nn::QwenResp &)> callback = nullptr)
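
A sketch of a streaming callback that matches the documented (Qwen, QwenResp) signature and reads the documented msg_new field. The factory name make_stream_printer and the chunks list are hypothetical.

```python
def make_stream_printer(chunks):
    """Build a reply callback that prints and records each new piece of text.

    `chunks` is a caller-supplied list; the returned function takes
    (model, resp) and reads `resp.msg_new`.
    """
    def on_reply(model, resp):
        # Print without a newline so the reply appears as one flowing text.
        print(resp.msg_new, end="", flush=True)
        chunks.append(resp.msg_new)
    return on_reply
```

It would be registered with set_reply_callback(make_stream_printer(parts)) before calling send. Passing None afterwards restores the non-streaming behaviour.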

get_reply_callback

def get_reply_callback(self) -> typing.Callable[[Qwen, QwenResp], None]

Get reply callback

item	description
type	func
return	reply callback
static	False

C++ defination code:

std::function<void(nn::Qwen &, const nn::QwenResp &)> get_reply_callback()

send

def send(self, msg: str) -> QwenResp

Send message to model

item	description
type	func
param	msg: message to send
return	model response
static	False

C++ defination code:
nn::QwenResp send(const std::string &msg)
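
A sketch of one request/response round trip with error checking, using only the methods and QwenResp fields documented in this section. The helper name ask is hypothetical. The expected success code is passed in as ok_code (intended to be maix.err.Err.ERR_NONE) so that the sketch itself does not import maix.

```python
def ask(model, question, ok_code, system_prompt=None):
    """Send one message and return the full reply text, raising on error.

    `ok_code` should be maix.err.Err.ERR_NONE; it is a parameter so this
    sketch has no dependency on the maix package.
    """
    if not model.loaded():
        raise RuntimeError("model is not loaded")
    if system_prompt is not None and system_prompt != model.get_system_prompt():
        # Documented side effect: setting the prompt also clears the context.
        model.set_system_prompt(system_prompt)
    resp = model.send(question)
    if resp.err_code != ok_code:
        raise RuntimeError(resp.err_msg)
    return resp.msg
```

The prompt is only set when it differs from the current one, because set_system_prompt clears the conversation context each time it is called.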

cancel

def cancel(self) -> None

Cancel running

item	description
type	func
static	False

C++ defination code:
void cancel()

clear_context

def clear_context(self) -> maix.err.Err

Clear context

item	description
type	func
return	error code, if clear success, return err::ERR_NONE
static	False

C++ defination code:
err::Err clear_context()

version

def version(self) -> str

Get model version

item	description
type	func
return	model version
static	False

C++ defination code:
std::string version()

post_config

Qwen post config, default will read config from model mud file, you can also set it manually here.

item	description
type	var
static	False
readonly	False

C++ defination code:
nn::QwenPostConfig post_config
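
A sketch of overriding a few post_config fields at runtime. The helper name apply_sampling is hypothetical. The config is assigned back at the end because this section does not say whether reading post_config yields a live reference or a copy.

```python
def apply_sampling(model, temperature=None, top_k=None, top_p=None):
    """Override selected sampling fields of model.post_config.

    Each value left as None keeps whatever was read from the MUD file.
    """
    cfg = model.post_config
    if temperature is not None:
        cfg.enable_temperature = True
        cfg.temperature = temperature
    if top_k is not None:
        cfg.enable_top_k_sampling = True
        cfg.top_k = top_k
    if top_p is not None:
        cfg.enable_top_p_sampling = True
        cfg.top_p = top_p
    # Assign back in case the attribute getter returned a copy (an assumption;
    # if it returns a live reference, this assignment is harmless).
    model.post_config = cfg
    return cfg
```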

InternVLResp

InternVL model response

C++ defination code:
class InternVLResp

msg

Model response full message.

item	description
type	var
static	False
readonly	False

C++ defination code:
std::string msg

msg_new

Model response new message.

item	description
type	var
static	False
readonly	False

C++ defination code:
std::string msg_new

err_code

Model response error code, maix.err.Err type, should be err.Err.ERR_NONE if no error.

item	description
type	var
static	False
readonly	False

C++ defination code:
err::Err err_code

err_msg

Model response error message.

item	description
type	var
static	False
readonly	False

C++ defination code:
std::string err_msg

InternVLPostConfig

InternVL model post config

C++ defination code:
class InternVLPostConfig

enable_temperature

Enable temperature sampling

item	description
type	var
static	False
readonly	False

C++ defination code:
bool enable_temperature

temperature

Temperature sampling value

item	description
type	var
static	False
readonly	False

C++ defination code:
float temperature

enable_repetition_penalty

Enable repetition penalty

item	description
type	var
static	False
readonly	False

C++ defination code:
bool enable_repetition_penalty

repetition_penalty

Repetition penalty value

item	description
type	var
static	False
readonly	False

C++ defination code:
float repetition_penalty

penalty_window

Repetition penalty window

item	description
type	var
static	False
readonly	False

C++ defination code:
int penalty_window

enable_top_p_sampling

Enable top p sampling

item	description
type	var
static	False
readonly	False

C++ defination code:
bool enable_top_p_sampling

top_p

Top p sampling value

item	description
type	var
static	False
readonly	False

C++ defination code:
float top_p

enable_top_k_sampling

Enable top k sampling

item	description
type	var
static	False
readonly	False

C++ defination code:
bool enable_top_k_sampling

top_k

Top k sampling value

item	description
type	var
static	False
readonly	False

C++ defination code:
int top_k

InternVL

InternVL model

C++ defination code:
class InternVL

init

def __init__(self, model: str) -> None

InternVL constructor

item	description
type	func
param	model: direction [in], model file path, model format can be MUD(model universal describe file) file. If model is set, will load model from file, load failed will raise err.Exception. If model is not set, you can load model later by load function.
static	False

C++ defination code:
InternVL(const std::string &model)

load

def load(self, model: str) -> maix.err.Err

Load model from file

item	description
type	func
param	model: direction [in], model file path, model format can be MUD(model universal describe file) file.
return	error code, if load success, return err::ERR_NONE
static	False

C++ defination code:
err::Err load(const std::string &model)

unload

def unload(self) -> maix.err.Err

Unload model

item	description
type	func
return	error code, if unload success, return err::ERR_NONE
static	False

C++ defination code:
err::Err unload()

loaded

def loaded(self) -> bool

Is model loaded

item	description
type	func
return	true if model loaded, else false
static	False

C++ defination code:
bool loaded()

set_system_prompt

def set_system_prompt(self, prompt: str) -> None

Set system prompt

item	description
type	func
param	prompt: system prompt
static	False

C++ defination code:

void set_system_prompt(const std::string &prompt)

get_system_prompt

def get_system_prompt(self) -> str

Get system prompt

item	description
type	func
return	system prompt
static	False

C++ defination code:
std::string get_system_prompt()

set_log_level

def set_log_level(self, level: ..., color: bool) -> None

Set log level

item	description
type	func
param	level: log level, @see maix.log.LogLevel color: true to enable color, false to disable color
static	False

C++ defination code:

void set_log_level(log::LogLevel level, bool color)

set_reply_callback

def set_reply_callback(self, callback: typing.Callable[[InternVL, InternVLResp], None] = None) -> None

Set reply callback.

item	description
type	func
param	callback: reply callback, called each time a token (word) is generated, so you can get the response message in real time in this callback function. If set to None (nullptr in C++), you get the response only after the whole response message has been generated.
static	False

C++ defination code:

void set_reply_callback(std::function<void(nn::InternVL &, const nn::InternVLResp &)> callback = nullptr)

get_reply_callback

def get_reply_callback(self) -> typing.Callable[[InternVL, InternVLResp], None]

Get reply callback

item	description
type	func
return	reply callback
static	False

C++ defination code:

std::function<void(nn::InternVL &, const nn::InternVLResp &)> get_reply_callback()

input_width

def input_width(self) -> int

Image input width

item	description
type	func
return	input width.
static	False

C++ defination code:
int input_width()

input_height

def input_height(self) -> int

Image input height

item	description
type	func
return	input height.
static	False

C++ defination code:
int input_height()

input_format

def input_format(self) -> maix.image.Format

Image input format

item	description
type	func
return	input format.
static	False

C++ defination code:
maix::image::Format input_format()

set_image

def set_image(self, img: maix.image.Image, fit: maix.image.Fit = ...) -> maix.err.Err

Set image and encode it. You can set the image once and call send multiple times.

item	description
type	func
param	img: the image you want to use. fit: Image resize fit method, only used when img size not equal to model input.
return	err.Err type, err.Err.ERR_NONE if no error occurred.
static	False

C++ defination code:

err::Err set_image(maix::image::Image &img, maix::image::Fit fit = maix::image::Fit::FIT_CONTAIN)
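
A sketch of the "set once, send many times" pattern described above, using only set_image, is_image_set and send. The helper name ask_about_image is hypothetical. Checking is_image_set after the call assumes that it stays false when set_image fails.

```python
def ask_about_image(model, img, questions):
    """Encode `img` once, then send every question against it.

    Returns a list of (question, reply_text) pairs.
    """
    model.set_image(img)  # default fit; the returned err.Err is not checked here
    # Assumption: is_image_set() reports False if set_image did not succeed.
    if not model.is_image_set():
        raise RuntimeError("set_image did not register the image")
    answers = []
    for question in questions:
        resp = model.send(question)
        answers.append((question, resp.msg))
    return answers
```

Encoding the image is done once in set_image, so asking several questions about the same frame avoids repeating that step.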

clear_image

def clear_image(self) -> None

Clear image. InternVL2.5 is based on Qwen2.5, so after clearing the image you can use it as a text-only LLM.

item	description
type	func
static	False

C++ defination code:
void clear_image()

is_image_set

def is_image_set(self) -> bool

Whether image set by set_image

item	description
type	func
return	Return true if image set by set_image function, or return false.
static	False

C++ defination code:
bool is_image_set()

send

def send(self, msg: str) -> InternVLResp

Send message to model

item	description
type	func
param	msg: message to send
return	model response
static	False

C++ defination code:

nn::InternVLResp send(const std::string &msg)

cancel

def cancel(self) -> None

Cancel running

item	description
type	func
static	False

C++ defination code:
void cancel()

version

def version(self) -> str

Get model version

item	description
type	func
return	model version
static	False

C++ defination code:
std::string version()

post_config

InternVL post config, default will read config from model mud file, you can also set it manually here.

item	description
type	var
static	False
readonly	False

C++ defination code:
nn::InternVLPostConfig post_config

modbus