maix.nn
maix.nn module
You can use maix.nn to access this module with MaixPy.
This module is generated from MaixPy and MaixCDK
Module
module | brief |
---|---|
F | maix.nn.F module |
Enum
Layout
layer layout types
item | describe |
---|---|
values | NCHW: [N, C, H, W] NHWC: [N, H, W, C] UNKNOWN: |
C++ definition code:
enum class Layout {
    NCHW = 0,    // [N, C, H, W]
    NHWC = 1,    // [N, H, W, C]
    UNKNOWN = 4
}
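The two layouts differ only in where the channel axis sits in the flattened buffer. The following is a minimal pure-Python sketch (it does not use MaixPy, and the helper names `nchw_index`, `nhwc_index`, and `nchw_to_nhwc` are hypothetical) showing how the same element is addressed in each layout and how a flat NCHW buffer could be reordered to NHWC.

```python
def nchw_index(n, c, h, w, shape):
    """Flat offset of element (n, c, h, w) in an NCHW buffer; shape is (N, C, H, W)."""
    _, C, H, W = shape
    return ((n * C + c) * H + h) * W + w


def nhwc_index(n, c, h, w, shape):
    """Flat offset of the same element in an NHWC buffer; shape is still (N, C, H, W)."""
    _, C, H, W = shape
    return ((n * H + h) * W + w) * C + c


def nchw_to_nhwc(flat, shape):
    """Reorder a flat NCHW list into NHWC order."""
    N, C, H, W = shape
    out = [0] * len(flat)
    for n in range(N):
        for c in range(C):
            for h in range(H):
                for w in range(W):
                    out[nhwc_index(n, c, h, w, shape)] = flat[nchw_index(n, c, h, w, shape)]
    return out


# One image, two channels, 2x2 pixels.
example = nchw_to_nhwc(list(range(8)), (1, 2, 2, 2))
```

In NHWC order the channel values of each pixel end up next to each other, which is why the channel index varies fastest in `nhwc_index`.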
SpeechDevice
speech device
item | describe |
---|---|
values | DEVICE_NONE: DEVICE_PCM: DEVICE_MIC: DEVICE_WAV: |
C++ definition code:
enum class SpeechDevice { DEVICE_NONE = -1, DEVICE_PCM, DEVICE_MIC, DEVICE_WAV, }
SpeechDecoder
speech decoder type
item | describe |
---|---|
values | DECODER_RAW: DECODER_DIG: DECODER_LVCSR: DECODER_KWS: DECODER_ALL: |
C++ definition code:
enum class SpeechDecoder { DECODER_RAW = 1, DECODER_DIG = 2, DECODER_LVCSR = 4, DECODER_KWS = 8, DECODER_ALL = 65535, }
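The single decoder values are distinct powers of two and DECODER_ALL sets all 16 low bits, which suggests the values can be combined as a bitmask. That reading is an assumption, not something this page states. A minimal pure-Python sketch under that assumption (plain integers mirror the C++ values; `enabled_decoders` is a hypothetical helper):

```python
# Values copied from the C++ enum above.
DECODER_RAW = 1
DECODER_DIG = 2
DECODER_LVCSR = 4
DECODER_KWS = 8
DECODER_ALL = 65535

SINGLE_DECODERS = {
    "DECODER_RAW": DECODER_RAW,
    "DECODER_DIG": DECODER_DIG,
    "DECODER_LVCSR": DECODER_LVCSR,
    "DECODER_KWS": DECODER_KWS,
}


def enabled_decoders(mask):
    """Return the names of the single decoders whose bit is set in mask."""
    return [name for name, bit in SINGLE_DECODERS.items() if mask & bit]


# Assumed usage: OR the values together to select several decoders at once.
selected = DECODER_DIG | DECODER_KWS
```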
Variable
Function
Class
Object
Object for detect result
C++ definition code:
class Object
__init__
def __init__(self, x: int = 0, y: int = 0, w: int = 0, h: int = 0, class_id: int = 0, score: float = 0, points: list[int] = [], angle: float = -9999) -> None
Constructor of Object for detect result
item | description |
---|---|
type | func |
param | x: left top x y: left top y w: width h: height class_id: class id score: score |
static | False |
C++ definition code:
Object(int x = 0, int y = 0, int w = 0, int h = 0, int class_id = 0, float score = 0, std::vector<int> points = std::vector<int>(), float angle = -9999)
__str__
def __str__(self) -> str
Object info to string
item | description |
---|---|
type | func |
return | Object info string |
static | False |
C++ definition code:
std::string to_str()
get_obb_points
def get_obb_points(self) -> list[int]
Get OBB(oriented bounding box) points, auto calculated according to x,y,w,h,angle
item | description |
---|---|
type | func |
static | False |
C++ definition code:
std::vector<int> get_obb_points()
x
Object left top coordinate x
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
int x
y
Object left top coordinate y
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
int y
w
Object width
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
int w
h
Object height
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
int h
class_id
Object class id
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
int class_id
score
Object score
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
float score
points
keypoints
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
std::vector<int> points
angle
Rotation angle; -9999 means not set. The value is a fraction of a half turn: multiply it by 180 to get the angle in degrees, or by PI to get the angle in radians.
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
float angle
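The conversion described for angle can be sketched in plain Python as follows. The helper names are hypothetical, and the sentinel check simply mirrors the -9999 default of the constructor.

```python
import math

ANGLE_NOT_SET = -9999  # default value of Object.angle, meaning "not set"


def angle_degrees(angle):
    """Convert the stored angle to degrees, or None if it was never set."""
    if angle == ANGLE_NOT_SET:
        return None
    return angle * 180.0


def angle_radians(angle):
    """Convert the stored angle to radians, or None if it was never set."""
    if angle == ANGLE_NOT_SET:
        return None
    return angle * math.pi
```

For example, a stored value of 0.5 corresponds to 90 degrees.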
seg_mask
Segmentation mask, uint8 list type. The shape is h * w but flattened to one dimension, with values from 0 to 255.
item | description |
---|---|
type | var |
attention | For efficiency, it's a pointer in C++, use this carefully! |
static | False |
readonly | False |
C++ definition code:
image::Image *seg_mask
ObjectFloat
Object for detect result
C++ definition code:
class ObjectFloat
__init__
def __init__(self, x: float = 0, y: float = 0, w: float = 0, h: float = 0, class_id: float = 0, score: float = 0, points: list[float] = [], angle: float = -1) -> None
Constructor of Object for detect result
item | description |
---|---|
type | func |
param | x: left top x y: left top y w: width h: height class_id: class id score: score |
static | False |
C++ definition code:
ObjectFloat(float x = 0, float y = 0, float w = 0, float h = 0, float class_id = 0, float score = 0, std::vector<float> points = std::vector<float>(), float angle = -1)
__str__
def __str__(self) -> str
Object info to string
item | description |
---|---|
type | func |
return | Object info string |
static | False |
C++ definition code:
std::string to_str()
x
Object left top coordinate x
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
float x
y
Object left top coordinate y
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
float y
w
Object width
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
float w
h
Object height
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
float h
class_id
Object class id
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
float class_id
score
Object score
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
float score
points
keypoints
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
std::vector<float> points
angle
Rotate angle
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
float angle
Objects
Objects Class for detect result
C++ definition code:
class Objects
__init__
def __init__(self) -> None
Constructor of Objects class
item | description |
---|---|
type | func |
static | False |
C++ definition code:
Objects()
add
def add(self, x: int = 0, y: int = 0, w: int = 0, h: int = 0, class_id: int = 0, score: float = 0, points: list[int] = [], angle: float = -1) -> Object
Add object to objects
item | description |
---|---|
type | func |
throw | Throw exception if no memory |
static | False |
C++ definition code:
nn::Object &add(int x = 0, int y = 0, int w = 0, int h = 0, int class_id = 0, float score = 0, std::vector<int> points = std::vector<int>(), float angle = -1)
remove
def remove(self, idx: int) -> maix.err.Err
Remove object from objects
item | description |
---|---|
type | func |
static | False |
C++ definition code:
err::Err remove(int idx)
at
def at(self, idx: int) -> Object
Get object item
item | description |
---|---|
type | func |
static | False |
C++ definition code:
nn::Object &at(int idx)
__getitem__
def __getitem__(self, idx: int) -> Object
Get object item
item | description |
---|---|
type | func |
static | False |
C++ definition code:
nn::Object &operator[](int idx)
__len__
def __len__(self) -> int
Get size
item | description |
---|---|
type | func |
static | False |
C++ definition code:
size_t size()
__iter__
def __iter__(self) -> typing.Iterator
Begin
item | description |
---|---|
type | func |
static | False |
C++ definition code:
std::vector<Object*>::iterator begin()
HandLandmarks
HandLandmarks class
C++ definition code:
class HandLandmarks
__init__
def __init__(self, model: str = '') -> None
Constructor of HandLandmarks class
item | description |
---|---|
type | func |
param | model: model path, default empty, you can load model later by load function. |
throw | If model arg is not empty and load failed, will throw err::Exception. |
static | False |
C++ definition code:
HandLandmarks(const string &model = "")
load
def load(self, model: str) -> maix.err.Err
Load model from file
item | description |
---|---|
type | func |
param | model: Model path want to load |
return | err::Err |
static | False |
C++ definition code:
err::Err load(const string &model)
detect
def detect(self, img: maix.image.Image, conf_th: float = 0.7, iou_th: float = 0.45, conf_th2: float = 0.8, landmarks_rel: bool = False) -> Objects
Detect objects from image
item | description |
---|---|
type | func |
param | img: image to run detection on; if its size does not match the model input size, it is resized automatically with the fit method. conf_th: hand detection confidence threshold, default 0.7. iou_th: IoU threshold, default 0.45. conf_th2: threshold for the second hand detection confidence check, default 0.8. landmarks_rel: if True, also output the coordinates of the 21 points relative to the top-left vertex of the hand. In obj.points, the last 21x2 values are arranged as x0, y0, x1, y1, ..., x20, y20, with values from 0 to obj.w. |
throw | If the image format does not match the model input format, err::Exception is thrown. |
return | Object list. In C++, you should delete it after use. Format of each object's points: box_topleft_x, box_topleft_y, box_topright_x, box_topright_y, box_bottomright_x, box_bottomright_y, box_bottomleft_x, box_bottomleft_y, x0, y0, z0, x1, y1, z1, ..., x20, y20, z20. If landmarks_rel is True, the format is box_topleft_x, box_topleft_y, ..., x20, y20, z20, x0_rel, y0_rel, ..., x20_rel, y20_rel. Z is depth: the larger the value, the farther away from the palm, and a positive value means closer to the camera. |
static | False |
C++ definition code:
nn::Objects *detect(image::Image &img, float conf_th = 0.7, float iou_th = 0.45, float conf_th2 = 0.8, bool landmarks_rel = false)
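The points layout returned by detect packs the box corners, the 21 landmarks, and optionally the relative landmarks into one flat list. The sketch below splits such a list in plain Python. `split_hand_points` is a hypothetical helper, and the expected lengths (8 + 21x3 values, plus 21x2 when landmarks_rel is True) are inferred from the layout described above rather than stated by the library.

```python
NUM_LANDMARKS = 21
BOX_VALUES = 8  # four (x, y) box corners


def split_hand_points(points, landmarks_rel=False):
    """Split a flat obj.points list into box corners, (x, y, z) landmarks
    and, when landmarks_rel is True, relative (x, y) landmarks."""
    expected = BOX_VALUES + NUM_LANDMARKS * 3
    if landmarks_rel:
        expected += NUM_LANDMARKS * 2
    if len(points) != expected:
        raise ValueError(f"expected {expected} values, got {len(points)}")

    # Corners in order: top-left, top-right, bottom-right, bottom-left.
    box = [(points[i], points[i + 1]) for i in range(0, BOX_VALUES, 2)]

    lm_end = BOX_VALUES + NUM_LANDMARKS * 3
    landmarks = [
        (points[i], points[i + 1], points[i + 2])
        for i in range(BOX_VALUES, lm_end, 3)
    ]

    relative = []
    if landmarks_rel:
        relative = [(points[i], points[i + 1]) for i in range(lm_end, expected, 2)]
    return box, landmarks, relative
```

With a real detection result, the input would be obj.points for each obj in the returned Objects.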
input_size
def input_size(self, detect: bool = True) -> maix.image.Size
Get model input size
item | description |
---|---|
type | func |
param | detect: detect or landmarks model, default true. |
return | model input size |
static | False |
C++ definition code:
image::Size input_size(bool detect = true)
input_width
def input_width(self, detect: bool = True) -> int
Get model input width
item | description |
---|---|
type | func |
param | detect: detect or landmarks model, default true. |
return | model input size of width |
static | False |
C++ definition code:
int input_width(bool detect = true)
input_height
def input_height(self, detect: bool = True) -> int
Get model input height
item | description |
---|---|
type | func |
param | detect: detect or landmarks model, default true. |
return | model input size of height |
static | False |
C++ definition code:
int input_height(bool detect = true)
input_format
def input_format(self) -> maix.image.Format
Get input image format
item | description |
---|---|
type | func |
return | input image format, image::Format type. |
static | False |
C++ definition code:
image::Format input_format()
draw_hand
def draw_hand(self, img: maix.image.Image, leftright: int, points: list[int], r_min: int = 4, r_max: int = 10, box: bool = True, box_thickness: int = 1, box_color_l: maix.image.Color = ..., box_color_r: maix.image.Color = ...) -> None
Draw hand and landmarks on image
item | description |
---|---|
type | func |
param | img: image object, maix.image.Image type. leftright: 0 means left, 1 means right. points: points result from detect method: box_topleft_x, box_topleft_y, box_topright_x, box_topright_y, box_bottomright_x, box_bottomright_y, box_bottomleft_x, box_bottomleft_y, x0, y0, z0, x1, y1, z1, ..., x20, y20, z20. r_min: min radius of points. r_max: max radius of points. box: draw box or not, default true. box_thickness: thickness of the box outline, default 1. box_color_l: color of the box for a left hand, default image::COLOR_RED. box_color_r: color of the box for a right hand, default image::COLOR_GREEN. |
static | False |
C++ definition code:
void draw_hand(image::Image &img, int leftright, const std::vector<int> &points, int r_min = 4, int r_max = 10, bool box = true, int box_thickness = 1, image::Color box_color_l = image::COLOR_RED, image::Color box_color_r = image::COLOR_GREEN)
labels
Labels list
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
std::vector<string> labels
label_path
Label file path
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
std::string label_path
mean
Get mean value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
std::vector<float> mean
scale
Get scale value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
std::vector<float> scale
FaceLandmarksObject
FaceLandmarksObject class
C++ definition code:
class FaceLandmarksObject
__init__
def __init__(self) -> None
Constructor of FaceLandmarksObject class
item | description |
---|---|
type | func |
static | False |
C++ definition code:
FaceLandmarksObject()
valid
Whether the result is valid (score > conf_th at detection time).
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
bool valid
score
Score indicating whether a face is in the image, from 0 to 1.0.
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
float score
points
landmarks points, format: x0, y0, ..., xn-1, yn-1.
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
std::vector<int> points
points_z
Z (depth) values of the landmarks points, format: z0, z1, ..., zn-1.
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
std::vector<int> points_z
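Since points holds interleaved x, y values and points_z holds one z value per landmark, the two lists can be zipped into per-landmark tuples. A minimal pure-Python sketch (`landmarks_xyz` is a hypothetical helper, and filling z with 0 when points_z is empty is an assumption made for illustration):

```python
def landmarks_xyz(points, points_z):
    """Combine points (x0, y0, ..., xn-1, yn-1) and points_z (z0, ..., zn-1)
    into a list of (x, y, z) tuples."""
    if len(points) % 2:
        raise ValueError("points must contain an even number of values")
    n = len(points) // 2
    if points_z and len(points_z) != n:
        raise ValueError("points_z must contain one value per landmark")
    zs = points_z if points_z else [0] * n  # assumption: missing z defaults to 0
    return [(points[2 * i], points[2 * i + 1], zs[i]) for i in range(n)]
```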
FaceLandmarks
FaceLandmarks class
C++ definition code:
class FaceLandmarks
__init__
def __init__(self, model: str = '') -> None
Constructor of FaceLandmarks class
item | description |
---|---|
type | func |
param | model: model path, default empty, you can load model later by load function. |
throw | If model arg is not empty and load failed, will throw err::Exception. |
static | False |
C++ definition code:
FaceLandmarks(const string &model = "")
load
def load(self, model: str) -> maix.err.Err
Load model from file
item | description |
---|---|
type | func |
param | model: Model path want to load |
return | err::Err |
static | False |
C++ definition code:
err::Err load(const string &model)
detect
def detect(self, img: maix.image.Image, conf_th: float = 0.5, landmarks_abs: bool = True, landmarks_rel: bool = False) -> FaceLandmarksObject
Detect objects from image
item | description |
---|---|
type | func |
param | img: image to run detection on; if its size does not match the model input size, it is resized automatically with the fit method. conf_th: face confidence threshold, default 0.5. landmarks_abs: output absolute landmark coordinates, default True. landmarks_rel: output relative landmark coordinates, default False. |
throw | If the image format does not match the model input format, err::Exception is thrown. |
return | A FaceLandmarksObject. In C++, you should delete it after use. Its points are formatted as x0, y0, ..., xn-1, yn-1 and its points_z as z0, z1, ..., zn-1. |
static | False |
C++ definition code:
nn::FaceLandmarksObject *detect(image::Image &img, float conf_th = 0.5, bool landmarks_abs = true, bool landmarks_rel = false)
crop_image
def crop_image(self, img: maix.image.Image, x: int, y: int, w: int, h: int, points: list[int], new_width: int = -1, new_height: int = -1, scale: float = 1.2) -> maix.image.Image
Crop image from source image by 2 points(2 eyes)
item | description |
---|---|
type | func |
param | x,y,w,h: face rectangle, x,y is the left-top point. img: source image. points: 2 points, eye_left_x, eye_left_y, eye_right_x, eye_right_y. scale: crop size scale relative to the rectangle's max side length (w or h); the final value is scale * max(w, h), default 1.2. |
static | False |
C++ definition code:
maix::image::Image *crop_image(maix::image::Image &img, int x, int y, int w, int h, std::vector<int> points, int new_width = -1, int new_height = -1, float scale = 1.2)
input_size
def input_size(self, detect: bool = True) -> maix.image.Size
Get model input size
item | description |
---|---|
type | func |
param | detect: detect or landmarks model, default true. |
return | model input size |
static | False |
C++ definition code:
image::Size input_size(bool detect = true)
input_width
def input_width(self, detect: bool = True) -> int
Get model input width
item | description |
---|---|
type | func |
param | detect: detect or landmarks model, default true. |
return | model input size of width |
static | False |
C++ definition code:
int input_width(bool detect = true)
input_height
def input_height(self, detect: bool = True) -> int
Get model input height
item | description |
---|---|
type | func |
param | detect: detect or landmarks model, default true. |
return | model input size of height |
static | False |
C++ definition code:
int input_height(bool detect = true)
input_format
def input_format(self) -> maix.image.Format
Get input image format
item | description |
---|---|
type | func |
return | input image format, image::Format type. |
static | False |
C++ definition code:
image::Format input_format()
draw_face
def draw_face(self, img: maix.image.Image, points: list[int], num: int, points_z: list[int] = [], r_min: int = 2, r_max: int = 4) -> None
Draw face landmarks on image
item | description |
---|---|
type | func |
param | img: image object, maix.image.Image type. points: points result from detect method: x0, y0, x1, y1, ..., xn-1, yn-1. num: number of landmarks to draw. points_z: points result from detect method: z0, z1, ..., zn-1. r_min: min radius of points. r_max: max radius of points. |
static | False |
C++ definition code:
void draw_face(image::Image &img, const std::vector<int> &points, int num, const std::vector<int> &points_z=std::vector<int>(), int r_min = 2, int r_max = 4)
mean
Get mean value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
std::vector<float> mean
scale
Get scale value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
std::vector<float> scale
landmarks_num
landmarks number.
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
int landmarks_num
DepthAnything
DepthAnything
C++ definition code:
class DepthAnything
__init__
def __init__(self, model: str = '', dual_buff: bool = True) -> None
Construct a new DepthAnything object
item | description |
---|---|
type | func |
param | model: MUD model path. If empty, no model is loaded and you can call load() later; if not empty, the model is loaded and err::Exception is raised if loading fails. dual_buff: direction [in]. Prepare dual input/output buffers to accelerate forward: while the NPU is running a forward pass, we do not wait and instead prepare the next input buffer. If you need every forward call to output the result of the input just passed in, set this argument to false. Default true for speed. |
static | False |
C++ definition code:
DepthAnything(const string &model = "", bool dual_buff = true)
load
def load(self, model: str) -> maix.err.Err
Load model from file, model format is .mud.
MUD file should contain [extra] section, have key-values:
- model_type: depth_anything_v2
- input_type: rgb or bgr
- mean: 123.675, 116.28, 103.53
- scale: 0.017124753831663668, 0.01750700280112045, 0.017429193899782137
- labels: imagenet_classes.txt

item | description |
---|---|
type | func |
param | model: MUD model path |
return | error code, if load failed, return error code |
static | False |
C++ definition code:
err::Err load(const string &model)
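The key-values that load expects in the [extra] section can be inspected with Python's standard configparser, assuming the MUD file uses INI-style syntax. The sketch below parses only an [extra] section held in a string; a real MUD file needs whatever other sections the runtime expects, which are not shown here, and `parse_extra` is a hypothetical helper.

```python
import configparser

# Illustrative [extra] section only, using the values listed for load().
MUD_TEXT = """
[extra]
model_type = depth_anything_v2
input_type = rgb
mean = 123.675, 116.28, 103.53
scale = 0.017124753831663668, 0.01750700280112045, 0.017429193899782137
labels = imagenet_classes.txt
"""


def parse_extra(text):
    """Parse the [extra] section and convert mean/scale to float lists."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    extra = parser["extra"]
    return {
        "model_type": extra["model_type"],
        "input_type": extra["input_type"],
        "mean": [float(v) for v in extra["mean"].split(",")],
        "scale": [float(v) for v in extra["scale"].split(",")],
        "labels": extra["labels"],
    }


config = parse_extra(MUD_TEXT)
```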
get_depth
def get_depth(self, img: maix.image.Image, fit: maix.image.Fit = ...) -> maix.tensor.Tensor
Forward model and get raw image depth estimation data.
item | description |
---|---|
type | func |
param | img: image; its format should match the model input_type, otherwise err.Exception is raised. fit: image resize fit mode. If the input image size is not equal to the model's input size, the image is automatically resized to the model's input size for inference, and the result is restored to the input image size. Default Fit.FIT_CONTAIN, see image.Fit. |
throw | If error occurred, will raise err::Exception, you can find reason in log, mostly caused by args error or hardware error. |
return | result, a tensor.Tensor object. If in dual_buff mode, value can be None(in Python) or nullptr(in C++) when not ready. In C++, you need to delete it after use. |
static | False |
C++ definition code:
tensor::Tensor *get_depth(image::Image &img, image::Fit fit = image::FIT_CONTAIN)
get_depth_image
def get_depth_image(self, img: maix.image.Image, fit: maix.image.Fit = ..., cmap: maix.image.CMap = ...) -> maix.image.Image
Forward model and get image depth estimation data normalized to [0, 255] as an image.Image object.
item | description |
---|---|
type | func |
param | img: image; its format should match the model input_type, otherwise err.Exception is raised. fit: image resize fit mode. If the input image size is not equal to the model's input size, the image is automatically resized to the model's input size for inference, and the result is restored to the input image size. Default Fit.FIT_CONTAIN, see image.Fit. cmap: color map used to convert the grayscale distance estimation image to an RGB image. Different color maps change the final image. Default image.CMap.INFERNO. |
throw | If error occurred, will raise err::Exception, you can find reason in log, mostly caused by args error or hardware error. |
return | result, an image::Image object. If in dual_buff mode, value can be None(in Python) or nullptr(in C++) when not ready. In C++, you need to delete it after use. |
static | False |
C++ definition code:
image::Image *get_depth_image(image::Image &img, image::Fit fit = image::FIT_CONTAIN, image::CMap cmap = image::CMap::INFERNO)
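The text above only says the depth data is normalized to [0, 255]; it does not say how. One common way to do this is min-max scaling, sketched below in plain Python as an assumption about the general idea, not as the library's actual implementation. `normalize_depth` is a hypothetical helper operating on a flat list of raw depth values.

```python
def normalize_depth(values):
    """Min-max scale raw depth values to integers in [0, 255]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant map has no contrast; map everything to 0.
        return [0] * len(values)
    return [round((v - lo) * 255 / (hi - lo)) for v in values]
```

The resulting grayscale values are what a color map such as INFERNO would then turn into an RGB image.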
input_size
def input_size(self) -> maix.image.Size
Get model input size, only for image input
item | description |
---|---|
type | func |
return | model input size |
static | False |
C++ definition code:
image::Size input_size()
input_width
def input_width(self) -> int
Get model input width, only for image input
item | description |
---|---|
type | func |
return | model input size of width |
static | False |
C++ definition code:
int input_width()
input_height
def input_height(self) -> int
Get model input height, only for image input
item | description |
---|---|
type | func |
return | model input size of height |
static | False |
C++ definition code:
int input_height()
input_format
def input_format(self) -> maix.image.Format
Get input image format, only for image input
item | description |
---|---|
type | func |
return | input image format, image::Format type. |
static | False |
C++ definition code:
image::Format input_format()
input_shape
def input_shape(self) -> list[int]
Get input shape, if have multiple input, only return first input shape
item | description |
---|---|
type | func |
return | input shape, list type |
static | False |
C++ definition code:
std::vector<int> input_shape()
mean
Get mean value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
std::vector<float> mean
scale
Get scale value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
std::vector<float> scale
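The mean and scale lists hold one value per channel. How they are applied is not spelled out here; the sketch below assumes the common convention normalized = (pixel - mean) * scale, which is consistent with the example scale values being roughly the reciprocals of typical per-channel standard deviations (for instance 1 / 58.395 for the first channel). `normalize_pixel` is a hypothetical helper.

```python
MEAN = [123.675, 116.28, 103.53]
SCALE = [0.017124753831663668, 0.01750700280112045, 0.017429193899782137]


def normalize_pixel(pixel, mean=MEAN, scale=SCALE):
    """Apply (value - mean) * scale to each channel of one pixel.

    This formula is an assumed convention, not confirmed by this page.
    """
    return [(v - m) * s for v, m, s in zip(pixel, mean, scale)]
```

Under this convention a pixel equal to the mean maps to zero in every channel.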
Whisper
Whisper class
C++ definition code:
class Whisper
__init__
def __init__(self, model: str = '', language: str = 'zh') -> None
Constructor of Whisper class
item | description |
---|---|
type | func |
param | model: model path, default empty, you can load model later by load function. language: language code, default "zh", supported language code: "zh", "en" |
throw | If model arg is not empty and load failed, will throw err::Exception. |
static | False |
C++ definition code:
Whisper(const string &model = "", std::string language = "zh")
load
def load(self, model: str) -> maix.err.Err
Load model from file
item | description |
---|---|
type | func |
param | model: Model path want to load |
return | err::Err |
static | False |
C++ definition code:
err::Err load(const string &model)
transcribe
def transcribe(self, file: str) -> str
Transcribe audio file to text
item | description |
---|---|
type | func |
note | If the wav file has multiple channels, only the first channel will be used. |
param | file: Pass in an audio file, supporting files in WAV format. |
return | The output result after automatic speech recognition. |
static | False |
C++ definition code:
std::string transcribe(std::string &file)
transcribe_raw
def transcribe_raw(self, pcm: maix.Bytes(bytes), sample_rate: int = 16000, channels: int = 1, bits_per_frame: int = 16) -> str
Transcribe pcm data to text
item | description |
---|---|
type | func |
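param | pcm: raw PCM data. sample_rate: sample rate of the PCM data, default 16000. channels: number of channels, default 1. bits_per_frame: bits per frame, default 16. |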
return | The output result after automatic speech recognition. |
static | False |
C++ definition code:
std::string transcribe_raw(Bytes *pcm, int sample_rate = 16000, int channels = 1, int bits_per_frame = 16)
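When preparing a buffer for transcribe_raw, it can help to check that its length is consistent with the PCM parameters. The plain-Python sketch below computes the duration of a raw buffer. It assumes that bits_per_frame means bits per sample of a single channel, which this page does not confirm, and `pcm_duration_seconds` is a hypothetical helper.

```python
def pcm_duration_seconds(pcm, sample_rate=16000, channels=1, bits_per_frame=16):
    """Duration in seconds of a raw PCM byte buffer.

    Assumes bits_per_frame is the bit depth of one sample of one channel.
    """
    bytes_per_sample = bits_per_frame // 8
    frame_bytes = bytes_per_sample * channels
    if len(pcm) % frame_bytes:
        raise ValueError("buffer length is not a whole number of frames")
    return len(pcm) / frame_bytes / sample_rate
```

With the default parameters, one second of audio is 32000 bytes.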
input_pcm_samplerate
def input_pcm_samplerate(self) -> int
Get input pcm samplerate
item | description |
---|---|
type | func |
return | input pcm samplerate |
static | False |
C++ definition code:
int input_pcm_samplerate()
input_pcm_channels
def input_pcm_channels(self) -> int
Get input pcm channels
item | description |
---|---|
type | func |
return | input pcm channels |
static | False |
C++ definition code:
int input_pcm_channels()
input_pcm_bits_per_frame
def input_pcm_bits_per_frame(self) -> int
Get input pcm bits per frame
item | description |
---|---|
type | func |
return | input pcm bits per frame |
static | False |
C++ definition code:
int input_pcm_bits_per_frame()
SelfLearnClassifier
SelfLearnClassifier
C++ definition code:
class SelfLearnClassifier
__init__
def __init__(self, model: str = '', dual_buff: bool = True) -> None
Construct a new SelfLearnClassifier object
item | description |
---|---|
type | func |
param | model: MUD model path. If empty, no model is loaded and you can call load_model() later; if not empty, the model is loaded and err::Exception is raised if loading fails. dual_buff: direction [in]. Prepare dual input/output buffers to accelerate forward: while the NPU is running a forward pass, we do not wait and instead prepare the next input buffer. If you need every forward call to output the result of the input just passed in, set this argument to false. Default true for speed. |
static | False |
C++ definition code:
SelfLearnClassifier(const std::string &model = "", bool dual_buff = true)
load_model
def load_model(self, model: str) -> maix.err.Err
Load model from file, model format is .mud.
MUD file should contain [extra] section, have key-values:
- model_type: classifier_no_top
- input_type: rgb or bgr
- mean: 123.675, 116.28, 103.53
- scale: 0.017124753831663668, 0.01750700280112045, 0.017429193899782137

item | description |
---|---|
type | func |
param | model: MUD model path |
return | error code, if load failed, return error code |
static | False |
C++ definition code:
err::Err load_model(const string &model)
classify
def classify(self, img: maix.image.Image, fit: maix.image.Fit = ...) -> list[tuple[int, float]]
Classify image
item | description |
---|---|
type | func |
param | img: image, format should match model input_type, or will raise err.Exception fit: image resize fit mode, default Fit.FIT_COVER, see image.Fit. |
throw | If error occurred, will raise err::Exception, you can find reason in log, mostly caused by args error or hardware error. |
return | result, a list of (idx, distance), smaller distance means more similar. In C++, you need to delete it after use. |
static | False |
C++ definition code:
std::vector<std::pair<int, float>> *classify(image::Image &img, image::Fit fit = image::FIT_COVER)
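Because classify returns (idx, distance) pairs where a smaller distance means more similar, picking the best class amounts to taking the pair with the minimum distance. A minimal pure-Python sketch (`best_match` and the optional `max_distance` cutoff are hypothetical, added for illustration):

```python
def best_match(results, labels, max_distance=None):
    """Return (label, distance) for the most similar class, or None.

    results is a list of (idx, distance) pairs as returned by classify;
    labels maps class index to a name.
    """
    if not results:
        return None
    idx, distance = min(results, key=lambda item: item[1])
    if max_distance is not None and distance > max_distance:
        return None  # nothing is similar enough
    return labels[idx], distance
```

A distance cutoff like this is one way to reject images that match none of the learned classes well; a suitable value would have to be found experimentally.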
add_class
def add_class(self, img: maix.image.Image, fit: maix.image.Fit = ...) -> None
Add a class to recognize
item | description |
---|---|
type | func |
param | img: image to add as a new class. fit: image resize fit mode, default Fit.FIT_COVER, see image.Fit. |
static | False |
C++ definition code:
void add_class(image::Image &img, image::Fit fit = image::FIT_COVER)
class_num
def class_num(self) -> int
Get class number
item | description |
---|---|
type | func |
static | False |
C++ definition code:
int class_num()
rm_class
def rm_class(self, idx: int) -> maix.err.Err
Remove a class
item | description |
---|---|
type | func |
param | idx: class index, from 0 to class_num() - 1. |
static | False |
C++ definition code:
err::Err rm_class(int idx)
add_sample
def add_sample(self, img: maix.image.Image, fit: maix.image.Fit = ...) -> None
Add a sample. Call the learn method after adding some samples to learn the classes. A sample image can belong to any of the classes already added.
item | description |
---|---|
type | func |
param | img: image to add as a new sample. fit: image resize fit mode, default Fit.FIT_COVER, see image.Fit. |
static | False |
C++ definition code:
void add_sample(image::Image &img, image::Fit fit = image::FIT_COVER)
rm_sample
def rm_sample(self, idx: int) -> maix.err.Err
Remove a sample
item | description |
---|---|
type | func |
param | idx: sample index, from 0 to sample_num() - 1. |
static | False |
C++ definition code:
err::Err rm_sample(int idx)
sample_num
def sample_num(self) -> int
Get sample number
item | description |
---|---|
type | func |
static | False |
C++ definition code:
int sample_num()
learn
def learn(self) -> int
Start automatically learning class features from the class images and samples. Call this method after adding some samples.
item | description |
---|---|
type | func |
return | learn epoch(times), 0 means learn nothing. |
static | False |
C++ definition code:
int learn()
clear
def clear(self) -> None
Clear all class and samples
item | description |
---|---|
type | func |
static | False |
C++ definition code:
void clear()
input_size
def input_size(self) -> maix.image.Size
Get model input size, only for image input
item | description |
---|---|
type | func |
return | model input size |
static | False |
C++ definition code:
image::Size input_size()
input_width
def input_width(self) -> int
Get model input width, only for image input
item | description |
---|---|
type | func |
return | model input size of width |
static | False |
C++ definition code:
int input_width()
input_height
def input_height(self) -> int
Get model input height, only for image input
item | description |
---|---|
type | func |
return | model input size of height |
static | False |
C++ definition code:
int input_height()
input_format
def input_format(self) -> maix.image.Format
Get input image format, only for image input
item | description |
---|---|
type | func |
return | input image format, image::Format type. |
static | False |
C++ definition code:
image::Format input_format()
input_shape
def input_shape(self) -> list[int]
Get input shape, if have multiple input, only return first input shape
item | description |
---|---|
type | func |
return | input shape, list type |
static | False |
C++ definition code:
std::vector<int> input_shape()
save
def save(self, path: str, labels: list[str] = []) -> maix.err.Err
Save features and labels to a binary file
item | description |
---|---|
type | func |
param | path: file path to save, e.g. /root/my_classes.bin. labels: class labels; can be None, otherwise its length must equal the class number, or an err::Err is returned. |
return | maix.err.Err. An error is returned if labels is given but its length does not equal the class number, if saving the file fails, or if the class number is 0. |
static | False |
C++ definition code:
err::Err save(const std::string &path, const std::vector<std::string> &labels = std::vector<std::string>())
load
def load(self, path: str) -> list[str]
Load features info from binary file
item | description |
---|---|
type | func |
param | path: feature info binary file path, e.g. /root/my_classes.bin |
static | False |
C++ definition code:
std::vector<std::string> load(const std::string &path)
labels
Labels list
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
std::vector<string> labels
label_path
Label file path
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
std::string label_path
mean
Get mean value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
std::vector<float> mean
scale
Get scale value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ definition code:
std::vector<float> scale
YOLO11
YOLO11 class
C++ defination code:
class YOLO11
__init__
def __init__(self, model: str = '', dual_buff: bool = True) -> None
Constructor of YOLO11 class
item | description |
---|---|
type | func |
param | model: model path, default empty; you can load the model later with the load function. dual_buff: prepare dual input/output buffers to accelerate forward; that is, while the NPU is running a forward pass, we do not wait and instead prepare the next input buffer. If you need every forward call to return the result for the input just passed in, set this argument to false. Default true, for speed. |
throw | If the model argument is not empty and loading fails, throws err::Exception. |
static | False |
C++ defination code:
YOLO11(const string &model = "", bool dual_buff = true)
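The effect of `dual_buff` on which input a result belongs to can be pictured with a toy model. This is only an illustrative sketch in plain Python: `DualBufferModel` is a hypothetical class, and the one-call lag it shows is an assumption inferred from the description above, not the library's actual scheduling.

```python
class DualBufferModel:
    """Toy stand-in for a pipelined runner (hypothetical, not a maix class)."""

    def __init__(self, infer, dual_buff=True):
        self.infer = infer          # function that computes a result for one input
        self.dual_buff = dual_buff
        self._pending = None        # input that is still "being processed"

    def forward(self, frame):
        if not self.dual_buff:
            # Synchronous mode: the result always belongs to `frame`.
            return self.infer(frame)
        # Pipelined mode: queue `frame`, return the result of the previous input.
        previous, self._pending = self._pending, frame
        return None if previous is None else self.infer(previous)


piped = DualBufferModel(str.upper, dual_buff=True)
print([piped.forward(f) for f in ["a", "b", "c"]])   # [None, 'A', 'B']

direct = DualBufferModel(str.upper, dual_buff=False)
print([direct.forward(f) for f in ["a", "b", "c"]])  # ['A', 'B', 'C']
```

In this picture, turning `dual_buff` off trades throughput for a guarantee that each result matches the image just passed in.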
load
def load(self, model: str) -> maix.err.Err
Load model from file
item | description |
---|---|
type | func |
param | model: path of the model to load |
return | err::Err |
static | False |
C++ defination code:
err::Err load(const string &model)
detect
def detect(self, img: maix.image.Image, conf_th: float = 0.5, iou_th: float = 0.45, fit: maix.image.Fit = ..., keypoint_th: float = 0.5, sort: int = 0) -> Objects
Detect objects from image
item | description |
---|---|
type | func |
param | img: image to detect; if its size does not match the model input size, it is resized automatically using the fit method. conf_th: confidence threshold, default 0.5. iou_th: IoU threshold, default 0.45. fit: resize method, default image.Fit.FIT_CONTAIN. keypoint_th: keypoint threshold, default 0.5, only for yolo11-pose models. sort: sort results by object size; default 0 means no sorting, 1 puts bigger objects first, -1 puts smaller objects first. |
throw | If the image format does not match the model input format, throws err::Exception. |
return | Object list. In C++, you should delete it after use. For a yolo11-pose model, each object's points are filled in; a point value < 0 means that point is invalid (conf < keypoint_th). |
static | False |
C++ defination code:
nn::Objects *detect(image::Image &img, float conf_th = 0.5, float iou_th = 0.45, maix::image::Fit fit = maix::image::FIT_CONTAIN, float keypoint_th = 0.5, int sort = 0)
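The `sort` values and the "point value < 0 means invalid" rule can be illustrated without a device. The sketch below is plain Python: `Box` is a hypothetical stand-in for a detection result (not `maix.nn.Object`), and using width times height as the "size" is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    """Hypothetical stand-in for one detection result."""
    x: int
    y: int
    w: int
    h: int
    points: list = field(default_factory=list)  # flat [x, y, x, y, ...]


def sort_by_size(boxes, sort=0):
    """sort=0 keeps the order, 1 puts bigger boxes first, -1 smaller first."""
    if sort == 0:
        return list(boxes)
    # Assumption: "size" means box area.
    return sorted(boxes, key=lambda b: b.w * b.h, reverse=(sort == 1))


def valid_keypoints(points):
    """Pair a flat point list and drop points marked invalid (value < 0)."""
    pairs = zip(points[0::2], points[1::2])
    return [(x, y) for x, y in pairs if x >= 0 and y >= 0]
```

For example, `valid_keypoints([10, 20, -1, -1, 30, 40])` keeps only `(10, 20)` and `(30, 40)`.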
input_size
def input_size(self) -> maix.image.Size
Get model input size
item | description |
---|---|
type | func |
return | model input size |
static | False |
C++ defination code:
image::Size input_size()
input_width
def input_width(self) -> int
Get model input width
item | description |
---|---|
type | func |
return | model input size of width |
static | False |
C++ defination code:
int input_width()
input_height
def input_height(self) -> int
Get model input height
item | description |
---|---|
type | func |
return | model input size of height |
static | False |
C++ defination code:
int input_height()
input_format
def input_format(self) -> maix.image.Format
Get input image format
item | description |
---|---|
type | func |
return | input image format, image::Format type. |
static | False |
C++ defination code:
image::Format input_format()
draw_pose
def draw_pose(self, img: maix.image.Image, points: list[int], radius: int = 4, color: maix.image.Color = ..., colors: list[maix.image.Color] = [], body: bool = True, close: bool = False) -> None
Draw pose keypoints on image
item | description |
---|---|
type | func |
param | img: image object, maix.image.Image type. points: keypoints, int list type, [x, y, x, y ...] radius: radius of points. color: color of points. colors: per-point colors, list type, each element is an image.Color object. body: default true; if the length of points is 17 * 2 and body is true, lines are drawn to form a human body; if set to false, no lines are drawn. close: connect all points to close a polygon, default false. |
static | False |
C++ defination code:
void draw_pose(image::Image &img, std::vector<int> points, int radius = 4, image::Color color = image::COLOR_RED, const std::vector<image::Color> &colors = std::vector<image::Color>(), bool body = true, bool close = false)
draw_seg_mask
def draw_seg_mask(self, img: maix.image.Image, x: int, y: int, seg_mask: maix.image.Image, threshold: int = 127) -> None
Draw segmentation on image
item | description |
---|---|
type | func |
param | img: image object, maix.image.Image type. seg_mask: segmentation mask image returned by the detect method, a grayscale image threshold: only mask values > threshold are drawn on the image, value from 0 to 255. |
static | False |
C++ defination code:
void draw_seg_mask(image::Image &img, int x, int y, image::Image &seg_mask, int threshold = 127)
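The thresholding rule (only mask values greater than `threshold` are drawn) can be sketched on nested lists. This is a plain-Python illustration, not the library's drawing code; `overlay_mask` is a hypothetical helper, and reading `x`, `y` as the mask's top-left position on the image is an assumption, since the parameter table does not describe them.

```python
def overlay_mask(canvas, mask, x, y, threshold=127, value=255):
    """Paint `value` onto `canvas` wherever the mask exceeds `threshold`.

    `canvas` and `mask` are row-major lists of grayscale values (0 to 255).
    Assumption: (x, y) is where the mask's top-left corner lands on `canvas`.
    """
    for row, line in enumerate(mask):
        for col, m in enumerate(line):
            cy, cx = y + row, x + col
            inside = 0 <= cy < len(canvas) and 0 <= cx < len(canvas[0])
            if m > threshold and inside:
                canvas[cy][cx] = value
    return canvas
```

Note that a mask value equal to the threshold (127 by default) is not drawn, because the comparison is strict.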
labels
Labels list
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<string> labels
label_path
Label file path
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::string label_path
mean
Get mean value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> mean
scale
Get scale value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> scale
Retinaface
Retinaface class
C++ defination code:
class Retinaface
__init__
def __init__(self, model: str = '', dual_buff: bool = True) -> None
Constructor of Retinaface class
item | description |
---|---|
type | func |
param | model: model path, default empty; you can load the model later with the load function. dual_buff: prepare dual input/output buffers to accelerate forward; that is, while the NPU is running a forward pass, we do not wait and instead prepare the next input buffer. If you need every forward call to return the result for the input just passed in, set this argument to false. Default true, for speed. |
throw | If the model argument is not empty and loading fails, throws err::Exception. |
static | False |
C++ defination code:
Retinaface(const string &model = "", bool dual_buff = true)
load
def load(self, model: str) -> maix.err.Err
Load model from file
item | description |
---|---|
type | func |
param | model: path of the model to load |
return | err::Err |
static | False |
C++ defination code:
err::Err load(const string &model)
detect
def detect(self, img: maix.image.Image, conf_th: float = 0.4, iou_th: float = 0.45, fit: maix.image.Fit = ...) -> list[Object]
Detect objects from image
item | description |
---|---|
type | func |
param | img: image to detect; if its size does not match the model input size, it is resized automatically using the fit method. conf_th: confidence threshold, default 0.4. iou_th: IoU threshold, default 0.45. fit: resize method, default image.Fit.FIT_CONTAIN. |
throw | If the image format does not match the model input format, throws err::Exception. |
return | Object list. In C++, you should delete it after use. |
static | False |
C++ defination code:
std::vector<nn::Object> *detect(image::Image &img, float conf_th = 0.4, float iou_th = 0.45, maix::image::Fit fit = maix::image::FIT_CONTAIN)
input_size
def input_size(self) -> maix.image.Size
Get model input size
item | description |
---|---|
type | func |
return | model input size |
static | False |
C++ defination code:
image::Size input_size()
input_width
def input_width(self) -> int
Get model input width
item | description |
---|---|
type | func |
return | model input size of width |
static | False |
C++ defination code:
int input_width()
input_height
def input_height(self) -> int
Get model input height
item | description |
---|---|
type | func |
return | model input size of height |
static | False |
C++ defination code:
int input_height()
input_format
def input_format(self) -> maix.image.Format
Get input image format
item | description |
---|---|
type | func |
return | input image format, image::Format type. |
static | False |
C++ defination code:
image::Format input_format()
mean
Get mean value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> mean
scale
Get scale value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> scale
YOLOWorld
YOLOWorld class
C++ defination code:
class YOLOWorld
__init__
def __init__(self, model: str = '', text_feature: str = '', labels: str = '', dual_buff: bool = True) -> None
Constructor of YOLOWorld class
item | description |
---|---|
type | func |
param | model: model path, default empty; you can load the model later with the load function. text_feature: class text feature path (see the load method for details), default empty; you can load the class text feature later with the load function. labels: class labels or labels file path. As a string of class labels: labels separated by commas, e.g. "person, car, cat". As a file path: a labels file in which each line is a label. dual_buff: prepare dual input/output buffers to accelerate forward; that is, while the NPU is running a forward pass, we do not wait and instead prepare the next input buffer. If you need every forward call to return the result for the input just passed in, set this argument to false. Default true, for speed. |
throw | If the model argument is not empty and loading fails, throws err::Exception. |
static | False |
C++ defination code:
YOLOWorld(const string &model = "", const string &text_feature = "", const string &labels = "", bool dual_buff = true)
load
def load(self, model: str, text_feature: str, labels: str) -> maix.err.Err
Load model from file
item | description |
---|---|
type | func |
param | model: path of the model to load text_feature: class text feature bin file path. labels: class labels or labels file path. As a string of class labels: labels separated by commas, e.g. "person, car, cat". As a file path: a labels file in which each line is a label. |
return | err::Err |
static | False |
C++ defination code:
err::Err load(const string &model, const string &text_feature, const string &labels)
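The two accepted forms of `labels` (a comma-separated string, or a file with one label per line) can be sketched as a single parser. This is a plain-Python illustration only; `parse_labels` is a hypothetical helper, and deciding between the two forms by checking whether the string names an existing file is an assumption.

```python
import os


def parse_labels(labels):
    """Return a list of labels from a comma-separated string or a labels file."""
    if os.path.isfile(labels):
        # File form: one label per line, blank lines ignored.
        with open(labels, encoding="utf-8") as f:
            return [line.strip() for line in f if line.strip()]
    # String form: labels separated by commas, surrounding spaces ignored.
    return [item.strip() for item in labels.split(",") if item.strip()]


print(parse_labels("person, car, cat"))  # ['person', 'car', 'cat']
```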
learn_text_feature
def learn_text_feature(model: str, labels: list[str], feature_path: str, labels_path: str) -> maix.err.Err
Set detector class labels dynamically; generates the class text feature and saves it to the text_feature path set in the load method or constructor.
item | description |
---|---|
type | func |
param | labels: class labels you want to recognize, list type. e.g. ["person", "car", "cat"] |
return | err::Err |
static | True |
C++ defination code:
static err::Err learn_text_feature(const std::string &model, std::vector<std::string> labels, const std::string &feature_path, const std::string &labels_path)
detect
def detect(self, img: maix.image.Image, conf_th: float = 0.5, iou_th: float = 0.45, fit: maix.image.Fit = ..., sort: int = 0) -> Objects
Detect objects from image
item | description |
---|---|
type | func |
param | img: image to detect; if its size does not match the model input size, it is resized automatically using the fit method. conf_th: confidence threshold, default 0.5. iou_th: IoU threshold, default 0.45. fit: resize method, default image.Fit.FIT_CONTAIN. sort: sort results by object size; default 0 means no sorting, 1 puts bigger objects first, -1 puts smaller objects first. |
throw | If the image format does not match the model input format, throws err::Exception. |
return | Object list. In C++, you should delete it after use. |
static | False |
C++ defination code:
nn::Objects *detect(image::Image &img, float conf_th = 0.5, float iou_th = 0.45, maix::image::Fit fit = maix::image::FIT_CONTAIN, int sort = 0)
input_size
def input_size(self) -> maix.image.Size
Get model input size
item | description |
---|---|
type | func |
return | model input size |
static | False |
C++ defination code:
image::Size input_size()
input_width
def input_width(self) -> int
Get model input width
item | description |
---|---|
type | func |
return | model input size of width |
static | False |
C++ defination code:
int input_width()
input_height
def input_height(self) -> int
Get model input height
item | description |
---|---|
type | func |
return | model input size of height |
static | False |
C++ defination code:
int input_height()
input_format
def input_format(self) -> maix.image.Format
Get input image format
item | description |
---|---|
type | func |
return | input image format, image::Format type. |
static | False |
C++ defination code:
image::Format input_format()
labels
Labels list
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<string> labels
mean
Get mean value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> mean
scale
Get scale value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> scale
YOLOv5
YOLOv5 class
C++ defination code:
class YOLOv5
__init__
def __init__(self, model: str = '', dual_buff: bool = True) -> None
Constructor of YOLOv5 class
item | description |
---|---|
type | func |
param | model: model path, default empty; you can load the model later with the load function. dual_buff: prepare dual input/output buffers to accelerate forward; that is, while the NPU is running a forward pass, we do not wait and instead prepare the next input buffer. If you need every forward call to return the result for the input just passed in, set this argument to false. Default true, for speed. |
throw | If the model argument is not empty and loading fails, throws err::Exception. |
static | False |
C++ defination code:
YOLOv5(const string &model = "", bool dual_buff = true)
load
def load(self, model: str) -> maix.err.Err
Load model from file
item | description |
---|---|
type | func |
param | model: path of the model to load |
return | err::Err |
static | False |
C++ defination code:
err::Err load(const string &model)
detect
def detect(self, img: maix.image.Image, conf_th: float = 0.5, iou_th: float = 0.45, fit: maix.image.Fit = ..., sort: int = 0) -> list[Object]
Detect objects from image
item | description |
---|---|
type | func |
param | img: image to detect; if its size does not match the model input size, it is resized automatically using the fit method. conf_th: confidence threshold, default 0.5. iou_th: IoU threshold, default 0.45. fit: resize method, default image.Fit.FIT_CONTAIN. sort: sort results by object size; default 0 means no sorting, 1 puts bigger objects first, -1 puts smaller objects first. |
throw | If the image format does not match the model input format, throws err::Exception. |
return | Object list. In C++, you should delete it after use. |
static | False |
C++ defination code:
std::vector<nn::Object> *detect(image::Image &img, float conf_th = 0.5, float iou_th = 0.45, maix::image::Fit fit = maix::image::FIT_CONTAIN, int sort = 0)
input_size
def input_size(self) -> maix.image.Size
Get model input size
item | description |
---|---|
type | func |
return | model input size |
static | False |
C++ defination code:
image::Size input_size()
input_width
def input_width(self) -> int
Get model input width
item | description |
---|---|
type | func |
return | model input size of width |
static | False |
C++ defination code:
int input_width()
input_height
def input_height(self) -> int
Get model input height
item | description |
---|---|
type | func |
return | model input size of height |
static | False |
C++ defination code:
int input_height()
input_format
def input_format(self) -> maix.image.Format
Get input image format
item | description |
---|---|
type | func |
return | input image format, image::Format type. |
static | False |
C++ defination code:
image::Format input_format()
labels
Labels list
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<string> labels
label_path
Label file path
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::string label_path
mean
Get mean value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> mean
scale
Get scale value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> scale
anchors
Get anchors
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> anchors
Classifier
Classifier
C++ defination code:
class Classifier
__init__
def __init__(self, model: str = '', dual_buff: bool = True) -> None
Construct a new Classifier object
item | description |
---|---|
type | func |
param | model: MUD model path; if empty, no model is loaded and you can call load() later. If not empty, the model is loaded and err::Exception is raised if loading fails. dual_buff: prepare dual input/output buffers to accelerate forward; that is, while the NPU is running a forward pass, we do not wait and instead prepare the next input buffer. If you need every forward call to return the result for the input just passed in, set this argument to false. Default true, for speed. |
static | False |
C++ defination code:
Classifier(const string &model = "", bool dual_buff = true)
load
def load(self, model: str) -> maix.err.Err
Load model from file, model format is .mud.
MUD file should contain an [extra] section with these key-values:
- model_type: classifier
- input_type: rgb or bgr
- mean: 123.675, 116.28, 103.53
- scale: 0.017124753831663668, 0.01750700280112045, 0.017429193899782137
- labels: imagenet_classes.txt
item | description |
---|---|
type | func |
param | model: MUD model path |
return | error code, if load failed, return error code |
static | False |
C++ defination code:
err::Err load(const string &model)
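The `mean` and `scale` values listed for the MUD [extra] section are per-channel preprocessing constants. As a rough sketch, assuming the common convention `(pixel - mean) * scale` (the formula is an assumption; this page does not state it), they would be applied like this. `normalize_pixel` is a hypothetical helper.

```python
# Example values taken from the MUD description above.
MEAN = (123.675, 116.28, 103.53)
SCALE = (0.017124753831663668, 0.01750700280112045, 0.017429193899782137)


def normalize_pixel(rgb, mean=MEAN, scale=SCALE):
    """Normalize one pixel per channel, assuming (value - mean) * scale."""
    return tuple((c - m) * s for c, m, s in zip(rgb, mean, scale))


# A pixel equal to the mean maps to zero on every channel.
print(normalize_pixel(MEAN))  # (0.0, 0.0, 0.0)
```

Under that convention each scale value acts as the reciprocal of a per-channel standard deviation, for example 1 / 0.017124753831663668 is approximately 58.395.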
classify
def classify(self, img: maix.image.Image, softmax: bool = True, fit: maix.image.Fit = ...) -> list[tuple[int, float]]
Forward image to model, get result. Only for image input, use classify_raw for tensor input.
item | description |
---|---|
type | func |
param | img: image; its format should match the model input_type, otherwise err.Exception is raised softmax: if true, softmax is applied to the result; otherwise raw values are returned fit: image resize fit mode, default Fit.FIT_COVER, see image.Fit. |
throw | If an error occurs, raises err::Exception; you can find the reason in the log. Mostly caused by argument errors or hardware errors. |
return | result, a list of (label, score). In dual_buff mode, the result may be a one-element list with a score of zero when the output is not ready. In C++, you need to delete it after use. |
static | False |
C++ defination code:
std::vector<std::pair<int, float>> *classify(image::Image &img, bool softmax = true, image::Fit fit = image::FIT_COVER)
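What the `softmax` flag changes can be shown on a few raw scores. The sketch below is plain Python and does not call the library; `softmax` and `top_k` are hypothetical helpers, and ordering the (label, score) pairs by descending score is an assumption about the result layout.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(raw_scores, k=1, use_softmax=True):
    """Return the k best (label_index, score) pairs, best first."""
    scores = softmax(raw_scores) if use_softmax else list(raw_scores)
    ranked = sorted(enumerate(scores), key=lambda item: item[1], reverse=True)
    return ranked[:k]


print(top_k([1.0, 3.0, 2.0], k=1, use_softmax=False))  # [(1, 3.0)]
```

With softmax enabled the ranking is unchanged, but the scores become probabilities that sum to 1, which makes a fixed confidence threshold easier to choose.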
classify_raw
def classify_raw(self, data: maix.tensor.Tensor, softmax: bool = True) -> list[tuple[int, float]]
Forward tensor data to model, get result
item | description |
---|---|
type | func |
param | data: tensor data; its format should match the model input_type, otherwise err.Exception is raised softmax: if true, softmax is applied to the result; otherwise raw values are returned |
throw | If an error occurs, raises err::Exception; you can find the reason in the log. Mostly caused by argument errors or hardware errors. |
return | result, a list of (label, score). In C++, you need to delete it after use. |
static | False |
C++ defination code:
std::vector<std::pair<int, float>> *classify_raw(tensor::Tensor &data, bool softmax = true)
input_size
def input_size(self) -> maix.image.Size
Get model input size, only for image input
item | description |
---|---|
type | func |
return | model input size |
static | False |
C++ defination code:
image::Size input_size()
input_width
def input_width(self) -> int
Get model input width, only for image input
item | description |
---|---|
type | func |
return | model input size of width |
static | False |
C++ defination code:
int input_width()
input_height
def input_height(self) -> int
Get model input height, only for image input
item | description |
---|---|
type | func |
return | model input size of height |
static | False |
C++ defination code:
int input_height()
input_format
def input_format(self) -> maix.image.Format
Get input image format, only for image input
item | description |
---|---|
type | func |
return | input image format, image::Format type. |
static | False |
C++ defination code:
image::Format input_format()
input_shape
def input_shape(self) -> list[int]
Get input shape, if have multiple input, only return first input shape
item | description |
---|---|
type | func |
return | input shape, list type |
static | False |
C++ defination code:
std::vector<int> input_shape()
labels
Labels list
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<string> labels
label_path
Label file path
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::string label_path
mean
Get mean value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> mean
scale
Get scale value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> scale
PP_OCR
PP_OCR class
C++ defination code:
class PP_OCR
__init__
def __init__(self, model: str = '') -> None
Constructor of PP_OCR class
item | description |
---|---|
type | func |
param | model: model path, default empty, you can load model later by load function. |
throw | If model arg is not empty and load failed, will throw err::Exception. |
static | False |
C++ defination code:
PP_OCR(const string &model = "")
load
def load(self, model: str) -> maix.err.Err
Load model from file
item | description |
---|---|
type | func |
param | model: path of the model to load |
return | err::Err |
static | False |
C++ defination code:
err::Err load(const string &model)
detect
def detect(self, img: maix.image.Image, thresh: float = 0.3, box_thresh: float = 0.6, fit: maix.image.Fit = ..., char_box: bool = False) -> ...
Detect objects from image
item | description |
---|---|
type | func |
param | img: image to detect; if its size does not match the model input size, it is resized automatically using the fit method. thresh: confidence threshold for pixels that contain characters, default 0.3. box_thresh: box threshold; boxes with a probability higher than this value are valid, default 0.6. fit: resize method, default image.Fit.FIT_CONTAIN. char_box: calculate every character's box, default false; if true, you can get each character's box from nn.OCR_Object's char_boxes attribute. |
throw | If the image format does not match the model input format, or memory runs out, throws err::Exception. |
return | nn.OCR_Objects type. In C++, you should delete it after use. |
static | False |
C++ defination code:
nn::OCR_Objects *detect(image::Image &img, float thresh = 0.3, float box_thresh = 0.6, maix::image::Fit fit = maix::image::FIT_CONTAIN, bool char_box = false)
recognize
def recognize(self, img: maix.image.Image, box_points: list[int] = []) -> ...
Only recognize, not detect
item | description |
---|---|
type | func |
param | img: image in which to recognize characters; it can be a standard cropped character image. If the crop is not standard, you can use box_points to assign where the characters' 4 corners are. box_points: list type, length must be 8 or 0; default empty means the image is not transformed to a standard image. Positions of the 4 points, format: [x1, y1, x2, y2, x3, y3, x4, y4], point 1 at the left-top, point 2 right-top... char_box: calculate every character's box, default false; if true, you can get each character's box from nn.OCR_Object's char_boxes attribute. |
static | False |
C++ defination code:
nn::OCR_Object *recognize(image::Image &img, const std::vector<int> &box_points = std::vector<int>())
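The `box_points` layout (a flat list of 8 values, or empty) can be checked and unpacked before calling `recognize`. This is a minimal plain-Python sketch; `corners_from_box_points` is a hypothetical helper, not part of the module.

```python
def corners_from_box_points(box_points):
    """Turn [x1, y1, x2, y2, x3, y3, x4, y4] into four (x, y) corners.

    An empty list means "no box given" and yields no corners.
    """
    if len(box_points) not in (0, 8):
        raise ValueError("box_points must have 0 or 8 values")
    return list(zip(box_points[0::2], box_points[1::2]))


print(corners_from_box_points([0, 0, 10, 0, 10, 5, 0, 5]))
# [(0, 0), (10, 0), (10, 5), (0, 5)]
```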
draw_seg_mask
def draw_seg_mask(self, img: maix.image.Image, x: int, y: int, seg_mask: maix.image.Image, threshold: int = 127) -> None
Draw segmentation on image
item | description |
---|---|
type | func |
param | img: image object, maix.image.Image type. seg_mask: segmentation mask image returned by the detect method, a grayscale image threshold: only mask values > threshold are drawn on the image, value from 0 to 255. |
static | False |
C++ defination code:
void draw_seg_mask(image::Image &img, int x, int y, image::Image &seg_mask, int threshold = 127)
input_size
def input_size(self) -> maix.image.Size
Get model input size
item | description |
---|---|
type | func |
return | model input size |
static | False |
C++ defination code:
image::Size input_size()
input_width
def input_width(self) -> int
Get model input width
item | description |
---|---|
type | func |
return | model input size of width |
static | False |
C++ defination code:
int input_width()
input_height
def input_height(self) -> int
Get model input height
item | description |
---|---|
type | func |
return | model input size of height |
static | False |
C++ defination code:
int input_height()
input_format
def input_format(self) -> maix.image.Format
Get input image format
item | description |
---|---|
type | func |
return | input image format, image::Format type. |
static | False |
C++ defination code:
image::Format input_format()
mean
Get mean value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> mean
scale
Get scale value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> scale
rec_mean
Get mean value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> rec_mean
rec_scale
Get scale value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> rec_scale
labels
labels (characters)
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<std::string> labels
det
Whether the model includes a detection model
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
bool det
rec
Whether the model includes a recognition model
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
bool rec
YOLOv8
YOLOv8 class
C++ defination code:
class YOLOv8 : public YOLO11
__init__
def __init__(self, model: str = '', dual_buff: bool = True) -> None
Constructor of YOLOv8 class
item | description |
---|---|
type | func |
param | model: model path, default empty; you can load the model later with the load function. dual_buff: prepare dual input/output buffers to accelerate forward; that is, while the NPU is running a forward pass, we do not wait and instead prepare the next input buffer. If you need every forward call to return the result for the input just passed in, set this argument to false. Default true, for speed. |
throw | If the model argument is not empty and loading fails, throws err::Exception. |
static | False |
C++ defination code:
YOLOv8(const string &model = "", bool dual_buff = true)
load
def load(self, model: str) -> maix.err.Err
Load model from file
item | description |
---|---|
type | func |
param | model: path of the model to load |
return | err::Err |
static | False |
C++ defination code:
err::Err load(const string &model)
detect
def detect(self, img: maix.image.Image, conf_th: float = 0.5, iou_th: float = 0.45, fit: maix.image.Fit = ..., keypoint_th: float = 0.5, sort: int = 0) -> Objects
Detect objects from image
item | description |
---|---|
type | func |
param | img: image to detect; if its size does not match the model input size, it is resized automatically using the fit method. conf_th: confidence threshold, default 0.5. iou_th: IoU threshold, default 0.45. fit: resize method, default image.Fit.FIT_CONTAIN. keypoint_th: keypoint threshold, default 0.5, only for yolov8-pose models. sort: sort results by object size; default 0 means no sorting, 1 puts bigger objects first, -1 puts smaller objects first. |
throw | If the image format does not match the model input format, throws err::Exception. |
return | Object list. In C++, you should delete it after use. For a yolov8-pose model, each object's points are filled in; a point value < 0 means that point is invalid (conf < keypoint_th). |
static | False |
C++ defination code:
nn::Objects *detect(image::Image &img, float conf_th = 0.5, float iou_th = 0.45, maix::image::Fit fit = maix::image::FIT_CONTAIN, float keypoint_th = 0.5, int sort = 0)
input_size
def input_size(self) -> maix.image.Size
Get model input size
item | description |
---|---|
type | func |
return | model input size |
static | False |
C++ defination code:
image::Size input_size()
input_width
def input_width(self) -> int
Get model input width
item | description |
---|---|
type | func |
return | model input size of width |
static | False |
C++ defination code:
int input_width()
input_height
def input_height(self) -> int
Get model input height
item | description |
---|---|
type | func |
return | model input size of height |
static | False |
C++ defination code:
int input_height()
input_format
def input_format(self) -> maix.image.Format
Get input image format
item | description |
---|---|
type | func |
return | input image format, image::Format type. |
static | False |
C++ defination code:
image::Format input_format()
draw_pose
def draw_pose(self, img: maix.image.Image, points: list[int], radius: int = 4, color: maix.image.Color = ..., colors: list[maix.image.Color] = [], body: bool = True, close: bool = False) -> None
Draw pose keypoints on image
item | description |
---|---|
type | func |
param | img: image object, maix.image.Image type. points: keypoints, int list type, [x, y, x, y ...] radius: radius of points. color: color of points. colors: per-point colors, list type, each element is an image.Color object. body: default true; if the length of points is 17 * 2 and body is true, lines are drawn to form a human body; if set to false, no lines are drawn. close: connect all points to close a polygon, default false. |
static | False |
C++ defination code:
void draw_pose(image::Image &img, std::vector<int> points, int radius = 4, image::Color color = image::COLOR_RED, const std::vector<image::Color> &colors = std::vector<image::Color>(), bool body = true, bool close = false)
draw_seg_mask
def draw_seg_mask(self, img: maix.image.Image, x: int, y: int, seg_mask: maix.image.Image, threshold: int = 127) -> None
Draw segmentation on image
item | description |
---|---|
type | func |
param | img: image object, maix.image.Image type. seg_mask: segmentation mask image returned by the detect method, a grayscale image threshold: only mask values > threshold are drawn on the image, value from 0 to 255. |
static | False |
C++ defination code:
void draw_seg_mask(image::Image &img, int x, int y, image::Image &seg_mask, int threshold = 127)
labels
Labels list
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<string> labels
label_path
Label file path
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::string label_path
mean
Get mean value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> mean
scale
Get scale value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> scale
MeloTTS
MeloTTS class
C++ defination code:
class MeloTTS
__init__
def __init__(self, model: str = '', language: str = 'zh', speed: float = 0.800000011920929, noise_scale: float = 0.30000001192092896, noise_scale_w: float = 0.6000000238418579, sdp_ratio: float = 0.20000000298023224) -> None
Constructor of MeloTTS class
item | description |
---|---|
type | func |
param | model: model path, default empty; you can load the model later with the load function. language: language code, default "zh"; supported language codes: "zh". speed: controls the speech rate of the audio; lower values result in slower reading. Default is 0.8. noise_scale: controls the randomness in speech; increasing the value results in more varied and less deterministic speech output. Default is 0.3. noise_scale_w: controls the randomness in speech alignment; a higher value can enhance naturalness, but overly high values may introduce instability or distortion in the audio. Default is 0.6. sdp_ratio: the alignment weight; the higher it is, the more natural the speech sounds, but excessive values may result in instability. Default is 0.2. |
throw | If the model argument is not empty and loading fails, throws err::Exception. |
static | False |
C++ defination code:
MeloTTS(const string &model = "", std::string language = "zh", double speed = 0.8f, double noise_scale = 0.3f, double noise_scale_w = 0.6f, double sdp_ratio = 0.2f)
load
def load(self, model: str) -> maix.err.Err
Load model from file
item | description |
---|---|
type | func |
param | model: path of the model to load |
return | err::Err |
static | False |
C++ defination code:
err::Err load(const string &model)
forward
def forward(*args, **kwargs)
Text to speech
item | description |
---|---|
type | func |
param | text: input text. path: output path of the voice file; the default sampling rate is 44100, the number of channels is 1, and the sample depth is 16 bits. Default is empty. output_pcm: enable or disable the output of raw PCM data; the default output sampling rate is 44100, the number of channels is 1, and the sample depth is 16 bits. Default is false. |
return | raw PCM data |
static | False |
C++ defination code:
Bytes *forward(std::string text, std::string path = "", bool output_pcm = false)
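As a rough illustration of the PCM format described above (44100 Hz, 1 channel, 16 bits), the sketch below wraps raw PCM bytes in a WAV container using only Python's standard library. The helper names `pcm_to_wav_bytes` and `pcm_duration_s` and the synthetic tone are hypothetical stand-ins; on a device the bytes would presumably come from `forward` with `output_pcm` enabled.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44100   # documented default sampling rate
CHANNELS = 1          # documented channel count
SAMPLE_WIDTH = 2      # 16-bit samples, i.e. 2 bytes each


def pcm_to_wav_bytes(pcm: bytes) -> bytes:
    """Wrap raw 16-bit mono PCM (hypothetical helper) in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)
    return buf.getvalue()


def pcm_duration_s(pcm: bytes) -> float:
    """Duration in seconds implied by the byte count and the format above."""
    return len(pcm) / (SAMPLE_RATE * CHANNELS * SAMPLE_WIDTH)


# Synthetic stand-in for TTS output: 0.5 s of a 440 Hz tone.
num_samples = SAMPLE_RATE // 2
pcm = b"".join(
    struct.pack("<h", int(12000 * math.sin(2 * math.pi * 440 * i / SAMPLE_RATE)))
    for i in range(num_samples)
)
wav_bytes = pcm_to_wav_bytes(pcm)
```

The byte count alone is enough to estimate playback length, which can be handy when scheduling audio output.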
samplerate
def samplerate(self) -> int
Get pcm samplerate
item | description |
---|---|
type | func |
return | pcm samplerate |
static | False |
C++ defination code:
int samplerate()
speed
def speed(self) -> float
Get the speech speed (speech rate)
item | description |
---|---|
type | func |
return | speech speed |
static | False |
C++ defination code:
double speed()
OCR_Box
Object for OCR detect box
C++ defination code:
class OCR_Box
__init__
def __init__(self, x1: int = 0, y1: int = 0, x2: int = 0, y2: int = 0, x3: int = 0, y3: int = 0, x4: int = 0, y4: int = 0) -> None
OCR_Box constructor
item | description |
---|---|
type | func |
static | False |
C++ defination code:
OCR_Box(int x1 = 0, int y1 = 0, int x2 = 0, int y2 = 0, int x3 = 0, int y3 = 0, int x4 = 0, int y4 = 0)
x1
x coordinate of the left-top point of the box
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
int x1
y1
y coordinate of the left-top point of the box
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
int y1
x2
x coordinate of the right-top point of the box
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
int x2
y2
y coordinate of the right-top point of the box
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
int y2
x3
x coordinate of the right-bottom point of the box
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
int x3
y3
y coordinate of the right-bottom point of the box
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
int y3
x4
x coordinate of the left-bottom point of the box
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
int x4
y4
y coordinate of the left-bottom point of the box
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
int y4
to_list
def to_list(self) -> list[int]
Convert the box points to a list.
item | description |
---|---|
type | func |
return | list type, element is int type, value [x1, y1, x2, y2, x3, y3, x4, y4]. |
static | False |
C++ defination code:
std::vector<int> to_list()
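The flat `[x1, y1, x2, y2, x3, y3, x4, y4]` layout returned by `to_list` is easy to regroup. The following is a minimal sketch in plain Python; the helper names `to_points` and `bounding_rect` and the sample coordinates are hypothetical and not part of `maix.nn`.

```python
from typing import List, Tuple


def to_points(flat: List[int]) -> List[Tuple[int, int]]:
    """Group a flat 8-element list into four (x, y) corner points."""
    if len(flat) != 8:
        raise ValueError("expected 8 values: x1, y1, ..., x4, y4")
    return [(flat[i], flat[i + 1]) for i in range(0, 8, 2)]


def bounding_rect(flat: List[int]) -> Tuple[int, int, int, int]:
    """Axis-aligned (x, y, w, h) rectangle that encloses the four corners."""
    pts = to_points(flat)
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)


# Sample values standing in for the result of OCR_Box.to_list().
box_list = [10, 20, 110, 25, 108, 60, 8, 55]
corners = to_points(box_list)
rect = bounding_rect(box_list)
```

An enclosing rectangle like this is useful when a drawing or cropping routine only accepts axis-aligned regions.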
OCR_Object
Object for OCR detect result
C++ defination code:
class OCR_Object
__init__
def __init__(self, box: OCR_Box, idx_list: list[int], char_list: list[str], score: float = 0, char_pos: list[int] = []) -> None
Constructor of Object for OCR detect result
item | description |
---|---|
type | func |
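param | box: 4-point box. idx_list: chars' index list. char_list: characters list. score: score. char_pos: chars' position relative to left. |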
static | False |
C++ defination code:
OCR_Object(const nn::OCR_Box &box, const std::vector<int> &idx_list, const std::vector<std::string> &char_list, float score = 0, const std::vector<int> &char_pos = std::vector<int>())
box
OCR_Object box, 4 points box, first point at the left-top, clock-wise.
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
nn::OCR_Box box
score
Object score
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
float score
idx_list
Chars' index list; elements are int type.
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<int> idx_list
char_pos
Chars' position relative to left
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<int> char_pos
char_str
def char_str(self) -> str
Get OCR_Object's characters, return a string type.
item | description |
---|---|
type | func |
return | All characters in string type. |
static | False |
C++ defination code:
const std::string &char_str()
char_list
def char_list(self) -> list[str]
Get OCR_Object's characters, return a list type.
item | description |
---|---|
type | func |
return | All characters in list type. |
static | False |
C++ defination code:
const std::vector<std::string> &char_list()
update_chars
def update_chars(self, char_list: list[str]) -> None
Set OCR_Object's characters
item | description |
---|---|
type | func |
param | char_list: All characters in list type. |
static | False |
C++ defination code:
void update_chars(const std::vector<std::string> &char_list)
__str__
def __str__(self) -> str
OCR_Object info to string
item | description |
---|---|
type | func |
return | OCR_Object info string |
static | False |
C++ defination code:
std::string to_str()
OCR_Objects
OCR_Objects Class for detect result
C++ defination code:
class OCR_Objects
__init__
def __init__(self) -> None
Constructor of OCR_Objects class
item | description |
---|---|
type | func |
static | False |
C++ defination code:
OCR_Objects()
add
def add(self, box: OCR_Box, idx_list: list[int], char_list: list[str], score: float = 0, char_pos: list[int] = []) -> OCR_Object
Add object to objects
item | description |
---|---|
type | func |
throw | Throw exception if no memory |
static | False |
C++ defination code:
nn::OCR_Object &add(const nn::OCR_Box &box, const std::vector<int> &idx_list, const std::vector<std::string> &char_list, float score = 0, const std::vector<int> &char_pos = std::vector<int>())
remove
def remove(self, idx: int) -> maix.err.Err
Remove object from objects
item | description |
---|---|
type | func |
static | False |
C++ defination code:
err::Err remove(int idx)
at
def at(self, idx: int) -> OCR_Object
Get object item
item | description |
---|---|
type | func |
static | False |
C++ defination code:
nn::OCR_Object &at(int idx)
__getitem__
def __getitem__(self, idx: int) -> OCR_Object
Get object item
item | description |
---|---|
type | func |
static | False |
C++ defination code:
nn::OCR_Object &operator[](int idx)
__len__
def __len__(self) -> int
Get size
item | description |
---|---|
type | func |
static | False |
C++ defination code:
size_t size()
__iter__
def __iter__(self) -> typing.Iterator
Iterator begin; allows iterating over the objects
item | description |
---|---|
type | func |
static | False |
C++ defination code:
std::vector<OCR_Object*>::iterator begin()
MUD
MUD(model universal describe file) class
C++ defination code:
class MUD
__init__
def __init__(self, model_path: str = '') -> None
MUD constructor
item | description |
---|---|
type | func |
param | model_path: direction [in], model file path; the model format can be a MUD (model universal describe file) file. If model_path is set, the model is loaded from the file, and a load failure raises err.Exception. If model_path is not set, you can load the model later with the load function. |
static | False |
C++ defination code:
MUD(const std::string &model_path = "")
load
def load(self, model_path: str) -> maix.err.Err
Load model from file
item | description |
---|---|
type | func |
param | model_path: direction [in], model file path, model format can be MUD(model universal describe file) file. |
return | error code, if load success, return err::ERR_NONE |
static | False |
C++ defination code:
err::Err load(const std::string &model_path)
type
Model type, string type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::string type
items
Model config items, different model type has different config items
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::map<std::string, std::map<std::string, std::string>> items
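The nested map type of `items` corresponds to a dictionary of sections, each holding string keys and string values. The sketch below builds such a structure with Python's `configparser`, assuming the MUD file is INI-style text; the sample file content, section names, and the `parse_mud_text` helper are illustrative assumptions, not taken from this page.

```python
import configparser

# Hypothetical MUD-like content; real files may use other keys.
MUD_TEXT = """
[basic]
type = cvimodel
model = example.cvimodel

[extra]
model_type = classifier
labels = person, car
"""


def parse_mud_text(text: str) -> dict:
    """Parse INI-style text into {section: {key: value}} with string values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


items = parse_mud_text(MUD_TEXT)
```

All values stay strings, matching the `std::map<std::string, std::map<std::string, std::string>>` type shown above, so numeric settings need explicit conversion by the caller.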
model_path
Model path
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::string model_path
parse_labels
def parse_labels(self, key: str = 'labels') -> list[str]
Please load() first. Parses labels in items["extra"]["labels"]: if items["extra"]["labels"] is a file path, the file is parsed, with one label per line; if items["extra"]["labels"] is a string, labels are split by comma (","). Executing this method will replace items["extra"]["labels"].
item | description |
---|---|
type | func |
param | key: parse from items[key], default "labels". |
return | parsed labels list. |
static | False |
C++ defination code:
std::vector<std::string> parse_labels(const std::string key = "labels")
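The two parsing rules described for `parse_labels` (file path: one label per line; plain string: comma-separated) can be sketched in plain Python as follows. This is only an illustration of the documented rule, not the library's implementation; the `parse_labels_value` helper, the whitespace stripping, and the skipping of empty entries are assumptions.

```python
import os
import tempfile
from typing import List


def parse_labels_value(value: str) -> List[str]:
    """Return labels from a file (one per line) or a comma-separated string."""
    if os.path.isfile(value):
        with open(value, encoding="utf-8") as f:
            # Assumption: blank lines are ignored and whitespace is stripped.
            return [line.strip() for line in f if line.strip()]
    return [part.strip() for part in value.split(",") if part.strip()]


# Case 1: the value is a path to a label file.
with tempfile.TemporaryDirectory() as tmp:
    label_file = os.path.join(tmp, "labels.txt")
    with open(label_file, "w", encoding="utf-8") as f:
        f.write("person\ncar\n")
    labels_from_file = parse_labels_value(label_file)

# Case 2: the value is a comma-separated string.
labels_from_string = parse_labels_value("cat, dog,bird")
```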
LayerInfo
NN model layer info
C++ defination code:
class LayerInfo
__init__
def __init__(self, name: str = '', dtype: maix.tensor.DType = ..., shape: list[int] = []) -> None
LayerInfo constructor
item | description |
---|---|
type | func |
param | name: direction [in], layer name dtype: direction [in], layer data type shape: direction [in], layer shape |
static | False |
C++ defination code:
LayerInfo(const std::string &name = "", tensor::DType dtype = tensor::DType::FLOAT32, std::vector<int> shape = std::vector<int>())
name
Layer name
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::string name
dtype
Layer data type
item | description |
---|---|
type | var |
attention | If the model is quantized, this is the real quantized data type, such as int8 or float16; in most cases, the inputs and outputs we actually use in APIs like forward are float32. |
static | False |
readonly | False |
C++ defination code:
tensor::DType dtype
shape
Layer shape
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<int> shape
layout
Layer layout. If the layout is CHW, the shape is typically [N, C, H, W]; if it is HWC, the shape is [N, H, W, C].
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
nn::Layout layout = nn::Layout::UNKNOWN
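To make the difference between the two layouts concrete, the sketch below computes where one element lands in a flat, row-major buffer under each layout. The helper names are hypothetical, and row-major storage is an assumption about how the tensor data is laid out in memory.

```python
from typing import List


def nchw_to_nhwc_shape(shape: List[int]) -> List[int]:
    """Reorder an [N, C, H, W] shape into [N, H, W, C]."""
    n, c, h, w = shape
    return [n, h, w, c]


def flat_index_nchw(shape: List[int], n: int, c: int, h: int, w: int) -> int:
    """Row-major offset of element (n, c, h, w) in an [N, C, H, W] buffer."""
    _, C, H, W = shape
    return ((n * C + c) * H + h) * W + w


def flat_index_nhwc(shape: List[int], n: int, h: int, w: int, c: int) -> int:
    """Row-major offset of element (n, h, w, c) in an [N, H, W, C] buffer."""
    _, H, W, C = shape
    return ((n * H + h) * W + w) * C + c


nchw_shape = [1, 3, 2, 2]
nhwc_shape = nchw_to_nhwc_shape(nchw_shape)

# The same logical element: channel 1, row 0, column 1 of image 0.
offset_nchw = flat_index_nchw(nchw_shape, n=0, c=1, h=0, w=1)
offset_nhwc = flat_index_nhwc(nhwc_shape, n=0, h=0, w=1, c=1)
```

In NCHW each channel plane is contiguous, while in NHWC the channel values of one pixel sit next to each other, which is why the two offsets differ.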
shape_int
def shape_int(self) -> int
Shape as one int type, multiply all dims of shape
item | description |
---|---|
type | func |
static | False |
C++ defination code:
int shape_int()
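Multiplying all dimensions, as `shape_int` is described to do, gives the element count of the layer. A minimal plain-Python equivalent is shown below; the `buffer_bytes` helper and the 4-byte element size (float32) are illustrative assumptions, and the behavior of the real method for an empty shape is not documented here.

```python
import math
from typing import List


def shape_product(shape: List[int]) -> int:
    """Multiply all dims of a shape, e.g. [1, 3, 224, 224] -> 150528."""
    return math.prod(shape)


def buffer_bytes(shape: List[int], bytes_per_element: int) -> int:
    """Bytes needed to hold the layer data for a given element size."""
    return shape_product(shape) * bytes_per_element


elements = shape_product([1, 3, 224, 224])
float32_bytes = buffer_bytes([1, 3, 224, 224], 4)
```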
to_str
def to_str(self) -> str
To string
item | description |
---|---|
type | func |
static | False |
C++ defination code:
std::string to_str()
__str__
def __str__(self) -> str
To string
item | description |
---|---|
type | func |
static | False |
C++ defination code:
std::string __str__()
NN
Neural network class
C++ defination code:
class NN
__init__
def __init__(self, model: str = '', dual_buff: bool = False) -> None
Neural network constructor
item | description |
---|---|
type | func |
param | model: direction [in], model file path; the model format can be a MUD (model universal describe file) file. If model is set, the model is loaded from the file, and a load failure raises err.Exception. If model is not set, you can load the model later with the load function. dual_buff: direction [in], prepare dual input/output buffers to accelerate forward; that is, while the NPU is running forward we do not wait and instead prepare the next input buffer. If you want every forward call to output the result of its own input, set this arg to false. Default false for ease of use. |
static | False |
C++ defination code:
NN(const std::string &model = "", bool dual_buff = false)
load
def load(self, model: str) -> maix.err.Err
Load model from file
item | description |
---|---|
type | func |
param | model: direction [in], model file path, model format can be MUD(model universal describe file) file. |
return | error code, if load success, return err::ERR_NONE |
static | False |
C++ defination code:
err::Err load(const std::string &model)
loaded
def loaded(self) -> bool
Is model loaded
item | description |
---|---|
type | func |
return | true if model loaded, else false |
static | False |
C++ defination code:
bool loaded()
set_dual_buff
def set_dual_buff(self, enable: bool) -> None
Enable dual buff or disable dual buff
item | description |
---|---|
type | func |
param | enable: true to enable, false to disable |
static | False |
C++ defination code:
void set_dual_buff(bool enable)
inputs_info
def inputs_info(self) -> list[LayerInfo]
Get model input layer info
item | description |
---|---|
type | func |
return | input layer info |
static | False |
C++ defination code:
std::vector<nn::LayerInfo> inputs_info()
outputs_info
def outputs_info(self) -> list[LayerInfo]
Get model output layer info
item | description |
---|---|
type | func |
return | output layer info |
static | False |
C++ defination code:
std::vector<nn::LayerInfo> outputs_info()
extra_info
def extra_info(self) -> dict[str, str]
Get model extra info define in MUD file
item | description |
---|---|
type | func |
return | extra info, dict type, key-value object, attention: key and value are all string type. |
static | False |
C++ defination code:
std::map<std::string, std::string> extra_info()
extra_info_labels
def extra_info_labels(self) -> list[str]
Get model parsed extra info labels define in MUD file
item | description |
---|---|
type | func |
return | labels list in extra info, string list type. |
static | False |
C++ defination code:
std::vector<std::string> extra_info_labels()
forward
def forward(self, inputs: maix.tensor.Tensors, copy_result: bool = True, dual_buff_wait: bool = False) -> maix.tensor.Tensors
Forward run model and get the model's output. This is specially for MaixPy; not efficient, but easy to use in MaixPy.
item | description |
---|---|
type | func |
param | inputs: direction [in], input tensors. copy_result: if true, the result is copied to a new variable; otherwise internal memory is used, and the result is only valid until the next forward. Default true to avoid problems; set it to false manually for more speed. dual_buff_wait: bool type, only for dual_buff mode; if true, will run inference on this input and wait for the result. Default false. |
return | output tensors. In C++, you should manually delete the returned Tensors object and the tensors it contains. In dual_buff mode, it can be NULL (None in MaixPy), which means the result is not ready. |
throw | If an error occurs, such as no memory or an argument error, will raise err.Exception. |
static | False |
C++ defination code:
tensor::Tensors *forward(tensor::Tensors &inputs, bool copy_result = true, bool dual_buff_wait = false)
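The `copy_result` trade-off can be pictured with a toy model in plain Python. `ToyRunner` below is a hypothetical stand-in, not the `maix.nn.NN` implementation: it reuses one internal output buffer, so a result obtained without copying changes as soon as the next forward runs.

```python
from typing import List


class ToyRunner:
    """Hypothetical runner that reuses a single internal output buffer."""

    def __init__(self, size: int) -> None:
        self._out = [0.0] * size  # internal memory shared by every forward

    def forward(self, inputs: List[float], copy_result: bool = True) -> List[float]:
        for i, value in enumerate(inputs):
            self._out[i] = value * 2.0  # placeholder for real inference
        # Copy: caller owns the data. No copy: caller sees internal memory.
        return list(self._out) if copy_result else self._out


runner = ToyRunner(3)
kept = runner.forward([1.0, 2.0, 3.0], copy_result=True)
borrowed = runner.forward([1.0, 2.0, 3.0], copy_result=False)

# A later forward overwrites the internal buffer that `borrowed` refers to.
runner.forward([10.0, 20.0, 30.0], copy_result=False)
```

After the last call, `kept` still holds the first result while `borrowed` reflects the newest one, which is the reason the default favors copying.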
forward_image
def forward_image(self, img: maix.image.Image, mean: list[float] = [], scale: list[float] = [], fit: maix.image.Fit = ..., copy_result: bool = True, dual_buff_wait: bool = False, chw: bool = True) -> maix.tensor.Tensors
forward model, param is image
item | description |
---|---|
type | func |
param | img: input image. mean: mean value, a list type, e.g. [0.485, 0.456, 0.406]; default is an empty list, which means no normalization. scale: scale value, a list type, e.g. [1/0.229, 1/0.224, 1/0.225]; default is an empty list, which means no normalization. fit: fit mode; if the input image size is not equal to the model's input size, the image is automatically resized using this fit method. Default is image.Fit.FIT_FILL for easy coordinate calculation, but image.Fit.FIT_CONTAIN gives more accurate results. copy_result: if true, the result is copied to a new variable; otherwise internal memory is used, and the result is only valid until the next forward. Default true to avoid problems; set it to false manually for more speed. dual_buff_wait: bool type, only for dual_buff mode; if true, will run inference on this image and wait for the result. Default false. chw: !!deprecated!! This arg will be ignored! Please set extra.input_layout in the MUD file instead. Originally the channel format: forward the model with an HWC format image input if set to false, default true (CHW). |
return | output tensors. In C++, you should manually delete the returned Tensors object and the tensors it contains. In dual_buff mode, it can be NULL (None in MaixPy), which means the result is not ready. |
throw | If error occurs, like arg error or alloc memory failed, will raise err.Exception. |
static | False |
C++ defination code:
tensor::Tensors *forward_image(image::Image &img, std::vector<float> mean = std::vector<float>(), std::vector<float> scale = std::vector<float>(), image::Fit fit = image::Fit::FIT_FILL, bool copy_result = true, bool dual_buff_wait = false, bool chw = true)
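The `mean` and `scale` lists are per-channel normalization parameters. The sketch below assumes the common convention `(value - mean) * scale`, which is consistent with the reciprocal-style scale examples above but is not stated explicitly on this page; the `normalize_pixel` helper and the sample values are hypothetical.

```python
from typing import List, Sequence


def normalize_pixel(
    rgb: Sequence[float], mean: Sequence[float], scale: Sequence[float]
) -> List[float]:
    """Apply (value - mean) * scale per channel; empty lists mean no change."""
    if not mean or not scale:
        # Documented behavior: an empty list means no normalization.
        return [float(v) for v in rgb]
    return [(v - m) * s for v, m, s in zip(rgb, mean, scale)]


# Map 0..255 pixel values to roughly -1..1 (illustrative parameters).
mean = [127.5, 127.5, 127.5]
scale = [1 / 127.5, 1 / 127.5, 1 / 127.5]

normalized = normalize_pixel([255, 0, 128], mean, scale)
unchanged = normalize_pixel([255, 0, 128], [], [])
```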
FaceDetector
FaceDetector class
C++ defination code:
class FaceDetector
__init__
def __init__(self, model: str = '', dual_buff: bool = True) -> None
Constructor of FaceDetector class
item | description |
---|---|
type | func |
param | model: model path, default empty; you can load the model later with the load function. dual_buff: direction [in], prepare dual input/output buffers to accelerate forward; that is, while the NPU is running forward we do not wait and instead prepare the next input buffer. If you want every forward call to output the result of its own input, set this arg to false. Default true for speed. |
throw | If model arg is not empty and load failed, will throw err::Exception. |
static | False |
C++ defination code:
FaceDetector(const string &model = "", bool dual_buff = true)
load
def load(self, model: str) -> maix.err.Err
Load model from file
item | description |
---|---|
type | func |
param | model: path of the model to load |
return | err::Err |
static | False |
C++ defination code:
err::Err load(const string &model)
detect
def detect(self, img: maix.image.Image, conf_th: float = 0.5, iou_th: float = 0.45, fit: maix.image.Fit = ...) -> list[Object]
Detect objects from image
item | description |
---|---|
type | func |
param | img: Image want to detect, if image's size not match model input's, will auto resize with fit method. conf_th: Confidence threshold, default 0.5. iou_th: IoU threshold, default 0.45. fit: Resize method, default image.Fit.FIT_CONTAIN. |
throw | If image format not match model input format, will throw err::Exception. |
return | Object list. In C++, you should delete it after use. |
static | False |
C++ defination code:
std::vector<nn::Object> *detect(image::Image &img, float conf_th = 0.5, float iou_th = 0.45, maix::image::Fit fit = maix::image::FIT_CONTAIN)
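The roles of `conf_th` and `iou_th` can be illustrated with a generic confidence filter followed by greedy non-maximum suppression. This is a textbook sketch in plain Python, not the detector's actual post-processing; the tuple layout `(x, y, w, h, score)` and the helper names are assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]          # x, y, w, h
Det = Tuple[float, float, float, float, float]   # x, y, w, h, score


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two (x, y, w, h) boxes."""
    iw = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets: List[Det], conf_th: float = 0.5, iou_th: float = 0.45) -> List[Det]:
    """Drop low-score boxes, then keep the best box of each overlapping group."""
    candidates = sorted((d for d in dets if d[4] >= conf_th), key=lambda d: d[4], reverse=True)
    kept: List[Det] = []
    for det in candidates:
        if all(iou(det[:4], k[:4]) <= iou_th for k in kept):
            kept.append(det)
    return kept


detections = [
    (10, 10, 100, 100, 0.9),   # strong detection
    (12, 12, 100, 100, 0.8),   # near-duplicate of the first
    (200, 200, 50, 50, 0.7),   # separate detection
    (0, 0, 20, 20, 0.3),       # below the confidence threshold
]
result = filter_detections(detections)
```

Raising `conf_th` removes weak detections earlier, while lowering `iou_th` suppresses overlapping boxes more aggressively.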
input_size
def input_size(self) -> maix.image.Size
Get model input size
item | description |
---|---|
type | func |
return | model input size |
static | False |
C++ defination code:
image::Size input_size()
input_width
def input_width(self) -> int
Get model input width
item | description |
---|---|
type | func |
return | model input size of width |
static | False |
C++ defination code:
int input_width()
input_height
def input_height(self) -> int
Get model input height
item | description |
---|---|
type | func |
return | model input size of height |
static | False |
C++ defination code:
int input_height()
input_format
def input_format(self) -> maix.image.Format
Get input image format
item | description |
---|---|
type | func |
return | input image format, image::Format type. |
static | False |
C++ defination code:
image::Format input_format()
mean
Get mean value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> mean
scale
Get scale value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> scale
NanoTrack
NanoTrack class
C++ defination code:
class NanoTrack
__init__
def __init__(self, model: str = '') -> None
Constructor of NanoTrack class
item | description |
---|---|
type | func |
param | model: model path, default empty, you can load model later by load function. |
throw | If model arg is not empty and load failed, will throw err::Exception. |
static | False |
C++ defination code:
NanoTrack(const string &model = "")
load
def load(self, model: str) -> maix.err.Err
Load model from file
item | description |
---|---|
type | func |
param | model: path of the model to load |
return | err::Err |
static | False |
C++ defination code:
err::Err load(const string &model)
init
def init(self, img: maix.image.Image, x: int, y: int, w: int, h: int) -> None
Init tracker; give the tracker the first target image and the target position.
item | description |
---|---|
type | func |
param | img: Image want to detect, target should be in this image. x: the target position left top coordinate x. y: the target position left top coordinate y. w: the target width. h: the target height. |
throw | If image format not match model input format, will throw err::Exception. |
static | False |
C++ defination code:
void init(image::Image &img, int x, int y, int w, int h)
track
def track(self, img: maix.image.Image, threshold: float = 0.9) -> Object
Track object according to the last object position and the target feature learned by the init function.
item | description |
---|---|
type | func |
param | img: image in which to detect and track the object; it can be any resolution, and before detection an area is cropped according to the target's last position. threshold: if score < threshold, the new detection is considered invalid but is still returned. Default 0.9. |
return | object with position and score; the detect area is stored in the first 4 elements of points (x, y, w, h, center_x, center_y, input_size, target_size) |
static | False |
C++ defination code:
nn::Object track(image::Image &img, float threshold = 0.9)
input_size
def input_size(self) -> maix.image.Size
Get model input size
item | description |
---|---|
type | func |
return | model input size |
static | False |
C++ defination code:
image::Size input_size()
input_width
def input_width(self) -> int
Get model input width
item | description |
---|---|
type | func |
return | model input size of width |
static | False |
C++ defination code:
int input_width()
input_height
def input_height(self) -> int
Get model input height
item | description |
---|---|
type | func |
return | model input size of height |
static | False |
C++ defination code:
int input_height()
input_format
def input_format(self) -> maix.image.Format
Get input image format
item | description |
---|---|
type | func |
return | input image format, image::Format type. |
static | False |
C++ defination code:
image::Format input_format()
mean
Get mean value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> mean
scale
Get scale value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> scale
Speech
Speech
C++ defination code:
class Speech
__init__
def __init__(self, model: str = '') -> None
Construct a new Speech object
item | description |
---|---|
type | func |
param | model: model path, default empty, you can load model later by load function. |
throw | If model arg is not empty and load failed, will throw err::Exception. |
static | False |
C++ defination code:
Speech(const string &model = "")
load
def load(self, model: str) -> maix.err.Err
Load model from file
item | description |
---|---|
type | func |
param | model: path of the model to load |
return | err::Err |
static | False |
C++ defination code:
err::Err load(const string &model)
init
def init(self, dev_type: SpeechDevice, device_name: str = '') -> maix.err.Err
Init the ASR library and select the type and name of the audio device.
item | description |
---|---|
type | func |
param | dev_type: type of the device to use; choose between WAV, PCM, or MIC. device_name: name of the device to use; can be a WAV file, a PCM file, or a MIC device name. |
throw | 1. If the AM (acoustic model) is not loaded, will throw err::ERR_NOT_IMPL. 2. If the device is not supported, will throw err::ERR_NOT_IMPL. |
return | err::Err type, if init success, return err::ERR_NONE |
static | False |
C++ defination code:
err::Err init(nn::SpeechDevice dev_type, const string &device_name = "")
devive
def devive(self, dev_type: SpeechDevice, device_name: str) -> maix.err.Err
Reset the device, usually used for PCM/WAV recognition, such as identifying the next WAV file.
item | description |
---|---|
type | func |
param | dev_type: type of the device to use; choose between WAV, PCM, or MIC. device_name: name of the device to use; can be a WAV file, a PCM file, or a MIC device name. |
throw | If device is not supported, will throw err::ERR_NOT_IMPL. |
return | err::Err type, if init success, return err::ERR_NONE |
static | False |
C++ defination code:
err::Err devive(nn::SpeechDevice dev_type, const string &device_name)
dec_deinit
def dec_deinit(self, decoder: SpeechDecoder) -> None
Deinit the decoder.
item | description |
---|---|
type | func |
param | decoder: type of the decoder to deinit; choose between DECODER_RAW, DECODER_DIG, DECODER_LVCSR, DECODER_KWS, or DECODER_ALL. |
throw | If device is not supported, will throw err::ERR_NOT_IMPL. |
static | False |
C++ defination code:
void dec_deinit(nn::SpeechDecoder decoder)
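The decoder values defined in this module's SpeechDecoder enum (1, 2, 4, 8, and 65535) are powers of two plus an all-ones value, so they behave like bit flags. The sketch below mirrors those values in a local `IntEnum` purely for illustration; the `covers` helper is hypothetical, and whether the library accepts combined masks other than DECODER_ALL is not stated here.

```python
from enum import IntEnum


class SpeechDecoder(IntEnum):
    """Local mirror of the enum values, for illustration only."""
    DECODER_RAW = 1
    DECODER_DIG = 2
    DECODER_LVCSR = 4
    DECODER_KWS = 8
    DECODER_ALL = 65535


def covers(mask: int, decoder: int) -> bool:
    """True if every bit of `decoder` is set in `mask`."""
    return (int(mask) & int(decoder)) == int(decoder)


single_decoders = [
    SpeechDecoder.DECODER_RAW,
    SpeechDecoder.DECODER_DIG,
    SpeechDecoder.DECODER_LVCSR,
    SpeechDecoder.DECODER_KWS,
]
all_covered = all(covers(SpeechDecoder.DECODER_ALL, d) for d in single_decoders)
raw_and_kws = SpeechDecoder.DECODER_RAW | SpeechDecoder.DECODER_KWS  # plain int 9
```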
raw
def raw(self, callback: typing.Callable[[list[tuple[int, float]], int], None]) -> maix.err.Err
Init raw decoder, it will output the prediction results of the original AM.
item | description |
---|---|
type | func |
param | callback: raw decoder user callback. |
return | err::Err type, if init success, return err::ERR_NONE |
static | False |
C++ defination code:
err::Err raw(std::function<void(std::vector<std::pair<int, float>>, int)> callback)
digit
def digit(self, blank: int, callback: typing.Callable[[str, int], None]) -> maix.err.Err
Init digit decoder, it will output the Chinese digit recognition results within the last 4 seconds.
item | description |
---|---|
type | func |
param | blank: If it exceeds this value, insert a '_' in the output result to indicate idle mute. callback: digit decoder user callback. |
return | err::Err type, if init success, return err::ERR_NONE |
static | False |
C++ defination code:
err::Err digit(int blank, std::function<void(char*, int)> callback)
kws
def kws(self, kw_tbl: list[str], kw_gate: list[float], callback: typing.Callable[[list[float], int], None], auto_similar: bool = True) -> maix.err.Err
Init kws decoder; it will output a probability list of all registered keywords in the latest frame, and users can set their own thresholds for wake-up.
item | description |
---|---|
type | func |
param | kw_tbl: keyword list; each keyword is written as pinyin separated by spaces, for example: xiao3 ai4 tong2 xue2. kw_gate: keyword probability gate table; its length should be the same as kw_tbl. callback: kws decoder user callback. auto_similar: whether to perform automatic homophone processing; if true, pinyin with different tones are treated as homophones when calculating the probability. |
return | err::Err type, if init success, return err::ERR_NONE |
static | False |
C++ defination code:
err::Err kws(std::vector<string> kw_tbl, std::vector<float> kw_gate, std::function<void(std::vector<float>, int)> callback, bool auto_similar = true)
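Since the kws callback receives a probability list and users may apply their own wake-up thresholds, a callback might look roughly like the sketch below. It runs without the device: the second keyword, the gate values, the probabilities, and the meaning of the callback's integer argument (assumed here to be the list length) are all illustrative assumptions.

```python
from typing import List

# Keyword table in the documented format: pinyin syllables separated by spaces.
kw_tbl = ["xiao3 ai4 tong2 xue2", "ni3 hao3"]   # second entry is hypothetical
kw_gate = [0.1, 0.2]                             # one gate per keyword


def triggered_keywords(probs: List[float], table: List[str], gates: List[float]) -> List[str]:
    """Return the keywords whose probability exceeds their gate."""
    return [kw for kw, p, g in zip(table, probs, gates) if p > g]


hits: List[str] = []


def on_kws(probs: List[float], length: int) -> None:
    """Callback shape matching Callable[[list[float], int], None]."""
    hits.extend(triggered_keywords(probs[:length], kw_tbl, kw_gate))


# Simulated frame: the first keyword is below its gate, the second above.
on_kws([0.05, 0.6], 2)
```

Keeping the threshold comparison in a small pure function makes it easy to tune the gates offline against recorded probability lists.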
lvcsr
def lvcsr(self, sfst_name: str, sym_name: str, phones_txt: str, words_txt: str, callback: typing.Callable[[tuple[str, str], int], None], beam: float = 8, bg_prob: float = 10, scale: float = 0.5, mmap: bool = False) -> maix.err.Err
Init lvcsr decoder, it will output continuous speech recognition results (less than 1024 Chinese characters).
item | description |
---|---|
type | func |
param | sfst_name: Sfst file path. sym_name: Sym file path (output symbol table). phones_txt: Path to phones.bin (pinyin table). words_txt: Path to words.bin (dictionary table). callback: lvcsr decoder user callback. beam: the beam size for the WFST search, 8 by default; a value between 3 and 9 is recommended. The larger the value, the larger the search space, and the more accurate but slower the search. bg_prob: the absolute value of the natural logarithm of the default probability for background pinyin outside of BEAM-CNT, 10 by default. scale: acoustics_cost = log(pny_prob)*scale. mmap: use mmap to load the WFST decoding image; if set to true, the beam should be less than 5. |
return | err::Err type, if init success, return err::ERR_NONE |
static | False |
C++ defination code:
err::Err lvcsr(const string &sfst_name, const string &sym_name, const string &phones_txt, const string &words_txt, std::function<void(std::pair<char*, char*>, int)> callback, float beam = 8, float bg_prob = 10, float scale = 0.5, bool mmap = false)
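The numeric parameters can be made concrete with a little arithmetic. Reading the scale description as acoustics_cost = log(pny_prob) * scale with a natural logarithm, and bg_prob as the absolute value of ln(p) for the background probability p, a plain-Python sketch looks like this. The helper names are hypothetical and the natural-log reading of the scale formula is an assumption.

```python
import math


def acoustics_cost(pny_prob: float, scale: float = 0.5) -> float:
    """Cost of a pinyin probability: log(pny_prob) * scale (natural log assumed)."""
    return math.log(pny_prob) * scale


def background_prob(bg_prob: float = 10.0) -> float:
    """Probability whose natural log has absolute value bg_prob, i.e. exp(-bg_prob)."""
    return math.exp(-bg_prob)


cost_certain = acoustics_cost(1.0)             # log(1) is 0, so the cost is 0
cost_unlikely = acoustics_cost(math.exp(-2))   # log gives -2, scaled by 0.5
default_background = background_prob()         # exp(-10)
```

With the default bg_prob of 10, the implied background probability is about 4.5e-05.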
run
def run(self, frame: int) -> int
Run speech recognition. The user can run 1 frame at a time and do other processing after each run, or run continuously within a thread and stop it from an external thread.
item | description |
---|---|
type | func |
param | frame: The number of frames per run. |
return | int type, return actual number of frames in the run. |
static | False |
C++ defination code:
int run(int frame)
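The threaded usage pattern mentioned above can be sketched without the device by substituting a fake object. `FakeSpeech` below is a hypothetical stand-in that only mimics `run(frame) -> int`; the assumption that `run` returns fewer frames than requested once input is exhausted is for illustration only.

```python
import threading


class FakeSpeech:
    """Hypothetical stand-in exposing only run(frame) -> int."""

    def __init__(self, available: int) -> None:
        self.available = available

    def run(self, frame: int) -> int:
        done = min(frame, self.available)
        self.available -= done
        return done


def recognition_loop(speech: FakeSpeech, stop: threading.Event, stats: dict) -> None:
    """Run one frame at a time until asked to stop or no frames remain."""
    while not stop.is_set():
        done = speech.run(1)
        if done < 1:
            break  # nothing was processed, e.g. the input ended
        stats["frames"] += done


stop_event = threading.Event()
stats = {"frames": 0}
worker = threading.Thread(target=recognition_loop, args=(FakeSpeech(5), stop_event, stats))
worker.start()
worker.join(timeout=5)   # the fake input ends on its own
stop_event.set()         # an external thread would set this to stop earlier
```

Running one frame per iteration keeps the loop responsive to the stop event between frames.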
clear
def clear(self) -> None
Reset internal cache operation
item | description |
---|---|
type | func |
static | False |
C++ defination code:
void clear()
frame_time
def frame_time(self) -> int
Get the time of one frame.
item | description |
---|---|
type | func |
return | int type, return the time of one frame. |
static | False |
C++ defination code:
int frame_time()
similar
def similar(self, pny: str, similar_pnys: list[str]) -> maix.err.Err
Manually register mute words; each pinyin can register up to 10 homophones. Note that registering homophones through this interface will overwrite the homophone table automatically generated by the "automatic homophone processing" feature.
item | description |
---|---|
type | func |
param | pny: the pinyin for which to register homophones. similar_pnys: list of homophone pinyin for pny. |
return | err::Err type, if init success, return err::ERR_NONE |
static | False |
C++ defination code:
err::Err similar(const string &pny, std::vector<std::string> similar_pnys)
skip_frames
def skip_frames(self, num: int) -> None
Run some frames and drop them; this can be used to avoid incorrect recognition results when switching decoders.
item | description |
---|---|
type | func |
param | num: number of frames to run and drop |
static | False |
C++ defination code:
void skip_frames(int num)
mean
Get mean value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> mean
scale
Get scale value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> scale
dev_type
def dev_type(self) -> SpeechDevice
get device type
item | description |
---|---|
type | func |
return | nn::SpeechDevice type, see SpeechDevice of this module |
static | False |
C++ defination code:
nn::SpeechDevice dev_type()
FaceObject
Face object
C++ defination code:
class FaceObject
__init__
def __init__(self, x: int = 0, y: int = 0, w: int = 0, h: int = 0, class_id: int = 0, score: float = 0, points: list[int] = [], feature: list[float] = [], face: maix.image.Image = ...) -> None
Constructor
item | description |
---|---|
type | func |
static | False |
C++ defination code:
FaceObject(int x = 0, int y = 0, int w = 0, int h = 0, int class_id = 0, float score = 0, std::vector<int> points = std::vector<int>(), std::vector<float> feature = std::vector<float>(), image::Image face = image::Image())
__str__
def __str__(self) -> str
FaceObject info to string
item | description |
---|---|
type | func |
return | FaceObject info string |
static | False |
C++ defination code:
std::string to_str()
x
FaceObject left top coordinate x
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
int x
y
FaceObject left top coordinate y
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
int y
w
FaceObject width
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
int w
h
FaceObject height
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
int h
class_id
FaceObject class id
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
int class_id
score
FaceObject score
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
float score
points
keypoints
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<int> points
feature
feature, float list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> feature
face
face image
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
image::Image face
FaceObjects
FaceObjects class, a container of FaceObject items for detection results
C++ defination code:
class FaceObjects
__init__
def __init__(self) -> None
Constructor of FaceObjects class
item | description |
---|---|
type | func |
static | False |
C++ defination code:
FaceObjects()
add
def add(self, x: int = 0, y: int = 0, w: int = 0, h: int = 0, class_id: int = 0, score: float = 0, points: list[int] = [], feature: list[float] = [], face: maix.image.Image = ...) -> FaceObject
Add object to FaceObjects
item | description |
---|---|
type | func |
throw | Throw exception if no memory |
static | False |
C++ defination code:
nn::FaceObject &add(int x = 0, int y = 0, int w = 0, int h = 0, int class_id = 0, float score = 0, std::vector<int> points = std::vector<int>(), std::vector<float> feature = std::vector<float>(), image::Image face = image::Image())
remove
def remove(self, idx: int) -> maix.err.Err
Remove object from FaceObjects
item | description |
---|---|
type | func |
static | False |
C++ defination code:
err::Err remove(int idx)
at
def at(self, idx: int) -> FaceObject
Get object item
item | description |
---|---|
type | func |
static | False |
C++ defination code:
nn::FaceObject &at(int idx)
__getitem__
def __getitem__(self, idx: int) -> FaceObject
Get object item
item | description |
---|---|
type | func |
static | False |
C++ defination code:
nn::FaceObject &operator[](int idx)
__len__
def __len__(self) -> int
Get size
item | description |
---|---|
type | func |
static | False |
C++ defination code:
size_t size()
__iter__
def __iter__(self) -> typing.Iterator
Begin
item | description |
---|---|
type | func |
static | False |
C++ defination code:
std::vector<FaceObject*>::iterator begin()
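Because FaceObjects exposes `__len__`, `__getitem__` and `__iter__`, it can be handled like a read-only Python sequence. The following is a minimal sketch that relies only on the FaceObject attributes documented above; `summarize_faces` and `largest_face` are hypothetical helper names, not part of maix.nn.

```python
def summarize_faces(faces):
    """Return one dict per face in a FaceObjects-like container.

    `faces` only needs to be iterable and yield objects with the
    documented FaceObject attributes (x, y, w, h, class_id, score).
    """
    summary = []
    for face in faces:  # relies on FaceObjects.__iter__
        summary.append({
            "box": (face.x, face.y, face.w, face.h),
            "class_id": face.class_id,
            "score": round(face.score, 3),
        })
    return summary


def largest_face(faces):
    """Return the face with the largest bounding-box area, or None if empty."""
    if len(faces) == 0:  # relies on FaceObjects.__len__
        return None
    return max(faces, key=lambda f: f.w * f.h)
```

On a device, `faces` would typically be the value returned by `FaceRecognizer.recognize`.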
FaceRecognizer
FaceRecognizer class
C++ defination code:
class FaceRecognizer
__init__
def __init__(self, detect_model: str = '', feature_model: str = '', dual_buff: bool = True) -> None
Constructor of FaceRecognizer class
item | description |
---|---|
type | func |
param | detect_model: face detection model path, default empty; you can load the model later with the load function. feature_model: feature extraction model path. dual_buff: direction [in], prepare dual input/output buffers to accelerate forwarding: while the NPU is forwarding, the next input buffer is prepared instead of waiting. If every forward call must return the result for its own input, set this argument to false. Default true for speed. |
throw | If a model argument is not empty and loading fails, err::Exception is thrown. |
static | False |
C++ defination code:
FaceRecognizer(const string &detect_model = "", const string &feature_model = "", bool dual_buff = true)
load
def load(self, detect_model: str, feature_model: str) -> maix.err.Err
Load model from file
item | description |
---|---|
type | func |
param | detect_model: face detection model path. feature_model: feature extraction model path. |
return | err::Err |
static | False |
C++ defination code:
err::Err load(const string &detect_model, const string &feature_model)
recognize
def recognize(self, img: maix.image.Image, conf_th: float = 0.5, iou_th: float = 0.45, compare_th: float = 0.8, get_feature: bool = False, get_face: bool = False, fit: maix.image.Fit = ...) -> FaceObjects
Detect faces in the image and recognize them against the face lib
item | description |
---|---|
type | func |
param | img: image to detect; if the image size does not match the model input size, it is resized automatically with the fit method. conf_th: detection confidence threshold, default 0.5. iou_th: detection IoU threshold, default 0.45. compare_th: face comparison score threshold, default 0.8; if the comparison score of two faces is below this value, the face is treated as unknown. get_feature: whether to return the feature; if true, features are copied to the result, if false they are not copied, which saves time and memory. get_face: whether to return the face image; if true, the result object's face attribute is valid, otherwise the face attribute is empty. Getting the face image allocates memory and copies the image, so it is slower. fit: resize method, default image.Fit.FIT_CONTAIN. |
throw | If the image format does not match the model input format, err::Exception is thrown. |
return | FaceObjects object. In C++, you should delete it after use. |
static | False |
C++ defination code:
nn::FaceObjects *recognize(image::Image &img, float conf_th = 0.5, float iou_th = 0.45, float compare_th = 0.8, bool get_feature = false, bool get_face = false, maix::image::Fit fit = maix::image::FIT_CONTAIN)
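A minimal sketch of turning a recognize result into readable labels. It assumes that each returned face's class_id indexes the recognizer's labels list (whose first entry is documented as "unknown"); `label_faces` is a hypothetical helper, and the thresholds simply repeat the documented defaults.

```python
def label_faces(recognizer, img, conf_th=0.5, iou_th=0.45, compare_th=0.8):
    """Run recognize() and return (label, score, box) for each face.

    `recognizer` is expected to behave like maix.nn.FaceRecognizer and
    `img` like maix.image.Image; nothing else from maix is needed here.
    """
    faces = recognizer.recognize(img, conf_th, iou_th, compare_th)
    labels = recognizer.labels
    results = []
    for face in faces:
        # Assumption: class_id is an index into recognizer.labels,
        # where index 0 is "unknown". Fall back to "unknown" if it is not.
        if 0 <= face.class_id < len(labels):
            name = labels[face.class_id]
        else:
            name = "unknown"
        results.append((name, face.score, (face.x, face.y, face.w, face.h)))
    return results
```

Raising compare_th makes the match stricter, so more faces fall back to "unknown"; lowering it does the opposite.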
add_face
def add_face(self, face: FaceObject, label: str) -> maix.err.Err
Add face to lib
item | description |
---|---|
type | func |
param | face: face object, find by recognize label: face label(name) |
static | False |
C++ defination code:
err::Err add_face(nn::FaceObject *face, const std::string &label)
remove_face
def remove_face(self, idx: int = -1, label: str = '') -> maix.err.Err
Remove face from lib
item | description |
---|---|
type | func |
param | idx: index of the face in lib, value range [0, face_num), default -1 means use label. label: label of the face to remove, default empty string means use idx. At least one of idx and label must be given; idx has higher priority. |
static | False |
C++ defination code:
err::Err remove_face(int idx = -1, const std::string &label = "")
save_faces
def save_faces(self, path: str) -> maix.err.Err
Save faces info to a file
item | description |
---|---|
type | func |
param | path: where to save, string type. |
return | err.Err type |
static | False |
C++ defination code:
err::Err save_faces(const std::string &path)
load_faces
def load_faces(self, path: str) -> maix.err.Err
Load faces info from a file
item | description |
---|---|
type | func |
param | path: from where to load, string type. |
return | err::Err type |
static | False |
C++ defination code:
err::Err load_faces(const std::string &path)
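The add_face, save_faces and load_faces calls are usually combined into an enrollment step. The sketch below is one possible flow, under these assumptions: add_face needs a face that carries a feature vector (hence get_feature is passed as true), the largest face in the frame is the one to enroll, and `ok` is the success code to compare against (on a device this would be `maix.err.Err.ERR_NONE`). `enroll_face` and `db_path` are hypothetical names.

```python
def enroll_face(recognizer, img, label, db_path, ok):
    """Add the largest face found in `img` to the lib and save the lib.

    Returns None when no face is found, otherwise the error code of the
    first failing step, or the result of save_faces() on success.
    """
    # Positional arguments: conf_th, iou_th, compare_th, get_feature.
    faces = recognizer.recognize(img, 0.5, 0.45, 0.8, True)
    if len(faces) == 0:
        return None

    # Assumption: the biggest bounding box is the person being enrolled.
    face = max(faces, key=lambda f: f.w * f.h)

    result = recognizer.add_face(face, label)
    if result != ok:
        return result

    # Persist the lib so load_faces(db_path) can restore it later.
    return recognizer.save_faces(db_path)
```

At startup, calling `load_faces` with the same path restores the saved faces before the first `recognize` call.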
input_size
def input_size(self) -> maix.image.Size
Get model input size
item | description |
---|---|
type | func |
return | model input size |
static | False |
C++ defination code:
image::Size input_size()
input_width
def input_width(self) -> int
Get model input width
item | description |
---|---|
type | func |
return | model input size of width |
static | False |
C++ defination code:
int input_width()
input_height
def input_height(self) -> int
Get model input height
item | description |
---|---|
type | func |
return | model input size of height |
static | False |
C++ defination code:
int input_height()
input_format
def input_format(self) -> maix.image.Format
Get input image format
item | description |
---|---|
type | func |
return | input image format, image::Format type. |
static | False |
C++ defination code:
image::Format input_format()
mean_detector
Get mean value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> mean_detector
scale_detector
Get scale value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> scale_detector
mean_feature
Get mean value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> mean_feature
scale_feature
Get scale value, list type
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<float> scale_feature
labels
labels, list type, first is "unknown"
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<std::string> labels
features
features
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::vector<std::vector<float>> features
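When features are requested with get_feature, they can also be compared outside the recognizer. How the library computes its own comparison score is not stated in this reference, so the sketch below swaps in cosine similarity, a common choice for face feature vectors, purely as an illustration. `cosine_similarity` and `best_match` are hypothetical helpers, and `known` is a caller-prepared list of (label, feature) pairs.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length float sequences."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def best_match(query, known, threshold=0.8):
    """Return (label, score) of the most similar known feature.

    The label is "unknown" when the best score is below `threshold`,
    mirroring the role of compare_th in recognize().
    """
    best_label, best_score = "unknown", 0.0
    for label, feature in known:
        score = cosine_similarity(query, feature)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return "unknown", best_score
    return best_label, best_score
```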
QwenResp
Qwen model response
C++ defination code:
class QwenResp
msg
Model response full message.
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::string msg
msg_new
Model response new message.
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::string msg_new
err_code
Model response error code, maix.err.Err type, should be err.Err.ERR_NONE if no error.
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
err::Err err_code
err_msg
Model response error message.
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::string err_msg
QwenPostConfig
Qwen model post config
C++ defination code:
class QwenPostConfig
enable_temperature
Enable temperature sampling
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
bool enable_temperature
temperature
Temperature sampling value
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
float temperature
enable_repetition_penalty
Enable repetition penalty
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
bool enable_repetition_penalty
repetition_penalty
Repetition penalty value
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
float repetition_penalty
penalty_window
Repetition penalty window
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
int penalty_window
enable_top_p_sampling
Enable top p sampling
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
bool enable_top_p_sampling
top_p
Top p sampling value
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
float top_p
enable_top_k_sampling
Enable top k sampling
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
bool enable_top_k_sampling
top_k
Top k sampling value
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
int top_k
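Each sampling option in QwenPostConfig is a pair: an enable flag and a value. The sketch below sets both together so that a value is never changed without its flag. It uses only the field names documented above; `apply_sampling` is a hypothetical helper, and the range checks follow the usual conventions for these parameters rather than anything this reference states.

```python
def apply_sampling(post_config, temperature=None, top_p=None, top_k=None,
                   repetition_penalty=None, penalty_window=None):
    """Enable and set only the sampling options that are passed in.

    Options left as None are not touched, so values read from the
    model's MUD file stay in effect for them.
    """
    if temperature is not None:
        if temperature <= 0:
            raise ValueError("temperature must be > 0")
        post_config.enable_temperature = True
        post_config.temperature = float(temperature)

    if top_p is not None:
        if not 0.0 < top_p <= 1.0:
            raise ValueError("top_p must be in (0, 1]")
        post_config.enable_top_p_sampling = True
        post_config.top_p = float(top_p)

    if top_k is not None:
        if top_k < 1:
            raise ValueError("top_k must be >= 1")
        post_config.enable_top_k_sampling = True
        post_config.top_k = int(top_k)

    if repetition_penalty is not None:
        post_config.enable_repetition_penalty = True
        post_config.repetition_penalty = float(repetition_penalty)
        if penalty_window is not None:
            post_config.penalty_window = int(penalty_window)

    return post_config
```

On a device this would be called with a model's post_config attribute, for example `apply_sampling(qwen.post_config, temperature=0.7, top_p=0.9)`, where `qwen` is an already constructed Qwen instance.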
Qwen
Qwen model
C++ defination code:
class Qwen
__init__
def __init__(self, model: str) -> None
Qwen constructor
item | description |
---|---|
type | func |
param | model: direction [in], model file path; the model format can be a MUD (model universal describe) file. If model is set, the model is loaded from the file, and a load failure raises err.Exception. If model is not set, you can load the model later with the load function. |
static | False |
C++ defination code:
Qwen(const std::string &model)
load
def load(self, model: str) -> maix.err.Err
Load model from file
item | description |
---|---|
type | func |
param | model: direction [in], model file path, model format can be MUD(model universal describe file) file. |
return | error code, if load success, return err::ERR_NONE |
static | False |
C++ defination code:
err::Err load(const std::string &model)
unload
def unload(self) -> maix.err.Err
Unload model
item | description |
---|---|
type | func |
return | error code, if unload success, return err::ERR_NONE |
static | False |
C++ defination code:
err::Err unload()
loaded
def loaded(self) -> bool
Is model loaded
item | description |
---|---|
type | func |
return | true if model loaded, else false |
static | False |
C++ defination code:
bool loaded()
set_system_prompt
def set_system_prompt(self, prompt: str) -> None
Set system prompt; clear_context is called automatically.
item | description |
---|---|
type | func |
param | prompt: system prompt |
static | False |
C++ defination code:
void set_system_prompt(const std::string &prompt)
get_system_prompt
def get_system_prompt(self) -> str
Get system prompt
item | description |
---|---|
type | func |
return | system prompt |
static | False |
C++ defination code:
std::string get_system_prompt()
set_log_level
def set_log_level(self, level: ..., color: bool) -> None
Set log level
item | description |
---|---|
type | func |
param | level: log level, @see maix.log.LogLevel color: true to enable color, false to disable color |
static | False |
C++ defination code:
void set_log_level(log::LogLevel level, bool color)
set_reply_callback
def set_reply_callback(self, callback: typing.Callable[[Qwen, QwenResp], None] = None) -> None
Set reply callback
item | description |
---|---|
type | func |
param | callback: reply callback. It is called whenever a token (word) is generated, so you can receive the response message in real time inside this callback function. If set to None (nullptr in C++), you get the response only after the whole response message has been generated. |
static | False |
C++ defination code:
void set_reply_callback(std::function<void(nn::Qwen &, const nn::QwenResp &)> callback = nullptr)
get_reply_callback
def get_reply_callback(self) -> typing.Callable[[Qwen, QwenResp], None]
Get reply callback
item | description |
---|---|
type | func |
return | reply callback |
static | False |
C++ defination code:
std::function<void(nn::Qwen &, const nn::QwenResp &)> get_reply_callback()
send
def send(self, msg: str) -> QwenResp
Send message to model
item | description |
---|---|
type | func |
param | msg: message to send |
return | model response |
static | False |
C++ defination code:
nn::QwenResp send(const std::string &msg)
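A minimal sketch of streaming a reply with set_reply_callback and send. It assumes the callback is invoked while send is running and that msg_new holds only the newly generated text, as the field descriptions above suggest. `chat_streaming` is a hypothetical helper; it works with any object that offers the documented set_reply_callback and send methods.

```python
def _print_chunk(text):
    """Default sink: print new text as it arrives, without newlines."""
    print(text, end="", flush=True)


def chat_streaming(model, prompt, on_chunk=_print_chunk):
    """Send `prompt` and forward each new piece of the reply to `on_chunk`.

    Returns (resp, streamed_text), where resp is the object returned by
    send() and streamed_text is the concatenation of all msg_new pieces.
    """
    chunks = []

    def on_reply(_model, resp):
        # resp.msg is the full message so far, resp.msg_new only the new part.
        chunks.append(resp.msg_new)
        on_chunk(resp.msg_new)

    model.set_reply_callback(on_reply)
    try:
        resp = model.send(prompt)
    finally:
        # Passing None removes the callback again (documented behaviour).
        model.set_reply_callback(None)
    return resp, "".join(chunks)
```

After the call, comparing `resp.err_code` with `err.Err.ERR_NONE` and reading `resp.err_msg` on failure is the documented way to detect errors.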
cancel
def cancel(self) -> None
Cancel running
item | description |
---|---|
type | func |
static | False |
C++ defination code:
void cancel()
clear_context
def clear_context(self) -> maix.err.Err
Clear context
item | description |
---|---|
type | func |
return | error code, if clear success, return err::ERR_NONE |
static | False |
C++ defination code:
err::Err clear_context()
version
def version(self) -> str
Get model version
item | description |
---|---|
type | func |
return | model version |
static | False |
C++ defination code:
std::string version()
post_config
Qwen post config. By default the config is read from the model MUD file; you can also set it manually here.
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
nn::QwenPostConfig post_config
InternVLResp
InternVL model response
C++ defination code:
class InternVLResp
msg
Model response full message.
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::string msg
msg_new
Model response new message.
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::string msg_new
err_code
Model response error code, maix.err.Err type, should be err.Err.ERR_NONE if no error.
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
err::Err err_code
err_msg
Model response error message.
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
std::string err_msg
InternVLPostConfig
InternVL model post config
C++ defination code:
class InternVLPostConfig
enable_temperature
Enable temperature sampling
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
bool enable_temperature
temperature
Temperature sampling value
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
float temperature
enable_repetition_penalty
Enable repetition penalty
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
bool enable_repetition_penalty
repetition_penalty
Repetition penalty value
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
float repetition_penalty
penalty_window
Repetition penalty window
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
int penalty_window
enable_top_p_sampling
Enable top p sampling
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
bool enable_top_p_sampling
top_p
Top p sampling value
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
float top_p
enable_top_k_sampling
Enable top k sampling
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
bool enable_top_k_sampling
top_k
Top k sampling value
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
int top_k
InternVL
InternVL model
C++ defination code:
class InternVL
__init__
def __init__(self, model: str) -> None
InternVL constructor
item | description |
---|---|
type | func |
param | model: direction [in], model file path; the model format can be a MUD (model universal describe) file. If model is set, the model is loaded from the file, and a load failure raises err.Exception. If model is not set, you can load the model later with the load function. |
static | False |
C++ defination code:
InternVL(const std::string &model)
load
def load(self, model: str) -> maix.err.Err
Load model from file
item | description |
---|---|
type | func |
param | model: direction [in], model file path, model format can be MUD(model universal describe file) file. |
return | error code, if load success, return err::ERR_NONE |
static | False |
C++ defination code:
err::Err load(const std::string &model)
unload
def unload(self) -> maix.err.Err
Unload model
item | description |
---|---|
type | func |
return | error code, if unload success, return err::ERR_NONE |
static | False |
C++ defination code:
err::Err unload()
loaded
def loaded(self) -> bool
Is model loaded
item | description |
---|---|
type | func |
return | true if model loaded, else false |
static | False |
C++ defination code:
bool loaded()
set_system_prompt
def set_system_prompt(self, prompt: str) -> None
Set system prompt
item | description |
---|---|
type | func |
param | prompt: system prompt |
static | False |
C++ defination code:
void set_system_prompt(const std::string &prompt)
get_system_prompt
def get_system_prompt(self) -> str
Get system prompt
item | description |
---|---|
type | func |
return | system prompt |
static | False |
C++ defination code:
std::string get_system_prompt()
set_log_level
def set_log_level(self, level: ..., color: bool) -> None
Set log level
item | description |
---|---|
type | func |
param | level: log level, @see maix.log.LogLevel color: true to enable color, false to disable color |
static | False |
C++ defination code:
void set_log_level(log::LogLevel level, bool color)
set_reply_callback
def set_reply_callback(self, callback: typing.Callable[[InternVL, InternVLResp], None] = None) -> None
Set reply callback.
item | description |
---|---|
type | func |
param | callback: reply callback. It is called whenever a token (word) is generated, so you can receive the response message in real time inside this callback function. If set to None (nullptr in C++), you get the response only after the whole response message has been generated. |
static | False |
C++ defination code:
void set_reply_callback(std::function<void(nn::InternVL &, const nn::InternVLResp &)> callback = nullptr)
get_reply_callback
def get_reply_callback(self) -> typing.Callable[[InternVL, InternVLResp], None]
Get reply callback
item | description |
---|---|
type | func |
return | reply callback |
static | False |
C++ defination code:
std::function<void(nn::InternVL &, const nn::InternVLResp &)> get_reply_callback()
input_width
def input_width(self) -> int
Image input width
item | description |
---|---|
type | func |
return | input width. |
static | False |
C++ defination code:
int input_width()
input_height
def input_height(self) -> int
Image input height
item | description |
---|---|
type | func |
return | input height. |
static | False |
C++ defination code:
int input_height()
input_format
def input_format(self) -> maix.image.Format
Image input format
item | description |
---|---|
type | func |
return | input format. |
static | False |
C++ defination code:
maix::image::Format input_format()
set_image
def set_image(self, img: maix.image.Image, fit: maix.image.Fit = ...) -> maix.err.Err
Set the image and encode it. You can set the image once and call send multiple times.
item | description |
---|---|
type | func |
param | img: the image you want to use. fit: Image resize fit method, only used when img size not equal to model input. |
return | err.Err type, returns err.Err.ERR_NONE if no error happens. |
static | False |
C++ defination code:
err::Err set_image(maix::image::Image &img, maix::image::Fit fit = maix::image::Fit::FIT_CONTAIN)
clear_image
def clear_image(self) -> None
Clear the image. InternVL2.5 is based on Qwen2.5, so after clearing the image you can use only the LLM (text) function.
item | description |
---|---|
type | func |
static | False |
C++ defination code:
void clear_image()
is_image_set
def is_image_set(self) -> bool
Whether image set by set_image
item | description |
---|---|
type | func |
return | true if an image has been set by set_image, otherwise false. |
static | False |
C++ defination code:
bool is_image_set()
send
def send(self, msg: str) -> InternVLResp
Send message to model
item | description |
---|---|
type | func |
param | msg: message to send |
return | model response |
static | False |
C++ defination code:
nn::InternVLResp send(const std::string &msg)
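Since set_image encodes the image once and send can then be called repeatedly, several questions about one image can share a single encoding step. The sketch below illustrates that pattern and the text-only path through clear_image. `ask_about_image` and `ask_text_only` are hypothetical helpers, and `ok` stands for the success code (on a device, `maix.err.Err.ERR_NONE`).

```python
def ask_about_image(vlm, img, questions, ok):
    """Encode `img` once, then send every question and collect the answers."""
    result = vlm.set_image(img)  # default fit method is used
    if result != ok:
        raise RuntimeError(f"set_image failed: {result}")

    answers = []
    for question in questions:
        resp = vlm.send(question)
        if resp.err_code != ok:
            raise RuntimeError(f"send failed: {resp.err_msg}")
        answers.append(resp.msg)
    return answers


def ask_text_only(vlm, question):
    """Send a question without any image, clearing a previously set one."""
    if vlm.is_image_set():
        vlm.clear_image()
    return vlm.send(question)
```

Whether earlier questions influence later answers depends on how the model keeps context, which this reference does not describe for InternVL.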
cancel
def cancel(self) -> None
Cancel running
item | description |
---|---|
type | func |
static | False |
C++ defination code:
void cancel()
version
def version(self) -> str
Get model version
item | description |
---|---|
type | func |
return | model version |
static | False |
C++ defination code:
std::string version()
post_config
InternVL post config. By default the config is read from the model MUD file; you can also set it manually here.
item | description |
---|---|
type | var |
static | False |
readonly | False |
C++ defination code:
nn::InternVLPostConfig post_config