Wow, I Can Actually Paint in Mid-Air

Author: Huawei Cloud Developer Alliance

September 5, 2022
Guangdong

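This article is shared from the Huawei Cloud community post "ModelBox开发案例 - 隔空作画" (ModelBox Development Case: Painting in Mid-Air), by 吴小鱼.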

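This case uses two models, YOLOX and SCNet, to build a simple, fun application for painting in mid-air. The final result is shown below:

All resources required for this case (code, models, test data, and so on) can be downloaded from the OBS bucket.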

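Model training

We train the models on ModelArts, a one-stop AI development platform for developers:

ModelArts offers a rich set of features, including data labeling, training environments, and preset algorithms. By subscribing to a preset algorithm you can even train a model without writing any code. You can of course also train your own model locally. From here on we assume you already have trained models; the next step is to convert them into models that can run on the development board.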

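Model conversion

We have published a model conversion case for the development board; see the RK3568模型转换验证案例 (RK3568 model conversion and verification case):

That case provides sample code for the whole flow, from environment adaptation to model conversion and verification. With "Run in ModelArts", developers can open the Notebook case in the ModelArts console with one click, run it, and adapt it for their own use.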

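Development environment setup

There are two ways to develop ModelBox AI applications with the development board. The first is to connect a monitor, keyboard, and mouse to the board, install the Ubuntu desktop, and develop directly on the board. The second is to log in to the board from a PC with a remote connection tool (such as Remote-SSH in VS Code). We recommend the second way, because the PC side can use an IDE with richer features and a friendlier interface.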

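1. Configure the network

To connect to the development board, the PC needs the board's IP address, but the board has no fixed IP by default. We provide ModelBox PC Tool, which can configure an IP for the board automatically and also makes it easy to push and pull video streams during inference.

PC Tool is located in the connect_wizard directory of the SDK:

Double-click connect_wizard.exe. The page shows two ways to connect to the development board; we use the network cable option:

Follow the prompts to unplug or plug in the network cable:

After a short wait, the wizard reaches step 3. The board is now set to the default IP 192.168.2.111, and the PC can log in over SSH with this IP: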

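2. Connect to the development board remotely

We recommend using VS Code on the PC to connect to the board remotely and operate the device.

For connecting with VS Code, see our published ModelBox 端云协同AI开发套件（RK3568）上手指南 (getting-started guide for the ModelBox device-cloud collaborative AI development kit, RK3568). The guide also explains how to register the board with the HiLens management console for more convenient online management.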

Application development

Next, taking the mid-air painting demo as an example, we show how to develop an AI application with ModelBox.

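1. Create the project

The SDK provides the project script create.py. Run ./create.py -h to see its help:

ModelBox provides a visual graph orchestration tool, Editor. Run ./create.py -t editor to start the graph orchestration service:

The service uses 192.168.2.111 as its default IP. To use a different IP or port, pass the -i ip:port parameter.

Click the link to open the visual editing interface, then click Orchestration to enter the project development interface. To learn more about ModelBox, click Help in the upper-right corner:

In the orchestration interface, click New Project in the upper-right corner:

Enter workspace as the project path and hand_painting as the project name, then confirm:

We now have a default graph with HTTP receive and send units:

Region 1 lists the high-performance general-purpose flowunits preset in the SDK, region 2 is the visual orchestration canvas, and region 3 shows the content of the corresponding graph configuration file. The hand_painting project also appears in the corresponding directory in VS Code: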

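2. Create the inference flowunits

Next, we create the inference flowunits:

For the hand detection model, we name the flowunit hand_detection. The model file name is the name of the converted detection model, yolox_hand.rknn. This model takes an image as input and outputs feature maps, so we add an input port of type int and an output port of type float. When creating an inference flowunit for the development board, we select cuda as the processing type, which here means NPU inference. Any inference engine can be selected, because the board SDK currently recognizes and converts it automatically. Finally, change the flowunit group to inference and click Confirm. The new unit then appears under the Inference tab on the right:

The newly created inference flowunit also appears in the model directory of the project in VS Code:

In the same way, we create the pose_detection inference flowunit: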

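3. Create the post-processing flowunits

Besides the inference flowunits, the mid-air painting demo needs some generic flowunits: a detection post-processing unit, a region-of-interest extraction unit, and a painting unit. We create three Python flowunits to cover these.

The detection post-processing unit decodes hand bounding boxes from the original image and the output of hand_detection, so it needs two inputs. It also has to handle two cases, a hand in the frame or no hand detected: when a hand is detected, the detection result is sent to the region-of-interest extraction unit; when no hand is detected, the unit returns directly. We therefore choose IF_ELSE as the flowunit type. The new unit is created as follows: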
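The branching described above can be sketched in plain Python. This is only an illustration of the IF_ELSE idea, not ModelBox code: `route_frame`, the dict-based frame, the box layout, and the two lists standing in for output ports are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # hypothetical (x1, y1, x2, y2) layout


def route_frame(frame: Dict, box: Optional[Box],
                has_hand: List[Dict], no_hand: List[Dict]) -> None:
    """Send each frame to exactly one branch, as a condition unit does."""
    if box is not None:
        # Hand found: attach the box so the ROI extraction step can use it.
        has_hand.append({**frame, "bboxes": box})
    else:
        # No hand: pass the frame straight through on the other branch.
        no_hand.append(frame)


has_hand: List[Dict] = []
no_hand: List[Dict] = []
route_frame({"id": 0}, (10, 20, 110, 140), has_hand, no_hand)
route_frame({"id": 1}, None, has_hand, no_hand)
print(len(has_hand), len(no_hand))
```

Each input produces output on exactly one of the two branches, which is why the unit is declared as a condition unit rather than a normal one.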

In the same way, based on their inputs, outputs, and flowunit type, we create the extract_roi and painting flowunits:

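4. Orchestrate the flow graph

Drag and drop

With all required flowunits created, we can start orchestrating the flow graph. We are building a video-processing graph and do not need the HTTP receive and send units for now, so the unneeded units can be deleted:

From the Generic list, drag the virtual input unit input and the three flowunits we just created onto the canvas:

From the Image list, drag the resize preprocessing unit needed for model inference onto the canvas. We need two resize units, so drag it in twice:

Note that the resize unit has parameters to configure; click the unit to set them:

From the Input list, drag in the input parsing unit data_source_parser:

From the Video list, drag in the units needed for video processing: video_demuxer, video_decoder, and video_out:

Finally, from the Inference list, drag in the two inference units we created: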

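Connect the units

Connect the flowunits according to the processing logic. The virtual input unit input connects to the input parsing unit data_source_parser, and the parsed result goes to the video demuxing and decoding units:

The decoded output goes through preprocessing and can then be fed directly to inference:

Inference post-processing takes the original image and the inference result as inputs. If there is no result, it connects directly to the video output unit; if there is a result, it connects to the region-of-interest extraction unit:

The extracted region goes through preprocessing and inference:

Finally, the resulting keypoints and the original image are sent to the painting unit, and the painted result is sent to the video output unit to be saved:

This completes the flow graph. The complete graph description also appears in the GraphViz region: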
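As a quick cross-check of the wiring, the connections can be written down as a dependency table and sorted. This is a sketch outside ModelBox, using only the Python standard library (graphlib, Python 3.9+). The labels resize_hand and resize_pose for the two resize units are hypothetical, and we assume the painting unit takes its original image from the has_hand branch of yolox_post.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key lists the units that feed it. resize_hand and resize_pose are
# hypothetical labels for the two resize units on the canvas.
predecessors = {
    "data_source_parser": ["input"],
    "video_demuxer": ["data_source_parser"],
    "video_decoder": ["video_demuxer"],
    "resize_hand": ["video_decoder"],
    "hand_detection": ["resize_hand"],
    "yolox_post": ["video_decoder", "hand_detection"],
    "extract_roi": ["yolox_post"],                 # has_hand branch
    "resize_pose": ["extract_roi"],
    "pose_detection": ["resize_pose"],
    "painting": ["yolox_post", "pose_detection"],  # assumed image source
    "video_out": ["yolox_post", "painting"],       # no_hand branch + painted frames
}

# static_order() raises CycleError if the wiring contains a loop.
order = list(TopologicalSorter(predecessors).static_order())
print(order[0], "->", order[-1])
```

If the sort succeeds, every unit has all of its inputs produced upstream and the graph has a single entry (input) and a single exit (video_out).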

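Save the project and switch to VS Code to implement the code of each unit:

5. Complete the code

For a project built in the visual editor, the inference units are in the project's model directory, the generic units are in etc/flowunit, and the flow graph is in the graph directory. The units and the graph we created have all been synced over:

The video_decoder unit needs its type specified: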

video_decoder7 [ type=flowunit flowunit=video_decoder device=rknpu, deviceid="0", pix_fmt=bgr label="{{<in_video_packet> in_video_packet}|video_decoder7|{<out_video_frame> out_video_frame}}" ]


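Inference units

We complete the inference units first. An inference flowunit only needs its own toml configuration file that specifies its basic properties. The directory structure is: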

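[flowunit-name]
 |---[flowunit-name].toml  # inference flowunit configuration
 |---[model].rknn          # model file
 |---[infer-plugin].so     # custom inference plugin

When the ModelBox framework initializes, it scans the directory for files with the toml suffix and reads the inference flowunit information from them. [infer-plugin].so is a plugin needed for inference. Inference flowunits support loading custom plugins, so developers can implement custom operators.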
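The scanning step can be pictured with a few lines of standard-library Python. This is only a sketch of the idea, not how ModelBox implements it: `find_flowunit_configs` is a hypothetical helper, and the directory is recreated with empty placeholder files in a temporary folder.

```python
import tempfile
from pathlib import Path


def find_flowunit_configs(root: Path) -> list:
    """Return every *.toml file under root, sorted for stable output."""
    return sorted(root.rglob("*.toml"))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Recreate the layout described above with empty placeholder files.
    unit_dir = root / "hand_detection"
    unit_dir.mkdir()
    (unit_dir / "hand_detection.toml").touch()
    (unit_dir / "yolox_hand.rknn").touch()
    found = [p.relative_to(root).as_posix() for p in find_flowunit_configs(root)]

print(found)
```

Only the toml file is picked up; the model file is located afterwards through the settings inside that toml.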


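Copy the model into the corresponding folder. Taking hand_detection as an example, let us look at the inference flowunit configuration file:

The configuration file contains basic settings such as the unit type, model name, and inputs and outputs, which you can modify as needed.

Generic units

A Python generic unit needs its own toml configuration file that specifies the basic properties of the Python flowunit. The directory structure is usually:

[FlowUnitName]
 |---[FlowUnitName].toml
 |---[FlowUnitName].py
 |---xxx.py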


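Compared with an inference unit, a generic unit needs not only a configuration file but also the actual functional code. Taking yolox_post as an example, here is the flowunit configuration file first: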

# Copyright (c) Huawei Technologies Co., Ltd. 2022. All rights reserved.

# Basic config
[base]
name = "yolox_post" # The FlowUnit name
device = "cpu" # The flowunit runs on cpu
version = "1.0.0" # The version of the flowunit
type = "python" # Fixed value, do not change
description = "description" # The description of the flowunit
entry = "yolox_post@yolox_postFlowUnit" # Python flowunit entry function
group_type = "generic"  # flowunit group attribution, change as input/output/image ...

# Flowunit Type
stream = false # Whether the flowunit is a stream flowunit
condition = true # Whether the flowunit is a condition flowunit
collapse = false # Whether the flowunit is a collapse flowunit
collapse_all = false # Whether the flowunit will collapse all the data
expand = false # Whether the flowunit is an expand flowunit

# The default Flowunit config
[config]
item = "value"

# Input ports description
[input]
[input.input1] # Input port number, the format is input.input[N]
name = "in_image" # Input port name
type = "uint8"  # input port data type, e.g. float or uint8
device = "cpu"  # input buffer type
[input.input2] # Input port number, the format is input.input[N]
name = "in_feat" # Input port name
type = "uint8"  # input port data type, e.g. float or uint8
device = "cpu"  # input buffer type

# Output ports description
[output]
[output.output1] # Output port number, the format is output.output[N]
name = "has_hand" # Output port name
type = "float"  # output port data type, e.g. float or uint8
[output.output2] # Output port number, the format is output.output[N]
name = "no_hand" # Output port name
type = "float"  # output port data type, e.g. float or uint8


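Basic config holds basic settings such as the unit name. Flowunit Type describes the kind of flowunit: yolox_post is a condition unit, so condition is true. There are also other properties such as expand and collapse; more cases can be found in the ModelBox section of AI Gallery.

config holds the properties that the unit needs. This unit needs information such as the feature map size and thresholds, so we change config in the configuration file to: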

[config]
net_h = 320
net_w = 320
num_classes = 2
conf_threshold = 0.5
iou_threshold = 0.5
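To make the roles of conf_threshold and iou_threshold concrete, here is a minimal pure-Python sketch of confidence filtering followed by greedy non-maximum suppression. It is not the postprocess function from yolox_utils, which this article does not show; `iou`, `filter_and_nms`, and the box layout are hypothetical names chosen for illustration.

```python
from typing import List, Tuple

Det = Tuple[float, float, float, float, float]  # x1, y1, x2, y2, score


def iou(a: Det, b: Det) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(dets: List[Det], conf_threshold: float = 0.5,
                   iou_threshold: float = 0.5) -> List[Det]:
    """Drop low-score boxes, then greedily suppress overlapping ones."""
    candidates = sorted((d for d in dets if d[4] >= conf_threshold),
                        key=lambda d: d[4], reverse=True)
    kept: List[Det] = []
    for det in candidates:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(det, k) <= iou_threshold for k in kept):
            kept.append(det)
    return kept


dets = [
    (10, 10, 110, 110, 0.9),    # strongest box
    (12, 12, 112, 112, 0.8),    # near-duplicate of the first, suppressed
    (200, 200, 260, 260, 0.7),  # separate box, kept
    (0, 0, 50, 50, 0.3),        # below conf_threshold, dropped
]
result = filter_and_nms(dets)
print(result)
```

Raising conf_threshold discards more weak detections, while lowering iou_threshold merges overlapping boxes more aggressively.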


In addition, the input and output types may need to be changed to match the actual logic:

# Input ports description
[input]
[input.input1] # Input port number, the format is input.input[N]
name = "in_image" # Input port name
type = "uint8"  # input port data type, e.g. float or uint8
device = "cpu"  # input buffer type
[input.input2] # Input port number, the format is input.input[N]
name = "in_feat" # Input port name
type = "float"  # input port data type, e.g. float or uint8
device = "cpu"  # input buffer type

# Output ports description
[output]
[output.output1] # Output port number, the format is output.output[N]
name = "has_hand" # Output port name
type = "uint8"  # output port data type, e.g. float or uint8
[output.output2] # Output port number, the format is output.output[N]
name = "no_hand" # Output port name
type = "uint8"  # output port data type, e.g. float or uint8


Next, we look at yolox_post.py. The basic interfaces were already generated when the unit was created:

# Copyright (c) Huawei Technologies Co., Ltd. 2022. All rights reserved.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import _flowunit as modelbox


class yolox_postFlowUnit(modelbox.FlowUnit):
    # Derived from modelbox.FlowUnit
    def __init__(self):
        super().__init__()

    def open(self, config):
        # Open the flowunit to obtain configuration information
        return modelbox.Status.StatusCode.STATUS_SUCCESS

    def process(self, data_context):
        # Process the data
        in_data = data_context.input("in_1")
        out_data = data_context.output("out_1")

        # yolox_post process code.
        # Remove the following code and add your own code here.
        for buffer in in_data:
            response = "Hello World " + buffer.as_object()
            result = response.encode('utf-8').strip()
            add_buffer = modelbox.Buffer(self.get_bind_device(), result)
            out_data.push_back(add_buffer)

        return modelbox.Status.StatusCode.STATUS_SUCCESS

    def close(self):
        # Close the flowunit
        return modelbox.Status()

    def data_pre(self, data_context):
        # Before streaming data starts
        return modelbox.Status()

    def data_post(self, data_context):
        # After streaming data ends
        return modelbox.Status()

    def data_group_pre(self, data_context):
        # Before all streaming data starts
        return modelbox.Status()

    def data_group_post(self, data_context):
        # After all streaming data ends
        return modelbox.Status()


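When a flowunit works with stream = false, its open, process, and close interfaces are called. When it works with stream = true, its open, data_group_pre, data_pre, process, data_post, data_group_post, and close interfaces are called. Implement whichever interfaces your use case needs.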
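The two call sequences can be written out as a small helper for reference. This is a plain-Python summary of the order stated above, not ModelBox code; `lifecycle` is a hypothetical function name, and it lists each interface once even though process normally runs many times.

```python
from typing import List


def lifecycle(stream: bool) -> List[str]:
    """Return the interface call order described above for each mode."""
    if stream:
        per_stream = ["data_group_pre", "data_pre", "process",
                      "data_post", "data_group_post"]
    else:
        per_stream = ["process"]
    return ["open", *per_stream, "close"]


print(lifecycle(stream=False))
print(lifecycle(stream=True))
```

Since yolox_post sets stream = false, only the first, shorter sequence applies to it.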

Given the nature of this unit, we mainly need to complete the open and process interfaces:

import _flowunit as modelbox
import numpy as np
from yolox_utils import postprocess, expand_bboxes_with_filter, draw_color_palette


class yolox_postFlowUnit(modelbox.FlowUnit):
    # Derived from modelbox.FlowUnit
    def __init__(self):
        super().__init__()

    def open(self, config):
        self.net_h = config.get_int('net_h', 320)
        self.net_w = config.get_int('net_w', 320)
        self.num_classes = config.get_int('num_classes', 2)
        self.num_grids = int((self.net_h / 32) * (self.net_w / 32)) * (1 + 2*2 + 4*4)
        self.conf_thre = config.get_float('conf_threshold', 0.3)
        self.nms_thre = config.get_float('iou_threshold', 0.4)

        return modelbox.Status.StatusCode.STATUS_SUCCESS

    def process(self, data_context):
        modelbox.info("YOLOX POST")

        in_image = data_context.input("in_image")
        in_feat = data_context.input("in_feat")

        has_hand = data_context.output("has_hand")
        no_hand = data_context.output("no_hand")

        for buffer_img, buffer_feat in zip(in_image, in_feat):
            width = buffer_img.get('width')
            height = buffer_img.get('height')
            channel = buffer_img.get('channel')

            img_data = np.array(buffer_img.as_object(), copy=False)
            img_data = img_data.reshape((height, width, channel))

            feat_data = np.array(buffer_feat.as_object(), copy=False)
            feat_data = feat_data.reshape((self.num_grids, self.num_classes + 5))

            ratio = (self.net_h / height, self.net_w / width)
            bboxes = postprocess(feat_data, (self.net_h, self.net_w), self.conf_thre, self.nms_thre, ratio)
            box = expand_bboxes_with_filter(bboxes, width, height)

            if box:
                buffer_img.set("bboxes", box)
                has_hand.push_back(buffer_img)
            else:
                draw_color_palette(img_data)
                img_buffer = modelbox.Buffer(self.get_bind_device(), img_data)
                img_buffer.copy_meta(buffer_img)
                no_hand.push_back(img_buffer)

        return modelbox.Status.StatusCode.STATUS_SUCCESS

    def close(self):
        # Close the flowunit
        return modelbox.Status()


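As you can see, open reads some parameters and process carries out the processing logic. Inputs and outputs are obtained through data_context. Note that what we output is the image: when a hand is detected, the detection box information is attached to the image so that the next unit can read it.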
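The num_grids expression in open is easier to follow with numbers plugged in. The sketch below repeats that formula and compares it with a per-stride count, assuming the model uses the usual YOLOX output strides of 8, 16, and 32 (the article does not state them). `num_grids_by_stride` is a hypothetical helper added only for the comparison.

```python
def num_grids(net_h: int, net_w: int) -> int:
    """Formula used in open(): stride-32 cells times (1 + 2*2 + 4*4)."""
    return int((net_h / 32) * (net_w / 32)) * (1 + 2 * 2 + 4 * 4)


def num_grids_by_stride(net_h: int, net_w: int, strides=(8, 16, 32)) -> int:
    """Same count written as one grid per stride (assumed YOLOX strides)."""
    return sum((net_h // s) * (net_w // s) for s in strides)


net_h = net_w = 320
num_classes = 2
total = num_grids(net_h, net_w)

# For 320x320: 40*40 + 20*20 + 10*10 = 2100 grid cells in total.
print(total, num_grids_by_stride(net_h, net_w))
# Shape that feat_data is reshaped to in process().
print((total, num_classes + 5))
```

With the configuration used here, the flat feature buffer is therefore reshaped to 2100 rows of 7 values each (num_classes + 5 columns per grid cell).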

Complete the remaining generic flowunits in the same way; see the code we provide for details.

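Running the application

Prepare an mp4 file and copy it into the data folder; we provide a test video, hand.mp4. Then open bin/mock_task.toml in the project directory and change the task input and task output settings to the following: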

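# Task input. The mock currently supports only one rtsp stream or a local url.
# For an rtsp camera, set type = "rtsp" and put the rtsp address in url.
# For anything else use "url", for example a local file path or an httpserver address (camera: url = "0").
[input]
type = "url"
url = "../data/hand.mp4"

# Task output. Currently only "webhook" and local output "local" are supported (output to the screen: url="0"; output to rtsp: fill in the rtsp address).
# (local can also write to a local file. Note that the file may be a relative path, which is relative to this mock_task.toml file itself.)
[output]
type = "local"
url = "../hilens_data_dir/paint.mp4"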
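Because file paths in mock_task.toml are relative to the toml file itself, it can help to see where the two urls above end up. The following sketch resolves them with the standard library; `resolve_task_url` is a hypothetical helper, the project path is taken from the build log in this article, and the sketch only covers local file paths (not "0" or rtsp addresses).

```python
import posixpath


def resolve_task_url(task_file: str, url: str) -> str:
    """Resolve a relative url against the folder holding mock_task.toml."""
    if posixpath.isabs(url):
        return url
    return posixpath.normpath(posixpath.join(posixpath.dirname(task_file), url))


task_file = "/home/rock/lxy/examples/workspace/hand_painting/bin/mock_task.toml"
input_path = resolve_task_url(task_file, "../data/hand.mp4")
output_path = resolve_task_url(task_file, "../hilens_data_dir/paint.mp4")
print(input_path)
print(output_path)
```

Both paths land directly under the project root, next to the bin folder, rather than under the directory the command is launched from.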


Once configured, run build_project.sh in the project directory to build the project:

rock@rock-3a:~/lxy/examples$ cd workspace/hand_painting/
rock@rock-3a:~/lxy/examples/workspace/hand_painting$ ./build_project.sh
dos2unix: converting file /home/rock/lxy/examples/workspace/hand_painting/graph/hand_painting.toml to Unix format...
dos2unix: converting file /home/rock/lxy/examples/workspace/hand_painting/graph/modelbox.conf to Unix format...
dos2unix: converting file /home/rock/lxy/examples/workspace/hand_painting/etc/flowunit/extract_roi/extract_roi.toml to Unix format...
dos2unix: converting file /home/rock/lxy/examples/workspace/hand_painting/etc/flowunit/painting/painting.toml to Unix format...
dos2unix: converting file /home/rock/lxy/examples/workspace/hand_painting/etc/flowunit/yolox_post/yolox_post.toml to Unix format...
dos2unix: converting file /home/rock/lxy/examples/workspace/hand_painting/model/hand_detection/hand_detection.toml to Unix format...
dos2unix: converting file /home/rock/lxy/examples/workspace/hand_painting/model/pose_detection/pose_detection.toml to Unix format...
dos2unix: converting file /home/rock/lxy/examples/workspace/hand_painting/bin/mock_task.toml to Unix format...
build success: you can run main.sh in ./bin folder
rock@rock-3a:~/lxy/examples/workspace/hand_painting$


After the build finishes, run the project:

rock@rock-3a:~/lxy/examples/workspace/hand_painting$ ./bin/main.sh


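After a short wait, the result appears in the hilens_data_dir folder:

Besides mp4, many other input and output types are supported. ModelBox PC Tool also provides stream pushing and pulling: select a live video stream as input and start it:

Set the output address to the push-stream address when running the program, and the result can be viewed in a web page on the local machine:

To evaluate the application's performance, simply enable profile in the flow graph configuration file: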

[profile]
profile=true # enable profile
trace=true # enable tracing
dir="/tmp/modelbox/perf" # path for the trace files


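With this configuration in place, start the flow graph. profile records statistics every 60 s, and trace outputs statistics during task execution and when the task ends.

Running the flow graph generates performance-related json files. Load such a json file in a browser to view the timeline:

Open the Chrome browser.
Enter chrome://tracing/ in the address bar.
Click the Load button on the page and load the trace json file.
Once it loads, you will see a timeline view similar to the one below: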
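The same json can also be summarized in a script. The sketch below assumes the file follows the Chrome Trace Event format with complete ("X") events carrying a dur field in microseconds; that is an assumption based on the file being loadable in chrome://tracing, and real ModelBox traces may use other event types or fields. The sample data and `total_duration_ms` are hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical sample in Chrome Trace Event format; real ModelBox trace
# files may use different event types or fields.
sample = json.dumps({"traceEvents": [
    {"name": "hand_detection", "ph": "X", "ts": 0, "dur": 12000},
    {"name": "yolox_post", "ph": "X", "ts": 12000, "dur": 3000},
    {"name": "hand_detection", "ph": "X", "ts": 40000, "dur": 14000},
]})


def total_duration_ms(trace_json: str) -> dict:
    """Sum the duration of complete ("X") events per name, in milliseconds."""
    data = json.loads(trace_json)
    # The format allows either a bare event list or an object wrapping it.
    events = data["traceEvents"] if isinstance(data, dict) else data
    totals = defaultdict(float)
    for event in events:
        if event.get("ph") == "X":
            totals[event["name"]] += event.get("dur", 0) / 1000.0  # us -> ms
    return dict(totals)


summary = total_duration_ms(sample)
print(summary)
```

A per-unit total like this is a quick way to spot which flowunit dominates the pipeline before digging into the timeline view.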

Packaging and deployment

Packaging

After debugging, the application can likewise be packaged for release with the create.py script:

./create.py -t rpm -n hand_painting


The console prints:

sdk version is modelbox-rk-aarch64-1.0.8.8
call mb-pkg-tool pack [folder] > [rpm file] to building rpm, waiting...
success: create hand_painting.rpm in /home/rock/lxy/examples/workspace/hand_painting


After a short wait, the project directory contains an rpm folder and the packaged application: