ModelBox Hands-On Development: Implementing a Camera Virtual Background on the RK3568

Author: Huawei Cloud Developer Alliance

2023-06-19
Guangdong

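This article is shared from the Huawei Cloud community post 《ModelBox开发案例 - RK3568实现摄像头虚拟背景【玩转华为云】》 ("ModelBox Development Case: Camera Virtual Background on RK3568"), by AI 练习生.

This article uses the ModelBox device-cloud collaborative AI development kit (RK3568) to build a camera virtual-background AI application.

The final result looks like this: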

RK3568

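This application is built with the RK development kit and requires an RK3568 development board. The hardware can be purchased at https://marketplace.huaweicloud.com/contents/2b73a21b-91c5-4c58-a61a-5a5460afeaf7 . The package includes: RK3568 development board ×1, A2 WiFi module ×1, case ×1, power cable and plug ×1, heat sink ×1.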

Model training

Notebook:

Model inference

Notebook:

ONNX structure:

Model conversion

Notebook:

RKNN structure:

Native onnxruntime inference

We prepared a 1080p, 30 fps video and ran inference with native onnxruntime; the frame rate peaked at 7 fps:

Implementation code:

"""Read a video stream with OpenCV and run inference with native onnxruntime."""
import time

import cv2
import numpy as np
import onnxruntime

import drawUtils  # local helper module that draws Chinese text on a frame

# cap = cv2.VideoCapture(0)
cap = cv2.VideoCapture('test.mp4')

if not cap.isOpened():
    print('File does not exist or cannot be decoded')
else:
    i = 0
    fps = 30
    start_time = time.time()
    image_background = cv2.imread('R-C.jpg')
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    onnx_model = onnxruntime.InferenceSession('linknet.onnx')
    image_background = cv2.resize(image_background, (width, height))
    writer = cv2.VideoWriter('test_result.mp4', cv2.VideoWriter_fourcc(*'X264'), fps, (width, height))
    image_background = image_background / 255.

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break

        # Preprocess: resize, BGR -> RGB, scale to [0, 1], HWC -> NCHW
        img = cv2.resize(frame, (256, 256))
        img = img[..., ::-1] / 255.
        img = img.astype(np.float32)
        img = np.transpose(img, (2, 0, 1))
        data = np.expand_dims(img, axis=0)

        # Inference: take the first output, first batch item, first channel
        onnx_input = {onnx_model.get_inputs()[0].name: data}
        pred_mask = onnx_model.run(None, onnx_input)
        pred_mask = np.array(pred_mask)
        pred_mask = pred_mask[0][0][0]
        pred_mask = cv2.resize(pred_mask, (width, height))
        pred_mask = pred_mask.reshape((height, width, 1))

        # Blend: person from the frame, everything else from the background image
        img_multi = pred_mask * frame / 255.
        mask_layer = np.ones((height, width)).reshape(height, width, 1) - pred_mask
        mask_layer = image_background * mask_layer
        image_add = img_multi + mask_layer
        image_add = image_add * 255
        image_add = image_add.astype(np.uint8)

        # Compute FPS (guard against a zero interval)
        i += 1
        now = time.time()
        fps_text = int(1 / max(now - start_time, 1e-6))
        start_time = now
        print('linknet post ' + str(i))

        # Draw the frame-rate label ('帧率' means 'frame rate')
        img_add = drawUtils.cv2AddChineseText(image_add, '帧率：' + str(fps_text), (20, 50), textColor=(0, 255, 0), textSize=30)

        # Show the frame
        # cv2.imshow('demo', img_add)
        writer.write(img_add)
        # Exit condition
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    # Finalize the output file
    writer.release()

cap.release()
cv2.destroyAllWindows()
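
The script above derives the frame rate from a single frame interval, so the number drawn on the video can jump around from one frame to the next. A minimal sketch of a smoother alternative that averages over a sliding window of timestamps follows; the class name `FpsMeter` and the window size are illustrative and not part of the original script.

```python
from collections import deque
import time


class FpsMeter:
    """Sliding-window frame-rate estimate (hypothetical helper, not from the original script)."""

    def __init__(self, window=30):
        # Keep only the most recent `window` frame timestamps.
        self.stamps = deque(maxlen=window)

    def tick(self, now=None):
        """Record one frame and return the average FPS over the window."""
        self.stamps.append(time.time() if now is None else now)
        elapsed = self.stamps[-1] - self.stamps[0]
        if elapsed > 0:
            # N timestamps span N - 1 frame intervals.
            return (len(self.stamps) - 1) / elapsed
        return 0.0
```

Inside the loop, `fps_text = int(meter.tick())` with `meter = FpsMeter()` created once before the loop would replace the `start_time` bookkeeping.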


To run real-time detection, switch the input to camera number 0 (by default, the PC's built-in camera):
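
`cv2.VideoCapture` accepts either an integer device index or a file path, so switching between the test video and the camera only changes the argument. A small sketch of a helper that turns a string such as a command-line argument into the right type; `parse_source` is a hypothetical name introduced here for illustration.

```python
def parse_source(text):
    """Return an int camera index for digit-only input, else the path unchanged."""
    text = text.strip()
    return int(text) if text.isdecimal() else text
```

With this, `cv2.VideoCapture(parse_source("0"))` opens camera 0 and `cv2.VideoCapture(parse_source("test.mp4"))` opens the file.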

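ModelBox AI application development

Using the project directly

1) Download the case

All resources needed for this case (code, model, test data, etc.) can be downloaded from the network-drive link.

2) Run the application

Drag the virtual_background folder into the workspace directory under the project directory and enable performance statistics:

Switch to the project directory and execute bin/main.bat to run the application. The generated video and the performance statistics file are both placed in the hilens_data_dir folder:

3) Performance statistics

Load the performance statistics file in the Chrome browser at chrome://tracing:

With a 1080p input video, this AI application takes about 256 ms per batch at batch_size=4, an average of 64 ms per frame, so fps = 1000/64 ≈ 16. Next, we walk through the complete development process of this AI application in ModelBox, using the development kit as the example.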
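
The frame-rate figure above is plain arithmetic on the batch latency. As a quick sketch (the function name is illustrative):

```python
def fps_from_batch(batch_ms, batch_size):
    """Frames per second given the time in milliseconds to process one batch."""
    per_frame_ms = batch_ms / batch_size
    return 1000.0 / per_frame_ms


# 256 ms for a batch of 4 frames is 64 ms per frame.
print(fps_from_batch(256, 4))  # prints 15.625, which rounds to 16
```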

4) Camera detection

Open the bin/mock_task.toml file in the project directory and change the task input and task output to the following:

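# Used for reading the task from a local mock file; the script already sets the IVA_SVC_CONFIG environment variable to this file's path
########### Be sure to use Linux-style paths, e.g. on Windows use  D:/xxx/xxx  and not D:\xxx\xxx  ###########
# The task parameters are a compacted and escaped JSON string
# When written inline, double quotes must be escaped; alternatively, use content_file to point to a JSON file. If both content and content_file are present, content is overridden
# content_file supports absolute or relative paths, but does not expand environment variables (including ${HILENS_APP_ROOT}, ${HILENS_DATA_DIR}, etc.)
[common]
content = "{\"param_str\":\"string param\",\"param_int\":10,\"param_float\":10.5}"

# Task input configuration. The mock currently supports only one rtsp or local url input; the following input types are supported:
# 1. rtsp camera or rtsp video stream: type="rtsp", url="rtsp://xxx.xxx"  (with type rtsp, automatic reconnection after a video interruption is supported)
# 2. Built-in or USB camera: type="url", url="camera number, e.g. 0 or 1" (must be used with the local_camera flowunit)
# 3. Local video file: type="url", url="path to the video file" (may be relative to this mock_task.toml file; input from the directory given by ${HILENS_APP_ROOT} is also supported)
# 4. http service: type="url", url="http://xxx.xxx" (the task starts as an http service; fill in the externally exposed http service address; must be used with an httpserver-type flowunit)
# 5. Multiple inputs are supported: [input] [input1] [input2] ..., with corresponding outputs [output] [output1] [output2] ...; when the videoout flowunit is used for output, the numbers of inputs and outputs must match and urls must not repeat
[input]
type = "url"
url = "0"
#url = "../data/test.mp4"

# Task output configuration; the following output types are supported:
# 1. rtsp video stream: type="local", url="rtsp://xxx.xxx"
# 2. Local screen: type="local", url="0:xxx" (the device needs a monitor attached and a desktop installed)
# 3. Local video file: type="local", url="path to the video file" (may be relative to this mock_task.toml file; output to the directory given by ${HILENS_DATA_DIR} or a subdirectory is also supported)
# 4. http service: type="webhook", url="http://xxx.xxx" (data produced by the task is reported to an http service; fill in the upload address)
[output]
type = "local"
url = "0"
#url = "../hilens_data_dir/test_result.mp4"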
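
The comments in this file ask for two things that are easy to get wrong by hand: the `content` value must be a compacted JSON string with escaped double quotes, and paths must use forward slashes even on Windows. A minimal sketch of generating both with the Python standard library; the helper names are hypothetical and the parameters are the sample ones from the file.

```python
import json
from pathlib import PureWindowsPath


def toml_content_value(params):
    """Compact `params` to JSON, then quote the result as a TOML basic string."""
    compact = json.dumps(params, separators=(",", ":"))
    # For the ASCII parameters used here, JSON string escaping (\" and \\)
    # is also valid inside a TOML basic string.
    return json.dumps(compact)


def to_forward_slashes(windows_path):
    """Rewrite a Windows path with forward slashes for the url fields."""
    return PureWindowsPath(windows_path).as_posix()


params = {"param_str": "string param", "param_int": 10, "param_float": 10.5}
print("content = " + toml_content_value(params))
print(to_forward_slashes(r"D:\xxx\xxx"))  # prints D:/xxx/xxx
```

The first `print` reproduces the `content = ...` line shown in the `[common]` section.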


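Execute .\bin\main.bat camera to run the application; a window showing the real-time portrait segmentation pops up automatically:

Developing the project from scratch

If you are interested in how the project is developed, this chapter goes into more detail.

1) Connect to the development board remotely

We recommend using VS Code on the PC to connect to the development board remotely and operate the device. Install Remote-SSH:

2) Check the device's IP address

The device's IP address can be viewed in the 智慧生活 (AI Life) app:

3) Configure the SSH connection

With the computer and the device on the same WiFi network, log in remotely from VS Code using Remote-SSH:

RK3568 application development

Create the project

In the SDK directory, use the create.py script to create the project. I named the project virtual_background: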

rock@rock-3a:~$ cd /home/rock/modelbox
rock@rock-3a:~/modelbox$ ./create.py -t server -n virtual_background
sdk version is modelbox-rk-aarch64-1.5.3
success: create virtual_background in /home/rock/modelbox/workspace


Create the inference flowunit

rock@rock-3a:~/modelbox$ ./create.py -t infer -n linknet_infer -p virtual_background
sdk version is modelbox-rk-aarch64-1.5.3
success: create infer linknet_infer in /home/rock/modelbox/workspace/virtual_background/model/linknet_infer


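As shown, the inference flowunit was created under the project's model directory:

Drag the converted model linknet.rknn into the linknet_infer directory, then edit the .toml configuration file, mainly the model path and the inputs and outputs. Since our model has one uint8 input coming from rknpu and one float output, the configuration file is edited as follows: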

# Copyright (C) 2020 Huawei Technologies Co., Ltd. All rights reserved.

[base]
name = "linknet_infer"
device = "rknpu"
version = "1.0.0"
description = "your description"
entry = "./linknet.rknn"  # model file path, use relative path
type = "inference"
virtual_type = "rknpu2" # inference engine type: rockchip now support rknpu, rknpu2(if exist)
group_type = "Inference"  # flowunit group attribution, do not change
is_input_contiguous = "false" # rk do not support memory combine, fix, do not change

# Input ports description
[input]
[input.input1]  # input port number, Format is input.input[N]
name = "Input"  # input port name
type = "uint8"  # input port data type ,e.g. float or uint8
device = "rknpu"  # input buffer type: use rknpu for zero-copy, cpu also allow

# Output ports description
[output]
[output.output1] # output port number, Format is output.output[N]
name = "Output"  # output port name
type = "float"   # output port data type ,e.g. float or uint8


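As shown, the model has one input node and one output node. Note that the virtual_type setting depends on the NPU type; for the RK3568 it must be set to rknpu2. The device setting of the input node should preferably match the device of the flowunit that precedes this inference flowunit.

ModelBox has a built-in rknn inference engine and inference logic. Developers only need to prepare the model file and edit the configuration file to run inference with the model; no inference code has to be written.

In addition, the portrait segmentation model used in this case was trained with the PyTorch framework. We converted it beforehand with the rknn-toolkit2 tool into a model format supported by the RK3568; if you are interested, see "RK3568 model conversion" for the conversion process.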
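
For orientation, an ONNX-to-RKNN conversion with rknn-toolkit2 typically looks like the sketch below. This is not the author's conversion notebook: the mean and std values are assumptions chosen to mirror the `img / 255.` normalisation of the onnxruntime script (so the converted model can take uint8 input directly), quantisation is left off because the original settings are not given, and the file names simply follow the ones used in this article.

```python
def rknn_config_kwargs():
    """Keyword arguments for RKNN.config (assumed values, not the author's settings)."""
    return {
        # Mirror the img / 255. normalisation of the onnxruntime script.
        "mean_values": [[0, 0, 0]],
        "std_values": [[255, 255, 255]],
        "target_platform": "rk3568",
    }


def convert(onnx_path="linknet.onnx", rknn_path="linknet.rknn"):
    """Convert an ONNX model to RKNN; requires rknn-toolkit2 to be installed."""
    from rknn.api import RKNN  # imported lazily so this file loads without the toolkit

    rknn = RKNN()
    try:
        rknn.config(**rknn_config_kwargs())
        if rknn.load_onnx(model=onnx_path) != 0:
            raise RuntimeError("load_onnx failed")
        # Quantisation is skipped here; enabling it needs a calibration dataset.
        if rknn.build(do_quantization=False) != 0:
            raise RuntimeError("build failed")
        if rknn.export_rknn(rknn_path) != 0:
            raise RuntimeError("export_rknn failed")
    finally:
        rknn.release()
```

Calling `convert()` on a machine with rknn-toolkit2 installed would write linknet.rknn next to the ONNX file.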

Create the post-processing flowunit

We need a post-processing flowunit to decode the model's inference results, again using the versatile create.py script:


rock@rock-3a:~/modelbox$ ./create.py -t python -n linknet_post -p virtual_background
sdk version is modelbox-rk-aarch64-1.5.3
success: create python linknet_post in /home/rock/modelbox/workspace/virtual_background/etc/flowunit/linknet_post


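As shown, the flowunit has been generated under the project's etc/flowunit directory, which holds the .toml configuration file and the .py code file:

Next, fill in the flowunit's logic. If you are not interested in the details, you can copy and paste the code from our resource package and skip this section.

First, complete the post-processing flowunit's configuration file. The post-processing flowunit needs to know the shape of the model's inference output, so the config section must be filled in. We also need to modify the inputs and outputs: the flowunit receives a float inference result and a uint8 original image, and outputs the new blended image: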


# Copyright (c) Huawei Technologies Co., Ltd. 2022. All rights reserved.

# Basic config
[base]
name = "linknet_post" # The FlowUnit name
device = "cpu" # The flowunit runs on cpu
version = "1.0.0" # The version of the flowunit
type = "python" # Fixed value, do not change
description = "description" # The description of the flowunit
entry = "linknet_post@linknet_postFlowUnit" # Python flowunit entry function
group_type = "Generic"  # flowunit group attribution, change as Input/Output/Image/Generic ...

# Flowunit Type
stream = false # Whether the flowunit is a stream flowunit
condition = false # Whether the flowunit is a condition flowunit
collapse = false # Whether the flowunit is a collapse flowunit
collapse_all = false # Whether the flowunit will collapse all the data
expand = false #  Whether the flowunit is a expand flowunit

# The default Flowunit config
[config]
mask_h = 256
mask_w = 256

# Input ports description
[input]
[input.input1] # Input port number, the format is input.input[N]
name = "in_mask" # Input port name
type = "float" # Input port type

[input.input2] # Input port number, the format is input.input[N]
name = "in_image" # Input port name
type = "uint8" # Input port type

# Output ports description
[output]
[output.output1] # Output port number, the format is output.output[N]
name = "out_image" # Output port name
type = "uint8" # Output port type


Post-processing code:


# Copyright (c) Huawei Technologies Co., Ltd. 2022. All rights reserved.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import _flowunit as modelbox
import numpy as np
import cv2


class linknet_postFlowUnit(modelbox.FlowUnit):
    # Derived from modelbox.FlowUnit
    def __init__(self):
        super().__init__()
        # Load the replacement background once, as RGB scaled to [0, 1]
        self.image_background = cv2.imread('data/R-C.jpg')
        self.image_background = cv2.cvtColor(self.image_background, cv2.COLOR_BGR2RGB)
        self.image_background = self.image_background / 255.

    def open(self, config):
        # Open the flowunit to obtain configuration information
        self.mask_h = config.get_int('mask_h', 256)
        self.mask_w = config.get_int('mask_w', 256)

        self.index = 0

        return modelbox.Status.StatusCode.STATUS_SUCCESS

    def process(self, data_context):
        # Process the data
        in_mask = data_context.input("in_mask")
        in_image = data_context.input("in_image")
        out_image = data_context.output("out_image")

        # linknet_post process code.
        for buffer_mask, buffer_image in zip(in_mask, in_image):
            # Read the attributes of the input Buffer
            width = buffer_image.get('width')
            height = buffer_image.get('height')
            channel = buffer_image.get('channel')

            # Convert the input Buffers to numpy objects
            image_background = cv2.resize(self.image_background, (width, height))
            mask_data = np.array(buffer_mask.as_object(), copy=False)
            mask_data = mask_data.reshape(self.mask_h, self.mask_w)
            mask_data = cv2.resize(mask_data, (width, height))
            mask_data = mask_data.reshape(height, width, 1)

            image_data = np.array(buffer_image.as_object(), dtype=np.uint8, copy=False)
            image_data = image_data.reshape(height, width, channel)

            # Foreground: the original frame weighted by the mask
            img_multi = mask_data * image_data / 255.

            # Background: the replacement image weighted by (1 - mask)
            mask_layer = np.ones((height, width))
            mask_layer = mask_layer.reshape(height, width, 1)
            mask_layer = mask_layer - mask_data
            mask_layer = image_background * mask_layer
            image_add = img_multi + mask_layer
            image_add = image_add * 255
            image_add = image_add.astype(np.uint8)

            # Frame counter
            self.index += 1
            print("linknet_post " + str(self.index))

            # Convert the processing result into a Buffer
            add_buffer = modelbox.Buffer(self.get_bind_device(), image_add)
            # Set the output Buffer's meta information by copying it from the input Buffer
            add_buffer.copy_meta(buffer_image)
            # Put the output Buffer into the output BufferList
            out_image.push_back(add_buffer)

        return modelbox.Status.StatusCode.STATUS_SUCCESS

    def close(self):
        # Close the flowunit
        return modelbox.Status()

    def data_pre(self, data_context):
        # Before streaming data starts
        return modelbox.Status()

    def data_post(self, data_context):
        # After streaming data ends
        return modelbox.Status()

    def data_group_pre(self, data_context):
        # Before all streaming data starts
        return modelbox.Status()

    def data_group_post(self, data_context):
        # After all streaming data ends
        return modelbox.Status()
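
The core of the post-processing step is a per-pixel alpha blend: out = m * frame + (1 - m) * background, where m is the predicted mask. The sketch below isolates that step as a standalone NumPy function so it can be tried without ModelBox. It is a simplified variant, not the flowunit's exact code: it assumes the mask and background are already resized to the frame size, works directly in the 0 to 255 range, and rounds instead of truncating.

```python
import numpy as np


def composite(mask, image, background):
    """Blend `image` over `background` using a soft foreground mask.

    mask:       float array (H, W), 1.0 = person, 0.0 = background
    image:      uint8 array (H, W, 3), the camera frame
    background: uint8 array (H, W, 3), the replacement background
    """
    # Add a channel axis so the (H, W, 1) mask broadcasts over the 3 colour channels.
    alpha = np.clip(mask, 0.0, 1.0)[..., np.newaxis]
    blended = alpha * image + (1.0 - alpha) * background
    return np.rint(blended).astype(np.uint8)
```

A mask of all ones returns the frame unchanged, a mask of all zeros returns the background, and values in between mix the two, which is what softens the edge around the person.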


Build the flow graph

Modify the contents of virtual_background.toml:


# Copyright (C) 2020 Huawei Technologies Co., Ltd. All rights reserved.

[driver]
dir = ["${HILENS_APP_ROOT}/etc/flowunit",
"${HILENS_APP_ROOT}/etc/flowunit/cpp",
"${HILENS_APP_ROOT}/model",
"${HILENS_MB_SDK_PATH}/flowunit"]
skip-default = true
[profile]
profile=true
trace=true
dir="${HILENS_DATA_DIR}/mb_profile"
[graph]
format = "graphviz"
graphconf = """digraph virtual_background {
    node [shape=Mrecord]
    queue_size = 1
    batch_size = 1

    # Define the nodes, i.e. the flowunits and their attributes
    input1[type=input, flowunit=input, device=cpu, deviceid=0]
    data_source_parser[type=flowunit, flowunit=data_source_parser, device=cpu, deviceid=0]
    video_demuxer[type=flowunit, flowunit=video_demuxer, device=cpu, deviceid=0]
    video_decoder[type=flowunit, flowunit=video_decoder, device=rknpu, deviceid=0, pix_fmt="rgb"]
    image_resize[type=flowunit, flowunit=resize, device=rknpu, deviceid=0, image_width=256, image_height=256]
    linknet_infer[type=flowunit, flowunit=linknet_infer, device=rknpu, deviceid=0]
    linknet_post[type=flowunit, flowunit=linknet_post, device=cpu, deviceid=0]
    video_out[type=flowunit, flowunit=video_out, device=rknpu, deviceid=0]

    # Define the edges, i.e. how data is passed between flowunits
    input1:input -> data_source_parser:in_data
    data_source_parser:out_video_url -> video_demuxer:in_video_url
    video_demuxer:out_video_packet -> video_decoder:in_video_packet
    video_decoder:out_video_frame -> image_resize:in_image
    image_resize:out_image -> linknet_infer:Input
    linknet_infer:Output -> linknet_post:in_mask
    video_decoder:out_video_frame -> linknet_post:in_image
    linknet_post:out_image -> video_out:in_video_frame
}"""
[flow]
desc = "virtual_background run in modelbox-rk-aarch64"
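
A misspelled node name in graphconf only shows up when the application starts. As a rough sketch, a small pre-flight check can confirm that every node referenced by an edge is also defined. `check_graph` is a hypothetical helper, not part of ModelBox, and its regular expressions assume the one-statement-per-line layout used in this article.

```python
import re


def check_graph(graphconf):
    """Return the set of node names used in edges but never defined."""
    # Node definitions look like:  name[type=..., flowunit=...]
    defined = set(re.findall(r"^\s*(\w+)\s*\[", graphconf, flags=re.MULTILINE))
    # `node [shape=Mrecord]` sets a Graphviz default and is not a flowunit.
    defined.discard("node")
    # Edges look like:  src:out_port -> dst:in_port
    edges = re.findall(r"(\w+):\w+\s*->\s*(\w+):\w+", graphconf)
    used = {name for edge in edges for name in edge}
    return used - defined
```

An empty set means all edge endpoints are defined; it does not verify port names, which would require each flowunit's own configuration.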


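Here, setting the profile field to true enables performance statistics.

Run the application

The application's input and output can be configured in the project's bin/mock_task.toml:

After configuring the input and output, you can enter the project directory to build and run it:

Execute bin/main.sh to run the application; the generated video and the performance statistics file are in the hilens_data_dir folder:

You can right-click to download the performance statistics file and inspect it:

Model inference is indeed fast: every 4 inferences take about 223 ms on average, roughly 56 ms per frame, which works out to about 18 fps.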
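
Timings like these can also be computed offline instead of being read off the chrome://tracing view. The sketch below assumes that the performance statistics file follows the Chrome Trace Event format and records executions as complete events (`"ph": "X"` with a `dur` field in microseconds). That layout is an assumption about ModelBox's output; a file that uses begin/end event pairs would need a different loop.

```python
import json
from collections import defaultdict


def mean_durations_ms(trace_text):
    """Average duration in ms per event name, for complete ("X") trace events."""
    data = json.loads(trace_text)
    # The format allows either a bare event list or an object with "traceEvents".
    events = data["traceEvents"] if isinstance(data, dict) else data
    totals = defaultdict(lambda: [0.0, 0])
    for event in events:
        if event.get("ph") == "X" and "dur" in event:
            entry = totals[event["name"]]
            entry[0] += event["dur"] / 1000.0  # dur is in microseconds
            entry[1] += 1
    return {name: total / count for name, (total, count) in totals.items()}
```

Passing the text of the downloaded file to `mean_durations_ms` would give one average per event name, which can then be divided by the batch size to get a per-frame figure.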

Camera detection on the development board

Create the virtual_background_cameral.toml file and edit its contents as follows:


# Copyright (C) 2020 Huawei Technologies Co., Ltd. All rights reserved.

[driver]
dir = ["${HILENS_APP_ROOT}/etc/flowunit",
"${HILENS_APP_ROOT}/etc/flowunit/cpp",
"${HILENS_APP_ROOT}/model",
"${HILENS_MB_SDK_PATH}/flowunit"]
skip-default = true
[profile]
profile=false
trace=false
dir="${HILENS_DATA_DIR}/mb_profile"
[graph]
format = "graphviz"
graphconf = """digraph virtual_background {
    node [shape=Mrecord]
    queue_size = 1
    batch_size = 1

    # Define the nodes, i.e. the flowunits and their attributes
    input1[type=input, flowunit=input, device=cpu, deviceid=0]
    data_source_parser[type=flowunit, flowunit=data_source_parser, device=cpu, deviceid=0]
    local_camera[type=flowunit, flowunit=local_camera, device=rknpu, deviceid=0, pix_fmt="rgb", cam_width=1280, cam_height=720]
    image_resize[type=flowunit, flowunit=resize, device=rknpu, deviceid=0, image_width=256, image_height=256]
    linknet_infer[type=flowunit, flowunit=linknet_infer, device=rknpu, deviceid=0]
    linknet_post[type=flowunit, flowunit=linknet_post, device=cpu, deviceid=0]
    video_out[type=flowunit, flowunit=video_out, device=rknpu, deviceid=0]

    # Define the edges, i.e. how data is passed between flowunits
    input1:input -> data_source_parser:in_data
    data_source_parser:out_video_url -> local_camera:in_camera_packet
    local_camera:out_camera_frame -> image_resize:in_image
    image_resize:out_image -> linknet_infer:Input
    linknet_infer:Output -> linknet_post:in_mask
    local_camera:out_camera_frame -> linknet_post:in_image
    linknet_post:out_image -> video_out:in_video_frame
}"""
[flow]
desc = "virtual_background run in modelbox-rk-aarch64"


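Open the bin/mock_task.toml file in the project directory and modify the task input and task output:

Execute bin/main.sh camera to run the application, and stream the output over rtsp to view it locally:

This completes the development of the camera virtual-background AI application and its deployment on Windows and on the RK3568 development board. All resources needed for this case (code, model, test data, etc.) can be downloaded from the network-drive link, so if you are interested, download them and give it a try!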


Original link: http://xie.infoq.cn/article/6fe0621731e578a94b6356e97 . Please contact the author before reposting this article.
