
Use OpenVINO to Deploy ONNX Models

LattePanda 2020-11-30 16:35:35

Hello, fellow panda lovers!

Here is yet another wonderful post from the phenomenal AI Developer Contest that DFRobot held back in August. This post has just been translated from the original Chinese to English for your convenience, but the original post by community user Lao Xu can be found here. When reposting this article, please give credit where credit is due, and please enjoy!
 

Using OpenVINO to Deploy ONNX Models

When designing, training, and deploying deep learning networks, developers often run into problems caused by differences between operating systems, deep learning frameworks, deployment hardware, and software versions. These incompatibilities cause great inconvenience.

Using OpenVINO together with ONNX addresses this problem: models from different frameworks can be deployed quickly to different hardware.

I recently participated in the "Intel® OpenVINO™ Alliance DFRobot Industry AI Developer Competition" event. The organizer provided participants with a LattePanda and Intel Neural Compute Stick NCS2. The data listed in this article is all obtained by running programs on this platform.

Hardware Piece 1: LattePanda Delta (LattePanda)

It uses Intel's new N-series Celeron 4-core processor, running at up to 2.40 GHz, with 4 GB of memory and built-in Bluetooth and WiFi modules. It supports USB 3.0, HDMI video output, a 3.5 mm audio interface, a 100/1000 Mbps Ethernet port, and an additional MicroSD expansion card slot. An Arduino Leonardo single-chip microcontroller is integrated into the board, which allows the board to expand its functions by interfacing with various sensor modules, and the main board supports both Windows and Linux. It is an excellent choice in terms of function and price.

Hardware Piece 2: Intel Neural Compute Stick 2 (NCS2)

It is built around the Intel® Movidius™ Myriad™ X VPU, has a USB 3.1 Type-A interface, and supports models from TensorFlow, Caffe, MXNet, ONNX, and PyTorch/PaddlePaddle (via ONNX).

Software environment: OpenVINO, Ubuntu, Windows® 10

First, here is a picture of the hardware. It's really compact. I connected a mouse, keyboard, monitor, USB camera and Bluetooth speakers. Now, let's take a look at the performance of this palm-sized computer!

Why choose ONNX and OpenVINO?

The Open Neural Network Exchange (ONNX) is an open file format designed for machine learning and used to store trained models. It enables different artificial intelligence frameworks (such as PyTorch and MXNet) to store model data in the same format and exchange models with each other.

The ONNX specification and code are developed jointly, primarily by companies such as Microsoft, Amazon, Facebook and IBM, and are hosted on GitHub as open source. Currently, the deep learning frameworks and inference engines that officially support loading ONNX models are Caffe2, PyTorch, MXNet, ML.NET, TensorRT and Microsoft CNTK; TensorFlow also supports ONNX unofficially. (Source: Wikipedia)

OpenVINO is a toolkit recently launched by Intel for optimizing and accelerating deep learning-based computer vision. It supports compressing, optimizing, and accelerating models from other machine learning platforms. It mainly consists of two core components and a library of pre-trained models:

OpenVINO Core Components - The Model Optimizer

The deep learning frameworks supported by the Model Optimizer include:

- ONNX
- TensorFlow
- Caffe
- MXNet

OpenVINO's Core Components - The Inference Engine

The Inference Engine supports the accelerated operations of deep learning models at the hardware instruction set level. At the same time, the traditional OpenCV image processing library has also been optimized for the instruction set, which has significantly improved performance and speed. The supported hardware platforms include the following:

- CPU
- GPU
- FPGA
- MYRIAD (Intel Neural Compute Stick)
- HDDL
- GNA

ONNX is a kind of universal currency. Developers can save their own developed and trained models as ONNX files; and deployment engineers can use OpenVINO to deploy ONNX on different hardware platforms without worrying about which kind of framework the developer used.

Therefore, as long as your model can be converted to ONNX, OpenVINO can deploy it efficiently and quickly on Intel's CPUs, GPUs, Neural Compute Sticks, or even FPGAs. Model development, training, and deployment can then be separated, so you are no longer troubled by differing software and hardware development environments, which greatly improves deployment efficiency.
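To make the "same model, different hardware" idea concrete, here is a minimal sketch of choosing a target device at run time. The preference order and the `pick_device` helper are my own assumptions for illustration, not part of OpenVINO. In a real script the list of devices would come from `IECore().available_devices`; a hard-coded list stands in here so the sketch runs without OpenVINO installed. Device names can carry a suffix when several devices of the same type are attached, which this sketch does not handle.

```python
from typing import Iterable, Sequence

# Assumed preference order (illustrative only): try the Neural Compute Stick
# first, then the integrated GPU, then fall back to the CPU.
PREFERRED_DEVICES = ("MYRIAD", "GPU", "CPU")


def pick_device(available: Iterable[str],
                preferred: Sequence[str] = PREFERRED_DEVICES) -> str:
    """Return the first preferred device name that appears in `available`."""
    available_set = set(available)
    for name in preferred:
        if name in available_set:
            return name
    raise RuntimeError("none of the preferred devices is available")


# With OpenVINO installed, the list would be obtained like this:
#   from openvino.inference_engine import IECore
#   available = IECore().available_devices
print(pick_device(["CPU", "GPU"]))
print(pick_device(["CPU", "MYRIAD"]))
```

The returned string can then be passed as `device_name` when loading the network, so the rest of the program does not need to change from one device to another.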

How to Deploy an ONNX Model Using OpenVINO

First, you need to install OpenVINO...

I am using OpenVINO 2020.4, the latest version at the time of writing. For specific installation settings, please refer to the following link: https://docs.openvinotoolkit.org/2020.4/index.html

... then, you need to have an ONNX model.

You can download public ONNX files trained by others, or convert models from other frameworks. Of course, you can also train a model yourself and save it in ONNX format. This article is for those who choose to train the model themselves.

I used the MNIST handwritten digit data set together with a printed character data set to build and train a convolutional neural network. It reached an accuracy of 96.4%, and I saved the trained model as an ONNX file:

Use mo.py to Optimize the ONNX Model:

The commands are as follows:

python "C:\Program Files (x86)\IntelSWTools\openvino\deployment_tools\model_optimizer\mo.py" --input_model=Xubett964.onnx --output_dir=. --model_name=Xubett964.fp16 --data_type=FP16

Model Optimizer arguments:

Common parameters:

- Path to the Input Model: C:\openVINOdemo\shuatiP\Xubett964.onnx

- Path for generated IR: C:\openVINOdemo\shuatiP\.

- IR output name: Xubett964.fp16

- Log level: ERROR

- Batch: Not specified, inherited from the model

- Input layers: Not specified, inherited from the model

- Output layers: Not specified, inherited from the model

- Input shapes: Not specified, inherited from the model

- Mean values: Not specified

- Scale values: Not specified

- Scale factor: Not specified

- Precision of IR: FP16

- Enable fusing: True

- Enable grouped convolutions fusing: True

- Move mean values to preprocess section: False

- Reverse input channels: False

ONNX specific parameters:

Model Optimizer version:

[ SUCCESS ] Generated IR version 10 model.

[ SUCCESS ] XML file: C:\openVINOdemo\shuatiP\.\Xubett964.fp16.xml

[ SUCCESS ] BIN file: C:\openVINOdemo\shuatiP\.\Xubett964.fp16.bin

[ SUCCESS ] Total execution time: 11.34 seconds.
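If you need to convert the same model more than once (for example FP16 for the Neural Compute Stick and FP32 for comparison), the command above can be assembled in a small script. This is only a sketch: `build_mo_command` is a hypothetical helper of mine, the `mo.py` path is copied from the command above and will differ on other installations, and it uses only the four flags shown above. It prints the commands rather than running them.

```python
import sys
from pathlib import PureWindowsPath

# Path copied from the command above; adjust it for your own installation.
MO_SCRIPT = PureWindowsPath(
    r"C:\Program Files (x86)\IntelSWTools\openvino"
    r"\deployment_tools\model_optimizer\mo.py"
)


def build_mo_command(onnx_file, data_type="FP16", output_dir="."):
    """Build the Model Optimizer argument list for one ONNX file.

    The IR name follows the naming used above: <model>.<precision in lowercase>.
    """
    stem = PureWindowsPath(onnx_file).stem  # "Xubett964.onnx" -> "Xubett964"
    return [
        sys.executable,
        str(MO_SCRIPT),
        f"--input_model={onnx_file}",
        f"--output_dir={output_dir}",
        f"--model_name={stem}.{data_type.lower()}",
        f"--data_type={data_type}",
    ]


for precision in ("FP16", "FP32"):
    print(" ".join(build_mo_command("Xubett964.onnx", precision)))
```

To actually run a conversion, the returned list can be handed to `subprocess.run(cmd, check=True)`.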

Use benchmark_app.py to Evaluate the Above Model:

The command below benchmarks the model on the CPU. To evaluate the GPU or the Neural Compute Stick instead, change the device parameter to -d GPU or -d MYRIAD. The command is as follows:

python "C:\Program Files (x86)\IntelSWTools\openvino\deployment_tools\tools\benchmark_tool\benchmark_app.py" -m Xubett964.fp16.xml -i images\image1.png -d CPU

[Step 1/11] Parsing and validating input arguments

C:\Program Files (x86)\IntelSWTools\openvino\python\python3.6\openvino\tools\benchmark\main.py:29: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead

logger.warn(" -nstreams default value is determined automatically for a device. "

[ WARNING ] -nstreams default value is determined automatically for a device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.

[Step 2/11] Loading Inference Engine

[ INFO ] InferenceEngine:

API version............. 2.1.2020.4.0-359-21e092122f4-releases/2020/4

[ INFO ] Device info

CPU

MKLDNNPlugin............ version 2.1

Build................... 2020.4.0-359-21e092122f4-releases/2020/4

[Step 3/11] Setting device configuration

[ WARNING ] -nstreams default value is determined automatically for CPU device. Although the automatic selection usually provides a reasonable performance,but it still may be non-optimal for some cases, for more information look at README.

[Step 4/11] Reading the Intermediate Representation network

[ INFO ] Read network took 31.25 ms

[Step 5/11] Resizing network to match image sizes and given batch

[ INFO ] Network batch size: 1

[Step 6/11] Configuring input of the model

[Step 7/11] Loading the model to the device

[ INFO ] Load network took 147.20 ms

[Step 8/11] Setting optimal runtime parameters

[Step 9/11] Creating infer requests and filling input blobs with images

[ INFO ] Network input 'imageinput' precision FP32, dimensions (NCHW): 1 1 28 28

C:\Program Files (x86)\IntelSWTools\openvino\python\python3.6\openvino\tools\benchmark\utils\inputs_filling.py:76: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead

",".join(BINARY_EXTENSIONS)))

[ WARNING ] No supported binary inputs found! Please check your file extensions: BIN

[ WARNING ] Some image input files will be ignored: only 0 files are required from 1

[ INFO ] Infer Request 0 filling

[ INFO ] Fill input 'imageinput' with random values (some binary data is expected)

[ INFO ] Infer Request 1 filling

[ INFO ] Fill input 'imageinput' with random values (some binary data is expected)

[ INFO ] Infer Request 2 filling

[ INFO ] Fill input 'imageinput' with random values (some binary data is expected)

[ INFO ] Infer Request 3 filling

[ INFO ] Fill input 'imageinput' with random values (some binary data is expected)

[Step 10/11] Measuring performance (Start inference asyncronously, 4 inference requests using 4 streams for CPU, limits: 60000 ms duration)

[Step 11/11] Dumping statistics report

Count: 768280 iterations

Duration: 60002.68 ms

Latency: 0.28 ms

Throughput: 12804.09 FPS
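When comparing CPU, GPU and MYRIAD runs, it helps to pull the four summary numbers out of the report automatically. The sketch below parses the summary lines shown above with a regular expression and cross-checks the throughput as iterations divided by duration (batch size is 1 here). `parse_report` and the regular expression are my own illustration, not part of the benchmark tool.

```python
import re

# Summary lines copied from the CPU run above.
REPORT = """\
Count: 768280 iterations
Duration: 60002.68 ms
Latency: 0.28 ms
Throughput: 12804.09 FPS
"""

# Matches "<Name>: <number>" at the start of a line.
_LINE = re.compile(r"^(Count|Duration|Latency|Throughput):\s+([\d.]+)",
                   re.MULTILINE)


def parse_report(text):
    """Return the summary values as a dict with lowercase keys."""
    return {key.lower(): float(value) for key, value in _LINE.findall(text)}


stats = parse_report(REPORT)

# Throughput cross-check: iterations per second with batch size 1.
recomputed_fps = stats["count"] / (stats["duration"] / 1000.0)

print(stats)
print("recomputed throughput: {:.2f} FPS".format(recomputed_fps))
```

The recomputed value agrees with the reported throughput, which confirms how the FPS figure is derived. Running the benchmark once per device and parsing each report this way gives directly comparable numbers.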

Write Programs for Inference Engine Verification

You can refer to OpenVINO's own examples. Python examples are in the following path:

C:\Program Files (x86)\IntelSWTools\openvino\inference_engine\samples\python

Or you can also refer to my code.


# Import the IE, OpenCV, NumPy and time modules
from openvino.inference_engine import IECore, IENetwork
import cv2
import numpy as np
from time import time

# Configure the inference device, the IR file paths and the image path
DEVICE = 'MYRIAD'
#DEVICE = 'CPU'
model_xml = 'C:/openVINOdemo/shuatiP/Xubett964.fp16.xml'
model_bin = 'C:/openVINOdemo/shuatiP/Xubett964.fp16.bin'
image_file = 'C:/openVINOdemo/shuatiP/images/image5.png'
labels_map = ["+", "-", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "=", "×", "÷"]

# Initialize the plugin and print its version number
ie = IECore()
ver = ie.get_versions(DEVICE)[DEVICE]
print("{descr}: {maj}.{min}.{num}".format(descr=ver.description, maj=ver.major, min=ver.minor, num=ver.build_number))

# Read the IR model file
net = ie.read_network(model=model_xml, weights=model_bin)

# Prepare the input and output tensors
print("Preparing input blobs")
input_blob = next(iter(net.inputs))
out_blob = next(iter(net.outputs))
net.batch_size = 1

# Load the model to the AI inference device
print("Loading IR to the plugin...")
exec_net = ie.load_network(network=net, num_requests=1, device_name=DEVICE)

# Read the picture as a grayscale image
n, c, h, w = net.inputs[input_blob].shape
image = cv2.imread(image_file,0)

# Run inference
print("Starting inference in synchronous mode")
start = time()
res = exec_net.infer(inputs={input_blob: image})
end = time()
print("Infer Time:{}ms".format((end-start)*1000))

# Process the output
print("Processing output blob")
res = res[out_blob]
indx=np.argmax(res)
label=labels_map[indx]
print("Label: ",label)
print("Inference is completed")

# Display the processing result
cv2.imshow("Detection results",image)
cv2.waitKey(0)
cv2.destroyAllWindows()

Run digit_detector.py:

python digit_detector.py

myriadPlugin: 2.1.2020.4.0-359-21e092122f4-releases/2020/4

Preparing input blobs

digit_detector.py:27: DeprecationWarning: 'inputs' property of IENetwork class is deprecated. To access DataPtrs user need to use 'input_data' property of InputInfoPtr objects which can be accessed by 'input_info' property.

input_blob = next(iter(net.inputs))

Loading IR to the plugin...

Starting inference in synchronous mode

Infer Time:8.739948272705078ms

Processing output blob

Label: 7

Inference is completed

As shown in the figure above, image5.png is a picture of 7, and the inference result shows 7, indicating that the inference is correct.
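The script above prints only the single best label. When checking a model, it can be useful to see the runner-up classes too. Here is a small sketch in plain Python that ranks the scores and returns the top few labels. The `top_k` helper and the example scores are made up for illustration; they are not output from my model. It assumes the network output has been flattened into one score per entry of `labels_map`.

```python
# Same label order as labels_map in the script above.
LABELS = ["+", "-", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "=", "×", "÷"]


def top_k(scores, labels=LABELS, k=3):
    """Return the k highest-scoring (label, score) pairs, best first."""
    if len(scores) != len(labels):
        raise ValueError("expected one score per label")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return [(label, score) for score, label in ranked[:k]]


# Made-up scores for illustration only.
example_scores = [0.0] * len(LABELS)
example_scores[9] = 0.90  # index 9 is the label "7"
example_scores[3] = 0.07  # index 3 is the label "1"
example_scores[4] = 0.03  # index 4 is the label "2"

for label, score in top_k(example_scores):
    print("{}: {:.2f}".format(label, score))
```

In the script above, the input to this helper would be something like the flattened contents of `res[out_blob]` converted to a list.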

Summary

A model file in ONNX format can serve as a bridge between different deep learning frameworks. OpenVINO provides an optimized deployment path for ONNX models, so they can be deployed quickly to Intel hardware such as the LattePanda and the Intel Neural Compute Stick 2 (NCS2), which speeds up the journey from a trained deep learning model to inference in a real application.

Lao Xu 2020.7