OpenVINO Running on LattePanda 3 Delta Single Board Computer (3) - NLP

Introduction

The OpenVINO platform offers a rich set of functionalities and model libraries for Natural Language Processing (NLP), including natural language generation, interactive question answering, and machine translation, among others. One of the highly regarded projects is the BERT-large interactive question answering model. BERT (Bidirectional Encoder Representations from Transformers) is a pre-trained language model based on the Transformer architecture, which comprehends word semantics in a bidirectional context, resulting in better language understanding. BERT-large is an extended version of BERT with more parameters and enhanced capabilities, suitable for more complex question answering and text processing tasks.

With OpenVINO, the BERT-large interactive question answering model runs efficiently: hardware acceleration and model optimization shorten inference time enough to meet the demands of real-time applications. OpenVINO also provides easy-to-use APIs and tools that make it simpler for developers to build and deploy text processing applications.

213-Question-answering

This project uses a small BERT-large model quantized to INT8 and runs it with OpenVINO for interactive question answering over a supplied text, similar to an embedding-based lookup.

The project link:

https://github.com/openvinotoolkit/openvino_notebooks/tree/main/notebooks/213-question-answering

Testing Steps:

1. Import Necessary Libraries: Import the required libraries, including NumPy and the Core class from OpenVINO Runtime.

2. Download Models: Use the omz_downloader command-line tool to download the required model, specifying the model name, download path, and precision.

3. Load Model: Create the OpenVINO Runtime object, read the network architecture and weights from the .xml and .bin files, and compile the model for the target device.

4. Prepare Input Data: Convert the question and context into the token sequences the model expects, padding and splitting them to fit the model's input size.

5. Run Inference: Iterate over the input contexts and pass each prepared input to the compiled model.

6. Post-processing: Process the raw model outputs, applying the softmax function to obtain probability distributions and selecting the best answer.

7. Main Processing Function: Run the question-answering system for a given knowledge base and question.

8. Execution: Run the question-answering system on a local paragraph, modifying the source paragraph and question as needed.
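Steps 3 to 5 can be sketched roughly as follows. This is a minimal illustration, not the notebook's code: the helper names (build_inputs, load_compiled_model, run_inference) are hypothetical, the special-token IDs 101 and 102 assume the standard uncased BERT vocabulary, and the three input names and the int32 dtype are assumptions that should be checked against the downloaded model (some BERT variants also expect a position_ids input). Tokenizing the text into IDs is left out.

```python
from typing import Dict, List


def build_inputs(question_ids: List[int], context_ids: List[int], seq_len: int,
                 cls_id: int = 101, sep_id: int = 102,
                 pad_id: int = 0) -> Dict[str, List[int]]:
    """Lay out [CLS] question [SEP] context [SEP] and pad to seq_len.

    cls_id / sep_id / pad_id are assumed values for an uncased BERT vocabulary.
    """
    input_ids = [cls_id] + question_ids + [sep_id] + context_ids + [sep_id]
    if len(input_ids) > seq_len:
        # A real pipeline would split the context into several windows instead.
        raise ValueError("question + context do not fit into the model input")
    # Segment 0 covers [CLS] + question + [SEP]; segment 1 covers context + [SEP].
    token_type_ids = [0] * (len(question_ids) + 2) + [1] * (len(context_ids) + 1)
    attention_mask = [1] * len(input_ids)
    pad = seq_len - len(input_ids)
    return {
        "input_ids": input_ids + [pad_id] * pad,
        "attention_mask": attention_mask + [0] * pad,
        "token_type_ids": token_type_ids + [0] * pad,
    }


def load_compiled_model(model_xml: str, device: str = "CPU"):
    """Read an IR model (.xml with its .bin alongside) and compile it for a device."""
    # Imported here so build_inputs stays usable without OpenVINO installed.
    # Newer OpenVINO releases also expose the same class as openvino.Core.
    from openvino.runtime import Core

    core = Core()
    model = core.read_model(model=model_xml)
    return core.compile_model(model=model, device_name=device)


def run_inference(compiled_model, inputs: Dict[str, List[int]]):
    """Feed one padded sample (batch size 1) and return the raw output arrays."""
    import numpy as np

    # Assumption: the model takes int32 tensors under these input names.
    feed = {name: np.array([values], dtype=np.int32) for name, values in inputs.items()}
    results = compiled_model(feed)
    return [results[output] for output in compiled_model.outputs]
```

With OpenVINO installed and a model downloaded, the three helpers would be chained as load_compiled_model, then build_inputs for each context window, then run_inference. Keeping the padding logic free of OpenVINO calls makes it easy to check on its own.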

Test Results:

Accuracy is high, and the data source can be either plain text or a web link.

Response time is fast: under 1 second for 10 tokens.

If the data source does not contain the answer to the question, the model returns an arbitrary answer. Ending the question with a question mark is recommended.
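The behaviour on unanswerable questions plausibly follows from how post-processing (step 6) works: the best-scoring answer span is returned even when every candidate scores poorly. The sketch below illustrates the idea in plain Python. It is an illustration under assumptions, not the notebook's implementation: the function names, the product-of-probabilities score, and the min_score threshold are all hypothetical choices made for this example.

```python
import math
from typing import List, Optional, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Turn raw logits into a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def best_span(start_logits: List[float], end_logits: List[float],
              context_start: int, context_end: int,
              max_answer_len: int = 30) -> Tuple[int, int, float]:
    """Pick the (start, end) token pair inside the context with the highest score.

    The score is P(start) * P(end), one common choice for extractive QA.
    context_end is exclusive; the end token is never before the start token.
    """
    p_start = softmax(start_logits)
    p_end = softmax(end_logits)
    best = (context_start, context_start, 0.0)
    for s in range(context_start, context_end):
        for e in range(s, min(s + max_answer_len, context_end)):
            score = p_start[s] * p_end[e]
            if score > best[2]:
                best = (s, e, score)
    return best


def answer_or_none(span: Tuple[int, int, float],
                   min_score: float = 0.5) -> Optional[Tuple[int, int]]:
    """Hypothetical guard: reject spans whose score falls below a threshold."""
    start, end, score = span
    return (start, end) if score >= min_score else None
```

When the logits are nearly flat, as can happen when the text holds no answer, the winning span still exists but its score is low, so a threshold like min_score gives the caller a way to report "no answer found" instead of an arbitrary one. A suitable threshold value would need to be tuned on real questions.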

Summary

The testing covered four text processing projects: 213-Question-answering, 240-dolly-2-instruction-following, 214-grammar-correction, and 221-machine-translation. Two related AIGC projects, 241-riffusion-text-to-music and 225-stable-diffusion-text-to-image, were also tested. The LattePanda 3 Delta's performance may not meet the requirements of the more demanding projects, leading to kernel crashes during model invocation or execution. For projects of this kind, a more powerful board (e.g., the LattePanda Sigma) is recommended.

The table below summarizes which projects run on the LattePanda 3 Delta:

Project	Runs on 3 Delta
213-Question-answering	Yes
240-dolly-2-instruction-following	No
221-machine-translation	No
241-riffusion-text-to-music	No
225-stable-diffusion-text-to-image	No

If you are interested in how LLMs perform on a single board computer (SBC), see the article on running large language models on the LattePanda Sigma SBC.

In addition to the mentioned projects, OpenVINO offers a wealth of other functionalities. You might be interested in exploring the following articles:

OpenVINO Running on LattePanda 3 Delta Single Board Computer (1) - Object Detection

OpenVINO Running on LattePanda 3 Delta Single Board Computer (2) - Text Recognition

OpenVINO Running on LattePanda 3 Delta Single Board Computer (4) - Pose Estimation

OpenVINO Running on LattePanda 3 Delta Single Board Computer (5) - Audio Processing