5. YOLOv5 + TensorRT acceleration (Jetson)

tensorrtx source code: https://github.com/wang-xinyu/tensorrtx

Official installation guide: https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html

Official developer guide: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html

5.1. Introduction

TensorRT is a high-performance deep learning inference optimizer that provides low-latency, high-throughput inference for deployed deep learning applications. It can be used to accelerate inference in hyperscale data centers, on embedded platforms, and on autonomous-driving platforms. TensorRT now supports almost all mainstream deep learning frameworks, including TensorFlow, Caffe, MXNet, and PyTorch; combined with NVIDIA GPUs, it enables fast and efficient deployment and inference for models from virtually any of these frameworks.

Can TensorRT speed up models?

Yes. According to the official documentation, TensorRT can provide a 10x or even 100x speedup compared with running the same model directly in a framework on CPU or GPU; in practice, a speedup on the order of 20x is typical.

TensorRT is only responsible for the inference stage of a model: it optimizes an already trained model and is generally not used for training. Because TensorRT is purely an inference optimizer, once your network is trained you can hand the trained model file directly to TensorRT, without depending on the original deep learning framework (Caffe, TensorFlow, etc.).

TensorRT can be thought of as a deep learning framework that only performs forward propagation. It can parse network models from frameworks such as Caffe and TensorFlow, map each layer one by one to the corresponding TensorRT layer, and in this way convert models from other frameworks into a unified TensorRT representation. Within TensorRT, optimization strategies tailored to NVIDIA GPUs are then applied to accelerate deployment.
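As a general illustration of this conversion step (the tensorrtx project used later in this chapter takes a different route and rebuilds the network through the TensorRT API), a trained model exported to ONNX can be turned into a serialized engine with the trtexec tool that ships with TensorRT. The file names below are placeholders:

```bash
# Minimal sketch: build a TensorRT engine from an ONNX export of a trained
# model using trtexec (bundled with TensorRT). File names are placeholders.
trtexec --onnx=yolov5s.onnx \
        --saveEngine=yolov5s.engine \
        --fp16                        # enable FP16 where the GPU supports it
```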

5.2. Usage

Real-time monitoring of the detection results is supported through a web page.

View node information

Print detection information

The detection output is printed to the terminal.
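The exact node, topic, and launch names belong to the yahboomcar_yolov5 package and are not listed in this section, so the commands below are only a hedged sketch (ROS 1 tooling assumed; the node and topic names are hypothetical):

```bash
# View node information (ROS 1 tools; the node name below is an assumption).
rosnode list
rosnode info /yolov5_node

# Print the detection information published by the node
# (the topic name below is an assumption).
rostopic list
rostopic echo /yolov5/detections

# On ROS 2 the equivalents are: ros2 node list / ros2 topic echo <topic>
```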

5.3. TensorRT deployment process

5.3.1. Generate .wts file

For example: yolov5s.wts

Copy ~/software/tensorrtx-yolov5-v7.0/gen_wts.py and the weight file [yolov5-5.0/weights/yolov5s.pt] (a weight file downloaded from Baidu also works; [yolov5s.pt] is used as the example) into the [~/software/yolov5] folder, then run gen_wts.py in that directory to generate the .wts file.
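As a sketch, the copy-and-convert step can be done as follows, assuming the directory layout above and the usual -w/-o arguments of gen_wts.py from the tensorrtx repository:

```bash
# Copy the conversion script and the trained weights into ~/software/yolov5
# (the yolov5-5.0 path below is assumed to sit under ~/software).
cp ~/software/tensorrtx-yolov5-v7.0/gen_wts.py ~/software/yolov5/
cp ~/software/yolov5-5.0/weights/yolov5s.pt ~/software/yolov5/

# Generate the .wts file from the .pt weights.
cd ~/software/yolov5
python gen_wts.py -w yolov5s.pt -o yolov5s.wts
```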

Copy the generated [yolov5s.wts] file to the ~/software/tensorrtx-yolov5-v7.0/yolov5 directory and compile tensorrtx there.

Modify kNumClass in config.h to match your own number of classes; because the official model is trained on the COCO dataset, the default is 80.

Run make to compile. (Every time kNumClass is modified, you must run make again; see the sketch below.)
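A sketch of the compile step, assuming the usual tensorrtx CMake layout with a build/ subdirectory (the exact location of config.h may differ between tensorrtx versions):

```bash
cd ~/software/tensorrtx-yolov5-v7.0/yolov5

# First edit kNumClass in src/config.h to your own number of classes.

# Build; re-run make after every change to kNumClass.
mkdir -p build && cd build
cmake ..
make -j"$(nproc)"
```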

5.3.2. Generate .engine file

Generate the .engine file (yolov5s is used here, so the command ends with the model type s), for example:
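A sketch of the serialization command, assuming the v7.0 tensorrtx detection binary name (yolov5_det; older releases call it simply yolov5) and that yolov5s.wts was copied into the yolov5 directory as described above:

```bash
cd ~/software/tensorrtx-yolov5-v7.0/yolov5/build

# -s serializes a .wts model into a TensorRT .engine; the trailing "s"
# selects the yolov5s variant (n/s/m/l/x are the usual choices).
./yolov5_det -s ../yolov5s.wts yolov5s.engine s
```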

You can test with the provided Python script; to do so, you need to create a yaml file.

For example: trash.yaml
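The exact schema expected by the test script is not reproduced in this section, so the following is only a hypothetical example that assumes the yaml simply lists the class count and class names of the custom model:

```bash
# Hypothetical trash.yaml; the schema and class names are assumptions and
# must be adapted to what the test script actually reads.
cat > trash.yaml << 'EOF'
nc: 4
names: ["recyclable", "kitchen_waste", "hazardous", "other"]
EOF
```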

5.4. ROS deployment

Copy the generated [yolov5s.engine] and [libmyplugins.so] files into the [param/OrinNX] folder of the [yahboomcar_yolov5] package, for example:
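For instance (the workspace path below is an assumption; adjust it to wherever the yahboomcar_yolov5 package actually lives):

```bash
# Copy the serialized engine and the plugin library built by tensorrtx
# into the package's param/OrinNX folder (workspace path is assumed).
cp ~/software/tensorrtx-yolov5-v7.0/yolov5/build/yolov5s.engine \
   ~/software/tensorrtx-yolov5-v7.0/yolov5/build/libmyplugins.so \
   ~/yahboomcar_ws/src/yahboomcar_yolov5/param/OrinNX/
```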

When using it, just follow the steps in [5.2].

Write the yaml file.

For example: coco.yaml
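As with trash.yaml above, the exact schema read by yolov5.py is an assumption; a shortened coco.yaml might look like this (only the first few of the 80 COCO class names are shown):

```bash
# Hypothetical coco.yaml; schema assumed, names list truncated for brevity
# (the real COCO model has 80 classes).
cat > coco.yaml << 'EOF'
nc: 80
names: ["person", "bicycle", "car", "motorcycle", "airplane", "bus", "train"]
EOF
```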

If any of the file names change, you need to modify the [yolov5.py] file accordingly; open it from its location in the package.

As shown below, the [file_yaml], [PLUGIN_LIBRARY], and [engine_file_path] parameters must match the file names actually used.
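A quick way to check that correspondence (the workspace layout and the location of yolov5.py are assumptions):

```bash
# The three variables in yolov5.py must point at files that really exist
# under param/OrinNX (workspace path below is an assumption).
cd ~/yahboomcar_ws/src/yahboomcar_yolov5
ls param/OrinNX      # expect: coco.yaml  libmyplugins.so  yolov5s.engine
grep -nE "file_yaml|PLUGIN_LIBRARY|engine_file_path" scripts/yolov5.py
```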