Target detection inference

1.Using DetectNet to locate objects

The previous recognition example output class probabilities for the entire input image. Next, we will focus on object detection, which finds the positions of objects in the frame by extracting their bounding boxes. Unlike image classification, an object detection network can detect many different objects per frame.

The detectNet object accepts an image as input and outputs a list of coordinates for the detected bounding boxes, along with their classes and confidence values. detectNet can be used from Python and C++. For the various pre-trained detection models available for download, please refer to the following text. The default model is the 91-class SSD-Mobilenet-v2 model trained on the MS COCO dataset, which achieves real-time inference performance using TensorRT on Jetson.
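To make the shape of that output concrete, here is a minimal sketch in Python. It assumes the attribute names exposed by the upstream jetson-inference Python bindings (ClassID, Confidence, Left, Top, Right, Bottom) and the module names jetson_inference / jetson_utils used by recent releases; older releases may use different module names. The sample detections and the `names` table are made-up stand-ins so that the formatting helper can be tried on any machine, and `detect_image` only works on a Jetson with the library installed.

```python
from types import SimpleNamespace


def summarize(detections, class_names):
    """Format each detection as 'label confidence (left, top)-(right, bottom)'."""
    lines = []
    for det in detections:
        # Fall back to the numeric ID when the label table has no entry.
        label = class_names.get(det.ClassID, f"class {det.ClassID}")
        lines.append(
            f"{label} {det.Confidence:.0%} "
            f"({det.Left:.0f}, {det.Top:.0f})-({det.Right:.0f}, {det.Bottom:.0f})"
        )
    return lines


def detect_image(path, network="ssd-mobilenet-v2", threshold=0.5):
    """Run detectNet on one image file (needs a Jetson with jetson-inference)."""
    # Imported lazily so the rest of this file runs without the library.
    from jetson_inference import detectNet
    from jetson_utils import loadImage

    net = detectNet(network, threshold=threshold)
    return net.Detect(loadImage(path))


# Stand-in results with made-up numbers, used only to exercise summarize().
sample = [
    SimpleNamespace(ClassID=1, Confidence=0.92,
                    Left=10.0, Top=20.0, Right=110.0, Bottom=220.0),
    SimpleNamespace(ClassID=18, Confidence=0.61,
                    Left=300.0, Top=40.0, Right=420.0, Bottom=180.0),
]
names = {1: "person"}  # hypothetical label table; the real one ships with the model

for line in summarize(sample, names):
    print(line)
```

On the Jetson itself, the list returned by `detect_image("some_image.jpg")` could be passed to `summarize` in place of `sample`.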

2.Detect objects from images

First, let's use the detectnet program to locate objects in static images. In addition to the input and output paths, it accepts some additional command-line options:
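As a rough sketch of how such options combine, the helper below assembles a detectnet.py command line without executing it. The flag names and defaults (--network, --overlay, --alpha, --threshold) are taken from the upstream jetson-inference documentation and are an assumption here, so check them against `./detectnet.py --help` on your own build. The helper name `build_detectnet_command` and the sample file names are hypothetical.

```python
import shlex


def build_detectnet_command(input_uri, output_uri,
                            network="ssd-mobilenet-v2",
                            overlay="box,labels,conf",
                            alpha=120,
                            threshold=0.5,
                            program="./detectnet.py"):
    """Return the argument list for one detectnet run; nothing is executed."""
    return [
        program,
        f"--network={network}",      # detection model to load
        f"--overlay={overlay}",      # what to draw on the output image
        f"--alpha={alpha}",          # overlay blending value
        f"--threshold={threshold}",  # minimum confidence to keep a detection
        input_uri,
        output_uri,
    ]


# Assumed sample paths, relative to the aarch64/bin directory.
cmd = build_detectnet_command("images/peds_0.jpg", "images/test/output.jpg")
print(shlex.join(cmd))
```

The printed line can be pasted into a terminal, or the list can be handed to `subprocess.run(cmd, check=True)` when scripting several images.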

If you are using the Docker container, it is recommended to save the output images to the mounted images/test directory. They can then be viewed easily from the host device under jetson-inference/data/images/test.

Note: If you built the environment yourself, you must download the resnet-18.tar.gz model file to the network folder before running this example. If you are using the image we provide, you can run the program directly.

Please ensure that your terminal is located in the aarch64/bin directory:

Here are some examples of using the default SSD-Mobilenet-v2 model to detect pedestrians in images:

Note: TensorRT will take a few minutes to optimize the network when running each model for the first time. This optimized network file is then cached on disk, so future runs using this model will load faster.

3.Processing video files

You can store videos in the images folder at the path

Run the program:

Result:

You can use the --threshold option to raise or lower the detection threshold and so change the detection sensitivity (the default value is 0.5).
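The effect of the threshold can be illustrated with a small stand-alone sketch. It is not the detectnet implementation: the scores are made up, and treating the comparison as inclusive (>=) is an assumption. It only shows that a lower threshold keeps more boxes (including weaker, possibly false ones) while a higher threshold keeps fewer.

```python
def filter_by_threshold(detections, threshold=0.5):
    """Keep (label, confidence) pairs whose confidence meets the threshold."""
    return [d for d in detections if d[1] >= threshold]


# Made-up confidence scores, for illustration only.
raw = [("person", 0.91), ("person", 0.55), ("dog", 0.42), ("car", 0.33)]

for t in (0.3, 0.5, 0.8):
    kept = filter_by_threshold(raw, t)
    print(f"threshold={t}: kept {len(kept)} of {len(raw)} detections")
```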

4.Run real-time camera recognition demonstration

The detectnet.cpp/detectnet.py samples we used previously also work with real-time camera streams. The supported camera types include:

Here are some typical scenarios for launching the program on camera feeds:

C++

Python

The OpenGL window displays the real-time camera stream overlaid with the bounding boxes of the detected objects. Please note that SSD-based models currently have the highest performance. This is an example of using a model:

If the desired object is not detected in the video feed, or if you get false detections, try using the --threshold parameter to lower or raise the detection threshold (the default value is 0.5).
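The camera loop that the sample runs can be sketched in Python as follows. This is a simplified outline, not the detectnet.py source. It assumes the upstream jetson-inference bindings (detectNet, videoSource, videoOutput and their Capture, Detect, Render and IsStreaming methods) and the URIs csi://0 and display://0; the function names `run_stream` and `main` and the `max_frames` parameter are hypothetical. The loop takes its objects as arguments so that it can be exercised with stand-ins on a machine without a camera.

```python
def run_stream(net, source, sink, max_frames=None):
    """Capture, detect and render until the sink closes or max_frames is hit.

    Returns (frames_processed, total_detections).
    """
    frames = 0
    total = 0
    while sink.IsStreaming():
        img = source.Capture()
        if img is None:  # some versions return None on a capture timeout
            continue
        detections = net.Detect(img)  # upstream draws the overlay onto img
        total += len(detections)
        sink.Render(img)
        frames += 1
        if max_frames is not None and frames >= max_frames:
            break
    return frames, total


def main(camera_uri="csi://0", display_uri="display://0", threshold=0.5):
    """Wire the loop to real devices (needs a Jetson with jetson-inference)."""
    # Imported lazily so run_stream() can be tested without the library.
    from jetson_inference import detectNet
    from jetson_utils import videoSource, videoOutput

    net = detectNet("ssd-mobilenet-v2", threshold=threshold)
    return run_stream(net, videoSource(camera_uri), videoOutput(display_uri))
```

On the Jetson, calling `main()` would open the first CSI camera, while `main("/dev/video0")` would use a V4L2 camera instead; passing a different `threshold` has the same effect as the --threshold parameter described above.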