3. Basic use of PyTorch

The Raspberry Pi board series does not currently support the PyTorch features described here.

3.1. About PyTorch

3.1.1. Introduction

PyTorch is an open-source Python machine learning library, based on Torch, used for applications such as natural language processing.

3.1.2. Features

1) Powerful GPU-accelerated tensor computation

2) Deep neural networks built on an automatic differentiation system

3) Dynamic graph mechanism
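As an illustration of the automatic differentiation and dynamic graph features, here is a minimal sketch. The tensor values and the branching condition are made up for the example; only standard torch calls are used.

```python
import torch

# A leaf tensor that autograd should track.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Dynamic graph: ordinary Python control flow decides which operations run,
# and the graph is recorded while the code executes.
if x.sum() > 5:
    y = (x ** 2).sum()  # this branch is taken, since 1 + 2 + 3 = 6
else:
    y = (x ** 3).sum()

# Automatic differentiation: for the branch taken, dy/dx = 2 * x.
y.backward()
print(y.item())  # 14.0
print(x.grad)    # tensor([2., 4., 6.])
```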

3.2. Tensors in PyTorch

3.2.1. Tensors

Tensors are the basic computing units in PyTorch. Like NumPy's ndarray, a tensor represents a multi-dimensional array. The biggest difference from ndarray is that a PyTorch tensor can run on a GPU, while a NumPy ndarray runs only on the CPU. Running on a GPU can greatly speed up computation.
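The relationship between ndarray and tensor can be sketched as follows. The array values are arbitrary, and the code falls back to the CPU when no CUDA device is available, so it should run on either kind of machine.

```python
import numpy as np
import torch

arr = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)

# from_numpy wraps the ndarray as a CPU tensor that shares its memory.
t = torch.from_numpy(arr)

# Move the tensor to the GPU if one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
t_dev = t.to(device)

# Compute on the chosen device, then bring the result back as an ndarray.
result = (t_dev @ t_dev).cpu().numpy()
print(device)
print(result)  # [[ 7. 10.] [15. 22.]]
```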

3.2.2. Create tensors

1) There are many ways to create tensors; different API calls create different types of tensors:

a = torch.empty(2, 2): create an uninitialized 2*2 tensor

b = torch.rand(5, 6): create a 5*6 tensor with each element drawn uniformly from [0, 1)

c = torch.zeros(5, 5, dtype=torch.long): create an all-zero 5*5 tensor and specify the type of each element as long

d = c.new_ones(5, 3, dtype=torch.double): create a new all-ones 5*3 tensor d based on the existing tensor c (the dtype is overridden to double here)

d.size(): get the shape of tensor d
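Put together as a runnable snippet, the calls listed above look like this; the print statements are added only to inspect the results.

```python
import torch

a = torch.empty(2, 2)                     # uninitialized, values are arbitrary
b = torch.rand(5, 6)                      # uniform samples in [0, 1)
c = torch.zeros(5, 5, dtype=torch.long)   # all zeros, 64-bit integers
d = c.new_ones(5, 3, dtype=torch.double)  # all ones, dtype overridden to double

print(a.shape)   # torch.Size([2, 2])
print(c.dtype)   # torch.int64
print(d.dtype)   # torch.float64
print(d.size())  # torch.Size([5, 3])
```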

2) Operations between tensors

Operations between tensors are essentially operations between matrices. Thanks to the dynamic graph mechanism, mathematical calculations can be performed directly on tensors.

For this part of the code, please refer to: /home/yahboom/YBAMR-COBOT-EDU-00001/src/yahboom_navrobo_other/Pytorch/torch_tensor.py
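The referenced file is not reproduced here. As a stand-alone sketch of the kind of operations meant, and not the contents of torch_tensor.py, the following uses two small made-up tensors:

```python
import torch

x = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
y = torch.ones(2, 2)

total = x + y        # element-wise addition, same as torch.add(x, y)
elementwise = x * y  # element-wise multiplication
matmul = x @ y       # matrix product, same as torch.matmul(x, y)
flat = x.view(4)     # reshape to a 1-D tensor without copying

y.add_(x)            # methods ending in "_" modify the tensor in place

print(total)   # tensor([[2., 3.], [4., 5.]])
print(matmul)  # tensor([[3., 3.], [7., 7.]])
print(flat)    # tensor([1., 2., 3., 4.])
```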

Run torch_tensor.py. The output is shown in the figure:

[Figure: output of torch_tensor.py]

3.3. Introduction to the torchvision package

3.3.1. torchvision is a PyTorch library dedicated to image processing. It contains four main modules:

1) torchvision.datasets: loads datasets. Many datasets, such as CIFAR and MNIST, are available through this module.

2) torchvision.models: loads trained models, such as VGG and ResNet.

3) torchvision.transforms: classes for image transformation operations.

4) torchvision.utils: utilities such as arranging images into a grid.

For more information about the use of the torchvision package, please refer to the official website documentation: https://pytorch.org/vision/0.8/datasets.html

3.4. Convolutional Neural Network

3.4.1. Neural Network

1) Difference between neural networks and traditional machine learning

Both neural networks and traditional machine learning are used for classification tasks. Compared with traditional machine learning, a neural network is often more efficient, its data handling is simpler, and it requires fewer hand-set parameters to perform a task. The following points are explained:

3.4.2. Convolutional Neural Network

1) Convolution kernel

A convolution kernel can be understood as a feature extractor or, in digital signal processing terms, a filter. A neural network has three kinds of layers (input layer, hidden layers, output layer). Neurons in a layer can share a convolution kernel, which makes it convenient to process high-dimensional data. We only need to choose the size, number, and stride of the convolution kernels; the kernel values themselves are learned during training.
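In PyTorch, the size, number, and stride of the kernels map onto the arguments of nn.Conv2d. The sketch below uses illustrative values (6 kernels of size 5*5 on a 3-channel 32*32 input) that are assumptions, not values taken from this tutorial's code.

```python
import torch
import torch.nn as nn

# 6 kernels (out_channels), each 5*5 (kernel_size), sliding 1 pixel at a time.
conv = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5, stride=1)

x = torch.randn(1, 3, 32, 32)  # one random 3-channel 32*32 "image"
y = conv(x)

print(conv.weight.shape)  # torch.Size([6, 3, 5, 5]), the learnable kernels
print(y.shape)            # torch.Size([1, 6, 28, 28]), since 32 - 5 + 1 = 28

# A larger stride moves the kernel further per step and shrinks the output.
conv_s2 = nn.Conv2d(3, 6, kernel_size=5, stride=2)
y_s2 = conv_s2(x)
print(y_s2.shape)         # torch.Size([1, 6, 14, 14])
```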

2) Three basic layers of a convolutional neural network:

Convolution layer: performs the convolution operation, that is, the inner product of the convolution kernel and an image patch of the same size: numbers in the same position are multiplied and the products are summed. Convolution layers close to the input layer use a small number of convolution kernels, and later layers use more.

Pooling layer: compresses the image and the number of parameters by downsampling while keeping the most important features. There are two pooling methods: MaxPooling (taking the largest value in the sliding window) and AveragePooling (taking the average of all values in the sliding window).

Fully connected layer: this layer is mainly a stacking layer. After the pooling layer compresses the image, it enters the Flatten layer; the output of the Flatten layer is fed to the Fully Connected layer and classified using softmax.
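The three operations can be checked by hand on a tiny input. The 4*4 image, the all-ones 2*2 kernel, and the 3-class output below are made-up values chosen so the arithmetic is easy to follow.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# A 4*4 single-channel image with values 0..15, shaped (batch, channel, H, W).
image = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)

# Convolution: multiply same-position numbers and sum them.
kernel = torch.ones(1, 1, 2, 2)
conv_out = F.conv2d(image, kernel)                   # shape (1, 1, 3, 3)
manual = (image[0, 0, :2, :2] * kernel[0, 0]).sum()  # 0 + 1 + 4 + 5 = 10
print(conv_out[0, 0, 0, 0].item(), manual.item())    # 10.0 10.0

# Pooling: downsample each 2*2 window to one value.
max_out = F.max_pool2d(image, kernel_size=2)  # [[5, 7], [13, 15]]
avg_out = F.avg_pool2d(image, kernel_size=2)  # [[2.5, 4.5], [10.5, 12.5]]
print(max_out[0, 0])
print(avg_out[0, 0])

# Flatten, fully connected layer, softmax: turn features into class probabilities.
flat = torch.flatten(max_out, start_dim=1)  # shape (1, 4)
fc = nn.Linear(4, 3)                        # 3 hypothetical classes
probs = F.softmax(fc(flat), dim=1)
print(probs.sum().item())                   # the probabilities sum to 1
```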

3.5. Build a LeNet neural network and train it on a data set

3.5.1. Preparation before implementation

1) Environment

All boards in the ROSMASTER-jetson series come with the development environment for this project installed, including:

2) Dataset

CIFAR-10: 50,000 training images and 10,000 test images, each 32*32 in size

Note: The data set is saved in the ~/Pytorch_demo/data/cifar-10-batches-py directory.

[Figure: the cifar-10-batches-py data set directory]

3.5.2. Implementation process

1) Import related modules

2) Load the data set

3) Encapsulate the data set

4) Build a convolutional neural network
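A LeNet-style network for 32*32 color images might be defined as below. The layer sizes follow the widely used LeNet layout for CIFAR-10 (two convolution layers with pooling, then three fully connected layers) and are an assumption; the original program's exact layers are not shown here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LeNet(nn.Module):
    """LeNet-style CNN for 3-channel 32*32 inputs (assumed layer sizes)."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, kernel_size=5)
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, num_classes)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # 32 -> 28 -> 14
        x = self.pool(F.relu(self.conv2(x)))  # 14 -> 10 -> 5
        x = torch.flatten(x, start_dim=1)     # (N, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)                    # raw class scores (logits)


net = LeNet()
out = net(torch.randn(4, 3, 32, 32))  # a batch of 4 random images
print(out.shape)                      # torch.Size([4, 10])
```

The network returns raw scores rather than probabilities, because nn.CrossEntropyLoss applies the softmax internally during training.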

5) Configure the loss function and optimizer for training
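A common configuration for a 10-class image classifier is cross-entropy loss with SGD and momentum; that choice, the learning rate, and the momentum value are assumptions here, not confirmed settings of the original program. So that the snippet runs on its own, a single linear layer stands in for the network from step 4.

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Placeholder model; in the real program this is the network from step 4.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

criterion = nn.CrossEntropyLoss()  # expects raw scores and integer labels
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# One optimization step on a random batch, to show how the pieces fit together.
inputs = torch.randn(4, 3, 32, 32)
labels = torch.randint(0, 10, (4,))

optimizer.zero_grad()                    # clear gradients from the last step
loss = criterion(model(inputs), labels)  # forward pass and loss
loss.backward()                          # backpropagation
optimizer.step()                         # parameter update
print(loss.item())
```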

6) Start training and testing
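The training and testing loops can be sketched as two helper functions. The names train_one_epoch and evaluate are hypothetical, and the model and data at the bottom are small random stand-ins so the sketch runs without the data set; in the real program they would be the LeNet network and the CIFAR-10 loaders.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset


def train_one_epoch(model, loader, criterion, optimizer, device):
    """Run one pass over the training data and return the mean batch loss."""
    model.train()
    running_loss = 0.0
    for inputs, labels in loader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(inputs), labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    return running_loss / len(loader)


def evaluate(model, loader, device):
    """Return the fraction of correctly classified samples."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for inputs, labels in loader:
            inputs, labels = inputs.to(device), labels.to(device)
            predicted = model(inputs).argmax(dim=1)
            correct += (predicted == labels).sum().item()
            total += labels.size(0)
    return correct / total


# Stand-ins: random data and a one-layer model instead of CIFAR-10 and LeNet.
torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
data = TensorDataset(torch.randn(16, 3, 32, 32), torch.randint(0, 10, (16,)))
loader = DataLoader(data, batch_size=4)
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

epochs = 2  # number of passes over the training data
for epoch in range(epochs):
    loss = train_one_epoch(model, loader, criterion, optimizer, device)
    acc = evaluate(model, loader, device)
    print(f"epoch {epoch + 1}: loss={loss:.3f}, accuracy={acc:.2%}")
```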

3.5.3. Run the program

1) Reference code path

2) Run the program

[Figure: program output after training]

We trained for only 2 epochs here. You can change the epoch value to adjust the number of training passes. More epochs generally improve accuracy, although the gains level off as training continues.