2. TensorFlow

2.1. What is TensorFlow

2.1.1. Definition

TensorFlow™ is a symbolic mathematics system based on dataflow programming, widely used to implement various machine learning algorithms. Its predecessor is DistBelief, Google's neural network algorithm library.

TensorFlow has a multi-layer architecture and can be deployed on servers, PCs, and web pages. It supports high-performance numerical computing on GPUs and TPUs, and is widely used in Google's internal product development and in scientific research across many fields.

2.1.2. Core Components

The core components (core runtime) of distributed TensorFlow include: the distributed master, the dataflow executor/worker service, the kernel implementations, and the underlying device layer and networking layer.

image-20220223174953830

1) Distributed master

The distributed master cuts subgraphs out of the input dataflow graph, partitions them into operation fragments, and launches the executors. While processing the dataflow graph, the distributed master applies preset optimizations, including common-subexpression elimination and constant folding.

2) The executor is responsible for running graph operations in processes and on devices, and for sending results to and receiving results from other executors. Distributed TensorFlow has a parameter server that aggregates and updates the model parameters returned by the executors. When scheduling local devices, the executor can choose to perform parallel computation and GPU acceleration.

3) The kernel implementations are responsible for individual graph operations, including mathematical calculations, array manipulation, control flow, and state management. They use Eigen for parallel tensor computation, the cuDNN library for GPU acceleration, and gemmlowp for low-precision numerical calculation. In addition, users can register extra kernels (fused kernels) to improve the efficiency of basic operations such as activation functions and their gradient calculations.

2.2. TensorFlow 2

2.2.1. Introduction

TensorFlow is an open-source deep learning tool released by Google in November 2015. We can use it to quickly build deep neural networks and train deep learning models. The main purpose of using TensorFlow and other open-source frameworks is to provide a modular toolbox for building deep learning networks, so that the code stays simple during development and the final model is concise and easy to understand.

2.2.2. Upgrade Direction

1) Easily build models with Keras and Eager Execution.

2) Achieve robust production-environment model deployment on any platform.

3) Provide powerful experimental tools for research.

4) Simplify the API by cleaning up deprecated APIs and reducing duplication.

2.3. TensorFlow Basic Concepts and Syntax

2.3.1. Tensor

1) A Tensor is the core data unit of TensorFlow; essentially, it is an array of arbitrary dimension. We call a 1-dimensional array a vector and a 2-dimensional array a matrix; in general, a tensor can be regarded as an N-dimensional array.

2) In TensorFlow, every Tensor has two basic attributes: data type (default: float32) and shape. The common data types are shown in the following table:

| Tensor type | Description |
| --- | --- |
| tf.float32 | 32-bit floating-point number |
| tf.float64 | 64-bit floating-point number |
| tf.int64 | 64-bit signed integer |
| tf.int32 | 32-bit signed integer |
| tf.int16 | 16-bit signed integer |
| tf.int8 | 8-bit signed integer |
| tf.uint8 | 8-bit unsigned integer |
| tf.string | Variable-length byte array |
| tf.bool | Boolean |
| tf.complex64 | Complex number with real and imaginary parts |

3) According to how they are used, there are mainly 2 types of tensors in TensorFlow, namely constant tensors (tf.constant) and variable tensors (tf.Variable).

4) Define a variable Tensor

Create a new python file, name it Tensor_Variable, and then give it execution permissions,

Paste the following code into it,
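The original code listing is not reproduced here; the following is a minimal sketch of what such a variable-Tensor script might contain (the values and variable names are illustrative):

```python
#!/usr/bin/env python3
# Minimal sketch: defining a variable Tensor with tf.Variable
import tensorflow as tf

# A variable Tensor holds state that can be updated, e.g. model weights
v = tf.Variable([[1.0, 2.0], [3.0, 4.0]], name="v")
print(v.shape)   # (2, 2)
print(v.dtype)   # float32 by default for floating-point inputs
print(v.numpy())

# Variables can be updated in place with assign()
v.assign(v + 1.0)
print(v.numpy())
```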

Run the test,

Note: on the ROSMASTER, Python 3 must be used in order to run TensorFlow 2.0 or above.

Output,

5) Define a constant Tensor

Create a new python file, name it Tensor_constant, and then give it execution permissions,

Paste the following code into it,
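The original listing is not reproduced here; a minimal sketch of a constant-Tensor script (the values are illustrative):

```python
#!/usr/bin/env python3
# Minimal sketch: defining a constant Tensor with tf.constant
import tensorflow as tf

c = tf.constant([[1, 2], [3, 4]])
print(c)          # prints the tensor together with its shape and dtype
print(c.shape)    # (2, 2)
print(c.dtype)    # int32 by default for integer inputs
print(c.numpy())  # the underlying NumPy array
```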

Run the test,

Output,

If you look closely, you will find that the printed tensor shows three attributes: its shape, its data type (dtype), and its underlying NumPy array.

6) Commonly used methods for creating special constant tensors:

Example: c = tf.zeros([2, 2]) # 2x2 constant Tensor with all 0s

Output,

Example: v = tf.ones_like(c) # Create a constant Tensor with all 1s in the same shape as tensor c. Note that the shape here refers to the attribute shape of the tensor

Output,

Example: a = tf.fill([2, 3], 6) # 2x3 constant Tensor with all 6s

Output,

Example: c = tf.linspace(1.0, 10.0, 5, name="linspace")

Output,

Example: c = tf.range(start=2, limit=8, delta=2)

Output,
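The special-constant examples above can be combined into one runnable script:

```python
#!/usr/bin/env python3
# Consolidated sketch of the special constant Tensors shown above
import tensorflow as tf

c = tf.zeros([2, 2])                             # 2x2 Tensor of all 0s
v = tf.ones_like(c)                              # all 1s, same shape as c
a = tf.fill([2, 3], 6)                           # 2x3 Tensor of all 6s
l = tf.linspace(1.0, 10.0, 5, name="linspace")   # 5 evenly spaced values in [1.0, 10.0]
r = tf.range(start=2, limit=8, delta=2)          # [2, 4, 6]

for t in (c, v, a, l, r):
    print(t)
```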

For this part of the code, please refer to:

2.3.2. Use the Eager Execution (dynamic graph) mechanism to operate on tensors

1) Changes in the tensor operation mechanism between TensorFlow2 and TensorFlow1.x

The dynamic graph (Eager Execution) mechanism is the biggest difference between TensorFlow 2.x and TensorFlow 1.x. It is similar to PyTorch and simplifies both the code and the execution process.

2) Take tensor addition as an example,

Create a new python file, name it tensor_plus, and then give it execution permissions,

Paste the following code into it,
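The original listing is not included here; a minimal sketch of such a tensor-addition script:

```python
#!/usr/bin/env python3
# Minimal sketch: eager (dynamic-graph) tensor addition in TensorFlow 2
import tensorflow as tf

a = tf.constant([[1, 2], [3, 4]])
b = tf.constant([[5, 6], [7, 8]])

# With Eager Execution the result is computed immediately -- no session required
c = tf.add(a, b)   # equivalent to a + b
print(c.numpy())   # [[ 6  8] [10 12]]
```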

Run the test,

image-20220224102821953

Notice that this Python script executes line by line, like ordinary Python code. In TensorFlow 1.x, you would additionally need to create a session and run the addition operation inside it.

3) Common APIs of TensorFlow:

For usage of these common APIs, please refer to the official documentation:

Module: tf | TensorFlow Core v2.8.0 (google.cn)

2.3.3. Neural Network

1) A neural network is a mathematical model inspired by biological nervous systems. It is composed of a large number of interconnected neurons that perform computations, and it adjusts its internal structure based on external information. It is often used to model complex relationships between inputs and outputs. A basic neural network has an input layer, a hidden layer, and an output layer, as shown in the following figure.

image-20220427131059873

2) Input layer: receives the input information;

3) Hidden layer: processes the input information;

4) Output layer: outputs the network's interpretation of the input information.

2.3.4. Building and Training a Neural Network (Keras)

1) Import data

For more information about data, please refer to: tf.data.Dataset | TensorFlow Core v2.8.0 (google.cn)
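As a hedged illustration (using synthetic in-memory arrays rather than a real dataset), data can be wrapped in a `tf.data.Dataset` like this:

```python
#!/usr/bin/env python3
# Sketch: building a tf.data.Dataset from in-memory arrays (synthetic data)
import numpy as np
import tensorflow as tf

x = np.random.rand(100, 28, 28).astype("float32")  # 100 fake 28x28 "images"
y = np.random.randint(0, 10, size=(100,))          # 100 fake labels

# Shuffle and batch the samples; each element yielded is one batch
dataset = tf.data.Dataset.from_tensor_slices((x, y)).shuffle(100).batch(32)
for images, labels in dataset.take(1):
    print(images.shape, labels.shape)  # (32, 28, 28) (32,)
```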

2) Define the structure of the neural network

Example:

The input parameters describe the network structure from the input layer to the output layer. The following three layer types are the most common:

- tf.keras.layers.Flatten: flattens the input into a one-dimensional vector. For details, refer to the official document: tf.keras.layers.Flatten | TensorFlow Core v2.8.0 (google.cn)

- tf.keras.layers.Dense: a fully connected layer. For details, refer to the official document: tf.keras.layers.Dense | TensorFlow Core v2.8.0 (google.cn)

- tf.keras.layers.Conv2D: a 2D convolution layer. For details, refer to the official document: tf.keras.layers.Conv2D | TensorFlow Core v2.8.0 (google.cn)
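A minimal sketch combining these three layer types into a `tf.keras.Sequential` model (the layer sizes and input shape are illustrative, not from the original example):

```python
#!/usr/bin/env python3
# Sketch: a Sequential network using Conv2D, Flatten and Dense layers
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),                      # e.g. 28x28 grayscale images
    tf.keras.layers.Conv2D(16, (3, 3), activation="relu"),  # convolutional feature extractor
    tf.keras.layers.Flatten(),                              # flatten feature maps into a vector
    tf.keras.layers.Dense(64, activation="relu"),           # fully connected hidden layer
    tf.keras.layers.Dense(10, activation="softmax"),        # output layer, 10 classes
])
print(model.output_shape)  # (None, 10)
```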

3) Configure the training method for training neural networks

Example:

The input parameters are composed of the following three parts: the optimizer, the loss function, and the metrics.

- optimizer: mainly sets the learning rate (lr), the learning-rate decay (decay), and the momentum parameters.

- loss: the loss function used to measure the training error.

- metrics: multiple accuracy metrics can be specified.

For the specific values of the three parameters, please refer to: tf.keras.Model | TensorFlow Core v2.8.0 (google.cn)
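A hedged sketch of configuring the training method with `model.compile` (whether a `decay` argument is accepted depends on the TensorFlow version, so only `learning_rate` and `momentum` are shown; the model itself is a placeholder):

```python
#!/usr/bin/env python3
# Sketch: configuring training with model.compile (optimizer, loss, metrics)
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])

model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],   # more than one metric may be listed here
)
print(type(model.optimizer).__name__)  # SGD
```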

4) Execute the training process

Example:

For specific parameter settings, please refer to: tf.keras.Model | TensorFlow Core v2.8.0 (google.cn)
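A minimal sketch of `model.fit` on synthetic data (the array shapes and hyperparameters are illustrative):

```python
#!/usr/bin/env python3
# Sketch: executing training with model.fit on synthetic data
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="sgd",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

x = np.random.rand(60, 4).astype("float32")   # 60 fake samples with 4 features
y = np.random.randint(0, 3, size=(60,))       # 60 fake labels in {0, 1, 2}

history = model.fit(x, y, batch_size=20, epochs=2,
                    validation_split=0.2, verbose=0)
print(sorted(history.history.keys()))  # includes 'loss' and 'val_loss'
```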

5) Print the network structure and parameter statistics

For specific parameter settings, please refer to: tf.keras.Model | TensorFlow Core v2.8.0 (google.cn)
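A minimal sketch of printing the network structure with `model.summary()` (the layer sizes are illustrative):

```python
#!/usr/bin/env python3
# Sketch: printing the network structure and parameter statistics
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

model.summary()              # one row per layer: output shape and parameter count
print(model.count_params())  # 101770 = (784*128 + 128) + (128*10 + 10)
```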

2.3.5. Training a Neural Network Example: The Classic Cat and Dog Image Classification

1) Code Path Reference

2) Run the program

3) Screenshot of program running

When the cat and dog photos appear, press the q key in the image display window to continue executing the program.

image-20220228121929519

image-20220228122605151

The model is trained for 10 epochs with a batch size of 10. The curves on the left show that as the number of training steps increases, the accuracy (acc) trends upward while the loss trends downward.