10. MediaPipe gesture control of robotic arm action groups

10.1. Introduction

MediaPipe is an open-source machine learning application development framework from Google for processing data streams. It is a graph-based data processing pipeline that can be built from many kinds of data sources, such as video, audio, sensor data, and any other time-series data. MediaPipe is cross-platform: it runs on embedded platforms (Raspberry Pi, etc.), mobile devices (iOS and Android), workstations, and servers, and it supports mobile GPU acceleration. MediaPipe provides cross-platform, customizable ML solutions for live and streaming media.

10.2. Usage

Note: the [R2] button on the remote control handle acts as [pause/start] for this feature.

The example in this section may run very slowly on the robot's main controller. It is recommended to connect the camera to the virtual machine and run the [02_PoseCtrlArm.launch] file there; on the NX main controller the effect will be better. You can try it.
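For reference, a launch file like this is normally started with roslaunch. The package name below is a placeholder, since the source does not name the ROS package that contains it:

```bash
# Placeholder package name -- substitute the package that actually
# ships 02_PoseCtrlArm.launch in your workspace
roslaunch <your_package> 02_PoseCtrlArm.launch
```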

Note: on the PI5, you need to open another terminal and enter the same Docker container.
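A second terminal can attach to an already-running container with the standard Docker commands (the container ID below is a placeholder):

```bash
# Find the ID or name of the running container
docker ps
# Open another shell inside that same container
docker exec -it <container_id> /bin/bash
```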

(screenshot: entering the Docker container from a second terminal)

After the program is running, press the R2 button on the handle to start control. The camera will capture the image, and six gestures are recognized, as follows.

After each gesture action is completed, the arm returns to its initial position and beeps, waiting for the next gesture to be recognized.

MediaPipe Hands infers the 3D coordinates of 21 hand landmarks (knuckle and joint positions) from a single frame.

(diagram: hand_landmarks, the 21 hand landmark positions)
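As a minimal sketch of how those landmarks can be read with the MediaPipe Python API (a generic example, not the contents of FingerCtrl.py):

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

cap = cv2.VideoCapture(0)
with mp_hands.Hands(max_num_hands=1,
                    min_detection_confidence=0.7,
                    min_tracking_confidence=0.5) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB input; OpenCV captures BGR frames
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            hand = results.multi_hand_landmarks[0]
            # 21 landmarks, each with normalized x, y and a relative depth z
            for idx, lm in enumerate(hand.landmark):
                print(idx, lm.x, lm.y, lm.z)
        cv2.imshow("hands", frame)
        if cv2.waitKey(1) & 0xFF == 27:  # press Esc to quit
            break
cap.release()
cv2.destroyAllWindows()
```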


10.4. Core files

10.4.1. mediaArm.launch

10.4.2. FingerCtrl.py

The implementation here is also very simple. The main function opens the camera to obtain image data and passes each frame to the process function, which performs "detect palm" -> "get finger coordinates" -> "get gesture" in order, and then decides which action to perform based on the gesture result.
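Below is a minimal sketch of that flow. The helper names (count_extended_fingers, run_action) and the finger-counting heuristic are illustrative assumptions, not the actual code of FingerCtrl.py:

```python
# Fingertip and middle-joint landmark indices in the MediaPipe Hands model
FINGER_TIPS = [8, 12, 16, 20]   # index, middle, ring, pinky tips
FINGER_PIPS = [6, 10, 14, 18]   # the joint below each of those tips

def count_extended_fingers(hand):
    """Rough gesture estimate: a finger counts as extended when its tip
    is higher in the image (smaller y) than the joint below it."""
    return sum(
        1 for tip, pip in zip(FINGER_TIPS, FINGER_PIPS)
        if hand.landmark[tip].y < hand.landmark[pip].y
    )

def process(rgb_frame, hands, run_action):
    """'detect palm' -> 'get finger coordinates' -> 'get gesture' -> act."""
    results = hands.process(rgb_frame)        # detect palm / hand
    if not results.multi_hand_landmarks:
        return                                # no hand in this frame
    hand = results.multi_hand_landmarks[0]    # 21 finger/joint coordinates
    gesture = count_extended_fingers(hand)    # classify the gesture
    run_action(gesture)                       # hypothetical action dispatcher
```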

10.5. Flowchart

(flowchart of the gesture control program)