6. Fingertip trajectory recognition

6.1. Introduction

MediaPipe is an open-source framework developed by Google for building machine learning applications that process streaming data. It is a graph-based data processing pipeline that can consume data in various forms, such as video, audio, sensor data, and any other time series data.

MediaPipe is cross-platform: it runs on embedded platforms (Raspberry Pi, etc.), mobile devices (iOS and Android), workstations, and servers, and it supports mobile GPU acceleration. It provides customizable ML solutions for real-time and streaming media.

The core framework of MediaPipe is implemented in C++ and provides support for languages such as Java and Objective-C. The main concepts of MediaPipe include packets, streams, calculators, graphs, and subgraphs.
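To make these concepts more concrete, the following toy model shows how they relate to each other. It is only an illustration in plain Python and does not use MediaPipe's actual API; the class names `Packet`, `Calculator`, and `Graph` are hypothetical stand-ins. A packet is a piece of data with a timestamp, a stream is a time-ordered sequence of packets, a calculator is a processing node, and a graph connects calculators together.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Packet:
    """A unit of data tagged with the timestamp it belongs to."""
    timestamp: int
    payload: Any


class Calculator:
    """A processing node: turns each input packet into one output packet."""

    def __init__(self, fn: Callable[[Any], Any]):
        self.fn = fn

    def process(self, packet: Packet) -> Packet:
        # The timestamp is carried through unchanged.
        return Packet(packet.timestamp, self.fn(packet.payload))


class Graph:
    """A linear chain of calculators (real graphs can also branch and merge)."""

    def __init__(self, calculators: List[Calculator]):
        self.calculators = calculators

    def run(self, stream: List[Packet]) -> List[Packet]:
        out = []
        for packet in stream:
            # Pass the packet through every calculator in order.
            for calc in self.calculators:
                packet = calc.process(packet)
            out.append(packet)
        return out


# A "stream" here is simply a list of packets with increasing timestamps.
graph = Graph([Calculator(lambda x: x * 2), Calculator(lambda x: x + 1)])
result = graph.run([Packet(0, 1), Packet(1, 2), Packet(2, 3)])
print([(p.timestamp, p.payload) for p in result])  # [(0, 3), (1, 5), (2, 7)]
```

A subgraph, in this picture, would be a `Graph` that is itself used as a single node inside a larger graph.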

Features of MediaPipe:

6.2. Startup

6.2.1. Program Description

After the program starts, the camera begins capturing images. Hold your palm flat in the camera view with the fingers spread and the palm facing the camera, similar to the gesture for the number 5. The program then draws the joints of the whole hand on the image. Adjust the position of your palm and try to position it at

image-20240709151729612

Next, keep the index finger extended and curl the other fingers, similar to the gesture for the number 1.

image-20240709153155463
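One way to tell the "5" and "1" gestures apart is to check which fingers are extended. The sketch below is an assumption about how this could be done, not a description of what FingerTrajectory.py actually does. It uses the landmark numbering of MediaPipe's 21-point hand model (fingertips at 8, 12, 16, 20 and the corresponding PIP joints at 6, 10, 14, 18), treats a finger as extended when its tip is above its PIP joint in the image, and ignores the thumb. The function names and the labels `"one"`, `"five"`, and `"other"` are hypothetical.

```python
# Landmark indices follow MediaPipe's 21-point hand model.
FINGER_TIPS = {"index": 8, "middle": 12, "ring": 16, "pinky": 20}
FINGER_PIPS = {"index": 6, "middle": 10, "ring": 14, "pinky": 18}


def extended_fingers(landmarks):
    """Return the set of non-thumb fingers whose tip lies above its PIP joint.

    `landmarks` is a sequence of 21 (x, y) pairs in image coordinates,
    where y grows downward, so "above" means a smaller y value.
    Assumes an upright hand with the palm facing the camera.
    """
    return {
        name
        for name, tip in FINGER_TIPS.items()
        if landmarks[tip][1] < landmarks[FINGER_PIPS[name]][1]
    }


def classify_gesture(landmarks):
    """Map a landmark list to "one", "five", or "other" (hypothetical labels)."""
    up = extended_fingers(landmarks)
    if up == {"index"}:
        return "one"
    if up == {"index", "middle", "ring", "pinky"}:
        return "five"
    return "other"
```

This simple rule breaks down when the hand is rotated or pointing downward, which is one reason to keep the palm upright and facing the camera as described above.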

While holding gesture 1, move your finger. A red line appears on the screen, tracing the path of the index fingertip.

image-20240709153231185
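The recording behaviour described here can be sketched as a small state holder that is fed once per camera frame. This is a minimal illustration under stated assumptions, not the actual source code: it assumes each frame has already been reduced to a gesture label (`"one"` for the number 1 gesture, `"five"` for the open hand; both names are hypothetical) and to the pixel position of the index fingertip. The class name `TrajectoryRecorder` is also hypothetical.

```python
class TrajectoryRecorder:
    """Collect index-fingertip positions while the "one" gesture is held."""

    def __init__(self):
        self.path = []        # points of the stroke currently being drawn
        self.finished = None  # most recently completed stroke, or None

    def update(self, gesture, fingertip):
        """Feed one frame; return True if this frame completed a stroke."""
        if gesture == "one":
            self.path.append(fingertip)
        elif gesture == "five" and self.path:
            # Opening the hand ends the stroke and clears the canvas.
            self.finished = self.path
            self.path = []
            return True
        return False

    def segments(self):
        """Return consecutive point pairs, ready to be drawn as line segments."""
        return list(zip(self.path, self.path[1:]))
```

In a display loop, each pair returned by `segments()` could be drawn as a red line on the current frame, for example with OpenCV's `cv2.line`.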

When you have finished drawing the figure, open all fingers again, similar to the gesture for the number 5. The drawn figure is then generated below the camera image.

image-20240709153342249

Note: The drawn figure must be closed (the end of the line must meet its start); otherwise part of the figure may be missing from the result.
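To illustrate what "closed" means for a drawn path, the sketch below checks whether the last point of a stroke returns close to its first point. It is an assumption for illustration only: the function names and the 20-pixel tolerance are hypothetical, and the actual program may decide this differently.

```python
import math


def is_closed(path, tolerance=20.0):
    """True if the stroke ends within `tolerance` pixels of where it started."""
    if len(path) < 3:
        return False  # fewer than three points cannot enclose an area
    (x0, y0), (x1, y1) = path[0], path[-1]
    return math.hypot(x1 - x0, y1 - y0) <= tolerance


def close_path(path, tolerance=20.0):
    """Return the path with its start point appended, or None if it is open."""
    if not is_closed(path, tolerance):
        return None
    return list(path) + [path[0]]
```

For example, a stroke that runs around a square and stops a few pixels from its starting corner is accepted and snapped shut, while a straight line is rejected.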

6.2.2. Program Startup

6.2.3. Source Code

Code path: ~/jetcobot_ws/src/jetcobot_mediapipe/jetcobot_mediapipe/FingerTrajectory.py