MediaPipe development

1. Introduction
2. Use
2.1. Hand detection
2.2. Posture detection
2.3. Overall detection
2.4. Face Detection
2.5. Face Recognition
2.6. Face effects
2.7. 3D object recognition
2.8. Brush
2.9. Finger control
2.10. Gesture Recognition

1. Introduction

MediaPipe is an open-source framework from Google for building machine learning applications that process streaming data. It organizes processing as a graph-based pipeline and works with a variety of data sources, such as video, audio, sensor data, and other time series data. MediaPipe is cross-platform: it runs on embedded platforms (Raspberry Pi, etc.), mobile devices (iOS and Android), workstations, and servers, and it supports mobile GPU acceleration. It provides customizable ML solutions for real-time and streaming media.

The core framework of MediaPipe is implemented in C++, with support for other languages such as Java and Objective-C. Its main concepts are packets, streams, calculators, graphs, and subgraphs.
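To make these concepts concrete, here is a toy model in plain Python. It does not use MediaPipe itself, and the names `Packet`, `Calculator`, and `Graph` are illustrative stand-ins rather than the framework's API. It only shows the idea: timestamped packets travel along a stream and are transformed by a chain of calculators that together form a graph.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Packet:
    """A timestamped piece of data, the unit that flows through a stream."""
    timestamp: int
    payload: Any


class Calculator:
    """A graph node that transforms each packet it receives."""

    def __init__(self, name: str, fn: Callable[[Any], Any]):
        self.name = name
        self.fn = fn

    def process(self, packet: Packet) -> Packet:
        # The timestamp is preserved so later nodes can align packets.
        return Packet(packet.timestamp, self.fn(packet.payload))


class Graph:
    """A linear chain of calculators (real graphs may branch and merge)."""

    def __init__(self, calculators: List[Calculator]):
        self.calculators = calculators

    def run(self, stream: List[Packet]) -> List[Packet]:
        output = []
        for packet in stream:
            for calculator in self.calculators:
                packet = calculator.process(packet)
            output.append(packet)
        return output


graph = Graph([
    Calculator("scale", lambda x: x * 2),
    Calculator("offset", lambda x: x + 1),
])
results = graph.run([Packet(t, t) for t in range(3)])
print([(p.timestamp, p.payload) for p in results])  # [(0, 1), (1, 3), (2, 5)]
```

A subgraph would correspond to wrapping a whole `Graph` so that it can be used as a single node inside a larger one.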

Features of MediaPipe:

Deep learning solutions in MediaPipe:

The solutions below are offered for some or all of Android, iOS, C++, Python, JS, and Coral:

Face Detection
Face Mesh
Iris
Hands
Pose
Holistic
Selfie Segmentation
Hair Segmentation
Object Detection
Box Tracking
Instant Motion Tracking
Objectron
KNIFT
AutoFlip
MediaSequence
YouTube 8M

2. Use

The examples must be run inside the Docker container, because the ROS1 environment is provided by the Docker image.


2.1. Hand detection

Source code location: ~/yahboomcar_ws/src/yahboomcar_mediapipe/scripts
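The script in this directory is not reproduced here. As a rough idea of what a hand detection loop can look like, the following is a minimal sketch that assumes the legacy `mp.solutions.hands` Python API and an OpenCV-readable camera at index 0. The helper `to_pixels` and the function `run_hand_detection` are illustrative names, not taken from the package.

```python
from typing import List, Tuple


def to_pixels(landmarks: List[Tuple[float, float]],
              width: int, height: int) -> List[Tuple[int, int]]:
    """Convert normalized (x, y) landmarks in [0, 1] to pixel coordinates."""
    return [(int(x * width), int(y * height)) for x, y in landmarks]


def run_hand_detection(camera_index: int = 0) -> None:
    """Draw hand landmarks on a live camera feed; press q to quit."""
    # Imported lazily so the helper above works without these packages.
    import cv2
    import mediapipe as mp

    mp_hands = mp.solutions.hands
    mp_drawing = mp.solutions.drawing_utils
    capture = cv2.VideoCapture(camera_index)
    with mp_hands.Hands(max_num_hands=2,
                        min_detection_confidence=0.5,
                        min_tracking_confidence=0.5) as hands:
        while capture.isOpened():
            ok, frame = capture.read()
            if not ok:
                break
            # MediaPipe expects RGB input, while OpenCV delivers BGR.
            results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if results.multi_hand_landmarks:
                height, width = frame.shape[:2]
                for hand in results.multi_hand_landmarks:
                    mp_drawing.draw_landmarks(
                        frame, hand, mp_hands.HAND_CONNECTIONS)
                    points = to_pixels(
                        [(lm.x, lm.y) for lm in hand.landmark], width, height)
                    # Landmark 8 is the index fingertip.
                    cv2.circle(frame, points[8], 8, (0, 255, 0), -1)
            cv2.imshow("hands", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    capture.release()
    cv2.destroyAllWindows()


# The helper can be checked without a camera:
print(to_pixels([(0.5, 0.25)], 640, 480))  # [(320, 120)]
```

Calling `run_hand_detection()` starts the camera loop. Each detected hand provides 21 landmarks with normalized coordinates, which is why the conversion to pixels is needed before drawing with OpenCV.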


2.2. Posture detection

Source code location: ~/yahboomcar_ws/src/yahboomcar_mediapipe/scripts
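Once pose landmarks are available, a common next step is to derive joint angles from them. The sketch below is an illustration only and is not taken from the package scripts. It computes the angle at a middle joint from three 2D points, using hypothetical coordinates. In the MediaPipe pose model, landmarks 11, 13, and 15 are the left shoulder, left elbow, and left wrist.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def joint_angle(a: Point, b: Point, c: Point) -> float:
    """Angle in degrees at point b, between the segments b->a and b->c."""
    angle = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0])
        - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    angle = abs(angle)
    # Fold reflex angles back into the 0..180 range.
    return 360.0 - angle if angle > 180.0 else angle


# Hypothetical pixel positions for shoulder, elbow, and wrist.
shoulder, elbow, wrist = (320, 100), (320, 250), (470, 250)
print(round(joint_angle(shoulder, elbow, wrist)))  # 90
```

Pixel coordinates are used here because normalized coordinates are scaled differently in x and y on a non-square image, which would distort the angle.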


2.3. Overall detection

Source code location: ~/yahboomcar_ws/src/yahboomcar_mediapipe/scripts


2.4. Face Detection

Source code location: ~/yahboomcar_ws/src/yahboomcar_mediapipe/scripts


2.5. Face Recognition

Source code location: ~/yahboomcar_ws/src/yahboomcar_mediapipe/scripts


2.6. Face effects

Source code location: ~/yahboomcar_ws/src/yahboomcar_mediapipe/scripts


2.7. 3D object recognition

Source code location: ~/yahboomcar_ws/src/yahboomcar_mediapipe/scripts

Click the camera preview window so that it has keyboard focus, then press the f key to switch the recognition model.
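The key handling described above could be organized as in the sketch below. The model names are those accepted by the legacy `mp.solutions.objectron.Objectron` class. Which models the script actually cycles through, and the class name `ModelSwitcher`, are assumptions made for illustration.

```python
from itertools import cycle

# Model names accepted by the legacy Objectron solution. Whether the
# tutorial script uses all four, or this order, is an assumption.
MODEL_NAMES = ["Shoe", "Chair", "Cup", "Camera"]


class ModelSwitcher:
    """Tracks the active model and advances when the switch key is pressed."""

    def __init__(self, names=MODEL_NAMES, switch_key="f"):
        self._names = cycle(names)
        self.switch_key = ord(switch_key)
        self.current = next(self._names)

    def handle_key(self, key_code: int) -> bool:
        """Return True when the key caused a switch to the next model."""
        if key_code & 0xFF == self.switch_key:
            self.current = next(self._names)
            return True
        return False


switcher = ModelSwitcher()
# Simulated key codes, as cv2.waitKey would return them in a camera loop.
for key in (ord("f"), ord("x"), ord("f")):
    if switcher.handle_key(key):
        # In a real loop the detector would be rebuilt here, for example
        # mp.solutions.objectron.Objectron(model_name=switcher.current).
        print("switched to", switcher.current)
```

The window must have focus because OpenCV only delivers key presses to `cv2.waitKey` while one of its own windows is active.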


2.8. Brush

Source code location: ~/yahboomcar_ws/src/yahboomcar_mediapipe/scripts


2.9. Finger control

Source code location: ~/yahboomcar_ws/src/yahboomcar_mediapipe/scripts
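What the finger control script actually controls is not described here. One common approach, shown below purely as an assumption, is to map the distance between the thumb tip and the index fingertip onto a 0 to 100 value. The function name `pinch_to_percent` and the pixel thresholds are hypothetical. The landmark indices 4 and 8 are those of the MediaPipe hand model.

```python
import math
from typing import List, Tuple

THUMB_TIP, INDEX_TIP = 4, 8  # landmark indices in the MediaPipe hand model


def pinch_to_percent(points: List[Tuple[int, int]],
                     min_px: float = 20.0, max_px: float = 200.0) -> int:
    """Map the thumb-to-index distance in pixels onto a 0-100 scale."""
    (tx, ty), (ix, iy) = points[THUMB_TIP], points[INDEX_TIP]
    distance = math.hypot(ix - tx, iy - ty)
    ratio = (distance - min_px) / (max_px - min_px)
    # Clamp so that distances outside the thresholds give 0 or 100.
    return round(100 * min(1.0, max(0.0, ratio)))


# Synthetic hand with 21 pixel points; only the two fingertips matter here.
hand = [(0, 0)] * 21
hand[THUMB_TIP] = (100, 100)
hand[INDEX_TIP] = (210, 100)
print(pinch_to_percent(hand))  # 50
```

The resulting value could then drive whatever the application adjusts, such as a brightness level or a speed setpoint.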


2.10. Gesture Recognition

Source code location: ~/yahboomcar_ws/src/yahboomcar_mediapipe/scripts
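A simple way to recognize gestures from hand landmarks is to count raised fingers. The sketch below illustrates that heuristic and is not necessarily the method used by the script. It assumes an upright hand and image coordinates in which y grows downward, and it ignores the thumb for brevity. The function name `count_raised_fingers` is illustrative.

```python
from typing import List, Tuple

# (tip, PIP joint) landmark index pairs for the index, middle, ring,
# and little finger in the MediaPipe hand model.
FINGER_JOINTS = [(8, 6), (12, 10), (16, 14), (20, 18)]


def count_raised_fingers(points: List[Tuple[float, float]]) -> int:
    """Count fingers whose tip lies above its PIP joint (smaller y)."""
    return sum(
        1 for tip, pip in FINGER_JOINTS if points[tip][1] < points[pip][1]
    )


# Synthetic hand: all 21 landmarks at y = 0.5, then two fingertips raised.
hand = [(0.5, 0.5)] * 21
hand[8] = (0.4, 0.2)   # index fingertip
hand[12] = (0.5, 0.2)  # middle fingertip
print(count_raised_fingers(hand))  # 2
```

The count can then be mapped to gesture labels, for example treating two raised fingers as a "two" sign.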
