8. 3D Object Recognition

8.1. Introduction

MediaPipe is an open-source framework developed by Google for building machine learning applications that process streaming data. It is built around graph-based data processing pipelines that can handle data sources in various forms, such as video, audio, sensor data, and other time series data. MediaPipe is cross-platform: it runs on embedded platforms (such as Raspberry Pi), mobile devices (iOS and Android), workstations, and servers, and it supports mobile GPU acceleration. It provides customizable ML solutions for real-time and streaming media.

The core framework of MediaPipe is implemented in C++ and provides support for languages such as Java and Objective-C. The main concepts of MediaPipe include packets, streams, calculators, graphs, and subgraphs.
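To make these concepts more concrete, here is a toy model in plain Python. It does not use the MediaPipe API. The names `Packet`, `scale_calculator`, `threshold_calculator`, and `run_graph` are hypothetical and exist only for this illustration. A packet is modelled as a timestamped value, a stream as a sequence of packets, a calculator as a function from one stream to another, and a graph as a simple chain of calculators. Real MediaPipe graphs can branch and merge.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Iterator, List


@dataclass(frozen=True)
class Packet:
    """A timestamped unit of data (toy stand-in, not the MediaPipe class)."""
    timestamp: int
    payload: Any


# A stream is modelled as an iterable of packets in timestamp order.
Stream = Iterable[Packet]
# A calculator is modelled as a function that turns one stream into another.
Calculator = Callable[[Stream], Iterator[Packet]]


def scale_calculator(factor: float) -> Calculator:
    """Calculator that multiplies every payload by `factor`."""
    def process(stream: Stream) -> Iterator[Packet]:
        for packet in stream:
            yield Packet(packet.timestamp, packet.payload * factor)
    return process


def threshold_calculator(limit: float) -> Calculator:
    """Calculator that drops packets whose payload is below `limit`."""
    def process(stream: Stream) -> Iterator[Packet]:
        for packet in stream:
            if packet.payload >= limit:
                yield packet
    return process


def run_graph(calculators: List[Calculator], source: Stream) -> List[Packet]:
    """Feed `source` through the calculators in order (a linear toy graph)."""
    stream = source
    for calculator in calculators:
        stream = calculator(stream)
    return list(stream)


source = [Packet(t, float(v)) for t, v in enumerate([1, 5, 2, 8])]
graph = [scale_calculator(2.0), threshold_calculator(5.0)]
output = run_graph(graph, source)
print([(p.timestamp, p.payload) for p in output])  # [(1, 10.0), (3, 16.0)]
```

In this picture, a subgraph would be a chain of calculators wrapped up and reused as if it were a single calculator.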

Features of MediaPipe:

8.2. 3D object recognition

3D object recognition: the recognizable objects fall into 4 categories, ['Shoe', 'Chair', 'Cup', 'Camera']. Press the f key to switch the object category being recognized.
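The category switching described above could be sketched as follows. This is a minimal illustration, not the contents of the shipped script. It assumes a MediaPipe release that still includes the legacy `mp.solutions.objectron` module, and it assumes the camera can be opened with OpenCV's `cv2.VideoCapture`. The helper names `next_category`, `make_detector`, and `run`, the cycling order, the confidence thresholds, and the q key for quitting are all assumptions made for the example.

```python
# Categories listed in this section; the cycling order is an assumption.
CATEGORIES = ['Shoe', 'Chair', 'Cup', 'Camera']


def next_category(current):
    """Return the category that follows `current`, wrapping around at the end."""
    index = CATEGORIES.index(current)
    return CATEGORIES[(index + 1) % len(CATEGORIES)]


def make_detector(model_name):
    """Build an Objectron detector for one category (needs the mediapipe package)."""
    import mediapipe as mp  # imported lazily so next_category works without it
    return mp.solutions.objectron.Objectron(
        static_image_mode=False,
        max_num_objects=5,
        min_detection_confidence=0.5,   # illustrative thresholds, not tuned
        min_tracking_confidence=0.99,
        model_name=model_name,
    )


def run(camera_index=0):
    """Show detections from a camera; f switches category, q quits (assumed keys)."""
    import cv2
    import mediapipe as mp

    mp_drawing = mp.solutions.drawing_utils
    mp_objectron = mp.solutions.objectron

    category = CATEGORIES[0]
    detector = make_detector(category)
    capture = cv2.VideoCapture(camera_index)
    while capture.isOpened():
        ok, frame = capture.read()
        if not ok:
            break
        # Objectron expects RGB input, while OpenCV delivers BGR frames.
        results = detector.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.detected_objects:
            for obj in results.detected_objects:
                # Draw the 3D bounding box and the object's pose axes.
                mp_drawing.draw_landmarks(
                    frame, obj.landmarks_2d, mp_objectron.BOX_CONNECTIONS)
                mp_drawing.draw_axis(frame, obj.rotation, obj.translation)
        cv2.putText(frame, category, (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        cv2.imshow('Objectron', frame)
        key = cv2.waitKey(1) & 0xFF
        if key == ord('q'):
            break
        if key == ord('f'):
            # Each Objectron instance loads one model, so switching rebuilds it.
            detector.close()
            category = next_category(category)
            detector = make_detector(category)
    detector.close()
    capture.release()
    cv2.destroyAllWindows()
```

Calling `run()` opens the camera window. Because an Objectron instance is tied to a single `model_name`, this sketch closes the current detector and creates a new one each time the category changes.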

8.2.1. Start

Start the camera

Terminal input,


8.2.2. Source code

Source code location: /home/yahboom/ascam_ws/src/yahboomcar_mediapipe/scripts/08_Objectron.py