AjDep/VBRAIS


Vision-Based Arm Control System

This project is a final-year-project-style prototype that uses computer vision to capture human hand and body movement, converts that movement into joint angles, and sends the angles to an ESP32-controlled robotic arm. It also includes a backend and a frontend for storing, browsing, replaying, and managing recorded angle CSV files.

At a high level, the system works like this:

  1. The Python vision app detects body and hand landmarks with OpenCV and MediaPipe.
  2. The detected movement is converted into angle values and streamed to an ESP32 over TCP.
  3. The ESP32 drives four servos to mirror the motion.
  4. The backend stores angle recordings in MySQL and can replay saved recordings or start new ones.
  5. The React frontend provides a simple dashboard to upload, list, preview, download, delete, and replay recordings.

Project Structure

  • frontend/ - React + Vite dashboard for managing recordings.
  • backend/ - Express API for recordings, replay, and Python recording control.
  • ml/ - Python computer vision code for hand, pose, and face processing.
  • arduino/ - ESP32 sketch that receives joint angles and controls servos.
  • models/ - Trained face model files and Haar cascade assets.
  • data/ - Sample media and generated logs used by the Python scripts.
  • src/ - Main Python package for reusable vision modules and demo apps.

What It Does

  • Detects hand and body movement from a webcam feed.
  • Calculates joint angles from MediaPipe landmarks.
  • Smooths motion to reduce jitter before sending commands.
  • Sends angle packets to an ESP32 that controls a 4-servo robotic arm.
  • Stores angle CSV files so they can be replayed later.
  • Lets you manage recordings through a web UI instead of working only from the terminal.
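The smoothing step can be illustrated with a scalar Kalman filter applied to a single joint angle. This is a minimal sketch rather than the project's actual filter: the class name `AngleKalman1D` and the default values of `process_var` and `measurement_var` are assumptions chosen for illustration, and the model simply assumes the angle stays roughly constant between frames.

```python
class AngleKalman1D:
    """Scalar Kalman filter with a constant-angle model (hypothetical helper)."""

    def __init__(self, process_var=0.5, measurement_var=4.0):
        self.q = process_var      # assumed drift of the true angle per frame
        self.r = measurement_var  # assumed noise of each measured angle
        self.x = None             # current angle estimate in degrees
        self.p = 1.0              # variance of the estimate

    def update(self, measured_angle):
        if self.x is None:
            # First sample: use the measurement as the initial estimate.
            self.x = float(measured_angle)
            return self.x
        # Predict: the angle is assumed unchanged, so only uncertainty grows.
        self.p += self.q
        # Correct: blend prediction and measurement using the Kalman gain.
        gain = self.p / (self.p + self.r)
        self.x += gain * (measured_angle - self.x)
        self.p *= 1.0 - gain
        return self.x


# Jittery elbow readings around 90 degrees (made-up sample values).
noisy = [90, 94, 87, 92, 89, 93, 88, 91]
kf = AngleKalman1D()
smoothed = [kf.update(a) for a in noisy]
```

A larger `measurement_var` makes the output steadier but slower to follow real movement, so in practice the two values would be tuned against the arm's observed lag and jitter.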

Main Components

Python Vision Code

The Python code under ml/ and src/ contains the computer vision logic. It includes:

  • hand tracking modules
  • pose estimation modules
  • face detection and recognition utilities
  • image processing demos
  • combined apps such as CombinationHandPose.py, which connects vision output to the ESP32 and backend
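As an illustration of how a joint angle can be derived from landmarks, the sketch below computes the angle at a middle point from three 2D points, for example shoulder, elbow, and wrist. MediaPipe reports landmarks as normalized coordinates, so plain `(x, y)` tuples stand in for them here. The function name `joint_angle` and the sample coordinates are assumptions and are not taken from the repository.

```python
import math


def joint_angle(a, b, c):
    """Return the angle at point b, in degrees, between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return 0.0  # degenerate case: two landmarks coincide
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Guard against rounding pushing the cosine slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Made-up normalized landmark positions forming a right angle at the elbow.
shoulder, elbow, wrist = (0.50, 0.30), (0.50, 0.50), (0.70, 0.50)
elbow_angle = joint_angle(shoulder, elbow, wrist)
```

Working in 2D ignores depth, so an arm pointing toward the camera would be misread. A 3D variant would apply the same dot-product formula to `(x, y, z)` vectors.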

Backend API

The backend in backend/ is an Express service that:

  • stores recordings in MySQL
  • accepts CSV uploads
  • lists and downloads recordings
  • deletes recordings
  • replays saved recordings
  • can start and stop the Python recording process

It uses an API key for protected routes and is configured through environment variables.
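To make the replay feature more concrete, the sketch below parses an angle recording and works out how long to wait before each frame so that the original timing is preserved. The CSV layout is not documented here, so the column names `t_ms`, `s1`, `s2`, `s3`, and `s4` are hypothetical, as are the function names. The backend itself is an Express service, and this Python version only illustrates the idea.

```python
import csv
import io


def read_recording(csv_text):
    """Parse CSV text into (timestamp_ms, [four angles]) tuples."""
    frames = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Column names are assumed for illustration only.
        angles = [float(row[name]) for name in ("s1", "s2", "s3", "s4")]
        frames.append((int(row["t_ms"]), angles))
    return frames


def replay_delays(frames):
    """Return the wait in seconds before each frame, keeping the recorded pacing."""
    delays = []
    previous = frames[0][0] if frames else 0
    for t_ms, _ in frames:
        delays.append((t_ms - previous) / 1000.0)
        previous = t_ms
    return delays


sample = "t_ms,s1,s2,s3,s4\n0,90,45,120,10\n50,91,46,119,10\n120,93,47,118,12\n"
frames = read_recording(sample)
delays = replay_delays(frames)
```

A replay loop would then sleep for each delay and send the matching angles to the arm.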

Frontend Dashboard

The frontend in frontend/ is a React dashboard for:

  • uploading recordings
  • viewing saved recordings
  • previewing or downloading CSV files
  • deleting recordings
  • checking backend health
  • starting and stopping recordings from the browser

ESP32 / Arduino

The Arduino sketch in arduino/ connects the ESP32 to Wi-Fi, receives angle packets, and moves four servos with smoothing and safety limits.
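The sender side of this exchange might look like the sketch below, which clamps four angles to a safe range and writes them to the ESP32 over TCP. The wire format is not specified in this README, so the comma-separated, newline-terminated ASCII line is an assumption, as are the 0 to 180 degree limits and the names `encode_packet` and `send_angles`.

```python
import socket

SERVO_MIN, SERVO_MAX = 0, 180  # assumed safe range in degrees


def encode_packet(angles):
    """Clamp four angles and encode them as one newline-terminated ASCII line."""
    if len(angles) != 4:
        raise ValueError("expected exactly four joint angles")
    clamped = [max(SERVO_MIN, min(SERVO_MAX, round(a))) for a in angles]
    return (",".join(str(a) for a in clamped) + "\n").encode("ascii")


def send_angles(host, port, angles, timeout=1.0):
    """Open a TCP connection, send one packet, and close the connection."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(encode_packet(angles))


# Out-of-range values are clamped before they are encoded.
packet = encode_packet([90.4, -15, 200, 45])
```

Clamping on the sender as well as on the ESP32 means a bad landmark reading is caught twice before it can reach a servo. For continuous streaming, one long-lived connection would be preferable to reconnecting for every packet.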

Requirements

  • Python 3.10 or newer
  • Node.js 18+ and npm
  • MySQL database for the backend
  • ESP32 board with servo hardware
  • Webcam for the vision scripts

Python dependencies typically include OpenCV, MediaPipe, NumPy, and requests. The backend depends on Express, Multer, MySQL2, CORS, and dotenv. The frontend uses React, Vite, Axios, and React DOM.

Setup

1. Backend

cd backend
npm install

Create a .env file with your database credentials, API key, and any custom paths or network settings. Then start the server:

npm start

By default, the backend listens on port 8000.

2. Frontend

cd frontend
npm install

Create .env.local and set the API URL and API key used by the dashboard. Then run:

npm run dev

3. Python Vision Code

Run the Python scripts from the repository root so relative imports resolve correctly. A few useful entry points are:

python -m src.apps.CombinationHandPose
python -m src.hand.TestingHandTrackingModule
python -m src.pose.TestingPoseModule
python -m src.video.readVid

4. ESP32 Sketch

Open arduino/arduino.ino in the Arduino IDE or PlatformIO, update the Wi-Fi and device settings for your setup, and flash it to the ESP32.

Data And Models

  • Sample media is stored in data/raw/.
  • Captured logs are stored in data/logs/.
  • Trained models and Haar cascade files are stored in models/.
  • Shared path helpers live in src/common/paths.py.
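A shared path helper for this layout could be as small as the sketch below. It is not the content of src/common/paths.py: the function name `project_paths` and the dictionary keys are hypothetical, and only the directory names come from the list above.

```python
from pathlib import Path


def project_paths(repo_root):
    """Map short names to the data and model directories under repo_root."""
    root = Path(repo_root)
    return {
        "raw": root / "data" / "raw",    # sample media
        "logs": root / "data" / "logs",  # captured logs
        "models": root / "models",       # trained models and Haar cascades
    }


paths = project_paths(".")
```

Resolving every location from one root means scripts do not need to hard-code relative paths.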

Notes

  • The project is designed as a modular system, so you can use the vision code, backend, frontend, or ESP32 sketch separately if needed.
  • The hand and pose pipeline is the core of the arm control flow.
  • If you add new scripts, keep shared utilities in src/common/ and runnable apps in src/apps/.

Diagrams

  • Class diagram (Mermaid): docs/class-diagram.md
  • Component diagram (Mermaid): docs/component-diagram.md

About

Vision-Based Robotic Arm Imitation System: a 4-DOF servo arm that mirrors human hand movement in real time. It uses MediaPipe pose estimation and Kalman filtering to smooth joint angles before sending commands to an ESP32. Published at IEEE ISDFS 2026.
