This project is a prototype in the style of a final-year project. It uses computer vision to capture human hand and body-pose movement, converts those movements into joint angles, and sends the angles to an ESP32-controlled robotic arm. It also includes a backend and frontend for storing, browsing, replaying, and managing recorded angle CSV files.
At a high level, the system works like this:
- The Python vision app detects body and hand landmarks with OpenCV and MediaPipe.
- The detected movement is converted into angle values and streamed to an ESP32 over TCP.
- The ESP32 drives four servos to mirror the motion.
- The backend stores angle recordings in MySQL and can replay or start new recordings.
- The React frontend provides a simple dashboard to upload, list, preview, download, delete, and replay recordings.
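The streaming step above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual wire format: the comma-separated, newline-terminated packet layout, the 0-180 degree clamp, and the names `format_packet` and `send_angles` are all assumptions made for the example.

```python
import socket
from typing import Sequence


def format_packet(angles: Sequence[float]) -> bytes:
    """Encode four joint angles as one newline-terminated ASCII line.

    The comma-separated layout is an assumption for illustration only.
    """
    if len(angles) != 4:
        raise ValueError("expected exactly four joint angles")
    # Round to whole degrees and clamp to an assumed 0-180 servo range.
    clamped = [max(0, min(180, round(a))) for a in angles]
    return (",".join(str(a) for a in clamped) + "\n").encode("ascii")


def send_angles(host: str, port: int, angles: Sequence[float],
                timeout: float = 2.0) -> None:
    """Open a TCP connection, send one packet, and close the socket."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(format_packet(angles))
```

A real-time app would more likely keep one connection open and reuse it for every frame; opening a socket per packet keeps the sketch short.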
- frontend/ - React + Vite dashboard for managing recordings.
- backend/ - Express API for recordings, replay, and Python recording control.
- ml/ - Python computer vision code for hand, pose, and face processing.
- arduino/ - ESP32 sketch that receives joint angles and controls servos.
- models/ - Trained face model files and Haar cascade assets.
- data/ - Sample media and generated logs used by the Python scripts.
- src/ - Main Python package for reusable vision modules and demo apps.
- Detects hand and body movement from a webcam feed.
- Calculates joint angles from MediaPipe landmarks.
- Smooths motion to reduce jitter before sending commands.
- Sends angle packets to an ESP32 that controls a 4-servo robotic arm.
- Stores angle CSV files so they can be replayed later.
- Lets you manage recordings through a web UI instead of working only from the terminal.
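One common way to reduce jitter before sending commands is an exponential moving average over the angle vector. The sketch below assumes that technique; the project may use a different filter, and the class name `AngleSmoother` and the default `alpha` are hypothetical.

```python
from typing import List, Optional, Sequence


class AngleSmoother:
    """Exponential moving average over a fixed-length vector of joint angles."""

    def __init__(self, alpha: float = 0.3) -> None:
        # alpha close to 1 follows the input quickly; close to 0 smooths more.
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self._state: Optional[List[float]] = None

    def update(self, angles: Sequence[float]) -> List[float]:
        """Blend the new angles into the running state and return a copy."""
        if self._state is None:
            # The first sample initialises the filter unchanged.
            self._state = [float(a) for a in angles]
        else:
            self._state = [
                self.alpha * new + (1.0 - self.alpha) * old
                for new, old in zip(angles, self._state)
            ]
        return list(self._state)
```

A smaller `alpha` gives steadier servo motion at the cost of more lag between the operator's movement and the arm's response.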
The Python code under ml/ and src/ contains the computer vision logic. It includes:
- hand tracking modules
- pose estimation modules
- face detection and recognition utilities
- image processing demos
- combined apps such as CombinationHandPose.py, which connects vision output to the ESP32 and backend
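To illustrate how landmarks can become a joint angle, the sketch below computes the angle at a middle point from three 2D points, for example shoulder, elbow, and wrist. MediaPipe reports landmarks with normalized x and y coordinates, so plain tuples stand in for them here. The function name `joint_angle` is hypothetical, and the project's modules may compute angles differently.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def joint_angle(a: Point, b: Point, c: Point) -> float:
    """Return the angle at b in degrees (0-180) between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0.0:
        raise ValueError("outer landmarks must not coincide with the joint")
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    # Guard against rounding pushing the cosine slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))
```

For instance, `joint_angle((1, 0), (0, 0), (0, 1))` describes a right angle at the origin, and three collinear points with the joint in the middle describe a fully extended limb.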
The backend in backend/ is an Express service that:
- stores recordings in MySQL
- accepts CSV uploads
- lists and downloads recordings
- deletes recordings
- replays saved recordings
- can start and stop the Python recording process
It uses an API key for protected routes and is configured through environment variables.
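A client call to a protected route might look like the sketch below, which uses only the Python standard library. The route `/api/recordings` and the header name `x-api-key` are assumptions for illustration; check the backend source for the real route names and the header it expects.

```python
import urllib.request


def build_request(base_url: str, path: str, api_key: str) -> urllib.request.Request:
    """Build a GET request that carries the API key in a header.

    The header name "x-api-key" is hypothetical, not confirmed by the project.
    """
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(url, headers={"x-api-key": api_key}, method="GET")


def fetch(request: urllib.request.Request, timeout: float = 5.0) -> bytes:
    """Send the request and return the raw response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

With the backend running locally on port 8000, `fetch(build_request("http://localhost:8000", "/api/recordings", "your-key"))` would return the response body, assuming that route exists.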
The frontend in frontend/ is a React dashboard for:
- uploading recordings
- viewing saved recordings
- previewing or downloading CSV files
- deleting recordings
- checking backend health
- starting and stopping recordings from the browser
The Arduino sketch in arduino/ connects the ESP32 to Wi-Fi, receives angle packets, and moves four servos with smoothing and safety limits.
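The smoothing and safety-limit idea can be illustrated with a small step limiter. The real sketch is Arduino code; the Python below only models the logic, and the 0-180 degree limits, the `max_step` parameter, and the function names are assumptions rather than values taken from arduino.ino.

```python
def clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def step_toward(current: float, target: float, max_step: float,
                low: float = 0.0, high: float = 180.0) -> float:
    """Move current toward target by at most max_step degrees per update.

    The target is first clamped to the assumed safe range [low, high].
    """
    safe_target = clamp(target, low, high)
    delta = clamp(safe_target - current, -max_step, max_step)
    return current + delta
```

Calling `step_toward` once per control-loop iteration keeps each servo from jumping to a distant angle in one move and never commands a position outside the configured range.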
- Python 3.10 or newer
- Node.js 18+ and npm
- MySQL database for the backend
- ESP32 board with servo hardware
- Webcam for the vision scripts
Python dependencies typically include OpenCV, MediaPipe, NumPy, and requests. The backend depends on Express, Multer, MySQL2, CORS, and dotenv. The frontend uses React, Vite, Axios, and React DOM.
cd backend
npm install

Create a .env file with your database credentials, API key, and any custom paths or network settings. Then start the server:

npm start

By default, the backend listens on port 8000.
cd frontend
npm install

Create .env.local and set the API URL and API key used by the dashboard. Then run:

npm run dev

Run the Python scripts from the repository root so relative imports resolve correctly. A few useful entry points are:
python -m src.apps.CombinationHandPose
python -m src.hand.TestingHandTrackingModule
python -m src.pose.TestingPoseModule
python -m src.video.readVid

Open arduino/arduino.ino in the Arduino IDE or PlatformIO, update the Wi-Fi and device settings for your setup, and flash it to the ESP32.
- Sample media is stored in data/raw/.
- Captured logs are stored in data/logs/.
- Trained models and Haar cascade files are stored in models/.
- Shared path helpers live in src/common/paths.py.
- The project is designed as a modular system, so you can use the vision code, backend, frontend, or ESP32 sketch separately if needed.
- The hand and pose pipeline is the core of the arm control flow.
- If you add new scripts, keep shared utilities in src/common/ and runnable apps in src/apps/.
- Class diagram (Mermaid): docs/class-diagram.md
- Component diagram (Mermaid): docs/component-diagram.md