This repository contains the source code and experimental setup for a Bachelor's thesis project that implements and evaluates the accuracy of two distinct 6D pose estimation methods:
- Marker-based: ArUco approach
- Marker-less: FoundationPose model
The core of the experiment involves using a Universal Robots UR3e robotic arm to provide precise ground truth data. The robot moves a test object through a predefined 3D grid while a camera captures the scene. The system compares the estimated poses from the vision models against the robot's known poses.
The setup includes:
- UR3e robotic arm: Holds the test object.
- Intel RealSense D435i: Captures RGB-D data.
- Computer: Runs all processing and control scripts.
- LAN Router: Enables communication between components.
- A main Python script orchestrates the flow.
- TCP/IP via RTDE handles communication between the PC and UR3e.
- Local sockets handle communication between Python processes.
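The local inter-process link can be sketched as a plain TCP trigger channel. This is a minimal illustration only; the port and message format used by the actual scripts are assumptions:

```python
import socket
import threading

results = []

def capture_listener(server_sock):
    """Stand-in for the pose-estimation process: wait for capture triggers."""
    conn, _ = server_sock.accept()
    with conn:
        while True:
            msg = conn.recv(1024)
            if not msg:  # peer closed the connection
                break
            results.append(msg.decode())

# Pose-estimation side: listen on a local port
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # 0 = let the OS pick a free port
port = server.getsockname()[1]
server.listen(1)
t = threading.Thread(target=capture_listener, args=(server,))
t.start()

# Robot-control side: send one trigger per reference pose
client = socket.create_connection(("127.0.0.1", port))
client.sendall(b"CAPTURE pose_001")
client.close()

t.join()
server.close()
print(results)
```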
| Component | ArUco | FoundationPose | Purpose |
|---|---|---|---|
| Robotic Arm | UR3e | UR3e | Ground truth generation |
| Camera | Intel RealSense D435i | Intel RealSense D435i | Image & Depth data acquisition |
| Computer | i7-7700HQ, 16 GB RAM | i9-10850K, 128 GB RAM | Data processing & model inference |
| GPU | GTX 1060 | RTX 3090 (24 GB VRAM) | Model training & inference |
| Network | LAN Router | LAN Router | PC ↔ Robot communication |
- OS: Windows 10 (ArUco), Ubuntu 22.04.5 LTS (FoundationPose)
- Python: 3.10 (via Anaconda)
- Libraries: OpenCV, NumPy, rtde-client
- Virtualization: Docker (required for FoundationPose)
Dependencies:
- NVIDIA Drivers & CUDA Toolkit (12.1+)
- FoundationPose, BundleSDF, and SAM repositories
- Assemble hardware as shown above.
- Connect the computer and UR3e to the same LAN.
- Assign static IPs (as per the thesis or router config).
- Ensure the PC can ping the robot.
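Beyond `ping`, reachability can be verified from Python by attempting a TCP connection to the robot's RTDE port (30004 on UR controllers). The IP below is a placeholder for your robot's static address:

```python
import socket

def robot_reachable(host: str, port: int = 30004, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the robot's RTDE port succeeds.

    Port 30004 is the standard UR RTDE port; adjust host to your setup.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: replace with the robot's static IP from your router config
print(robot_reachable("192.168.1.10"))
```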
```bash
# Clone the repo
git clone https://github.com/MaxPett/ArUco-marker-FoundationPose-UR3e
cd BSc-MMB

# Create and activate a virtual environment
conda create -n pose_env python=3.10
conda activate pose_env

# Install required packages
pip install -r requirements.txt
```

Install the RTDE library into the `UR3e` folder and follow its installation guide.
- Install Docker and NVIDIA Container Toolkit on your Ubuntu machine.
Copy the following files from the `UR3e/` folder to a USB drive, and load them onto the UR3e control panel:

- `rtde_control_loop.urp`
- `control_loop_configuration.xml`
Follow the installation guides from:
```bash
python gen_pattern.py  # Generates checkerboard.svg
```

- Print `checkerboard.svg` on A4 paper.
- Attach it to the UR3e end-effector.
- Run `calibrate.py` and move the robot through 15–20 poses.
- Press spacebar to capture a frame, ESC to finish.
- The calibration file is saved to `calibration/`.
```bash
python run_pose_estimation.py
```

User Interface Prompts:

- Pose type: ArUco or FoundationPose
- Robot IP
- ArUco tag type (e.g., `DICT_7X7_50`) and marker size
- Object name

Automated Scripts:

- `generate_aruco_tags.py` (if ArUco is selected)
- `ur3e_control_loop.py` (launches robot control)
- `pose_estimation.py` or `foundation_pose.py` (depending on the method)
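Internally, the orchestrator can hand off to the method-specific script as a subprocess. A simplified sketch: the script names follow the repository layout above, but the command-line arguments are assumptions:

```python
import subprocess
import sys

# Map the UI choice to the script the orchestrator launches
# (script names from the repository; CLI arguments are illustrative).
METHOD_SCRIPTS = {
    "ArUco": "pose_estimation.py",
    "FoundationPose": "foundation_pose.py",
}

def launch_method(method: str, object_name: str) -> subprocess.Popen:
    script = METHOD_SCRIPTS[method]
    return subprocess.Popen([sys.executable, script, "--object", object_name])

# Stand-in demonstration: run a trivial child process the same way
proc = subprocess.Popen([sys.executable, "-c", "print('estimator ready')"],
                        stdout=subprocess.PIPE, text=True)
out, _ = proc.communicate()
print(out.strip())  # estimator ready
```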
On the UR3e panel:

- Run the `rtde_control_loop.urp` program and press Start.
The robot executes 351 reference poses and triggers the PC for each.
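Such a reference grid can be generated as a Cartesian product of axis positions. Purely illustrative: the real extents and spacing are defined in the robot control script, and the 9 × 13 × 3 split below is only one assumed layout that yields 351 poses:

```python
import itertools

# Hypothetical axis positions (metres); the actual grid geometry is
# defined in ur3e_control_loop.py and the thesis.
xs = [0.05 * i for i in range(9)]   # 9 steps in x
ys = [0.05 * i for i in range(13)]  # 13 steps in y
zs = [0.05 * i for i in range(3)]   # 3 steps in z

grid = list(itertools.product(xs, ys, zs))
print(len(grid))  # 351
```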
- Output: `.jpg` images. Each filename encodes both the ground-truth and the estimated pose:
  `pose_estimation/{object_name}/{object_name}_{ground_truth_pose}_{estimated_pose}.jpg`
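Because each filename carries both poses, results can be recovered by string parsing. A hypothetical helper, assuming the individual fields contain no underscores (the actual field encoding is defined by the capture scripts):

```python
import os

def parse_pose_filename(path: str):
    """Split a result filename into object name, ground-truth pose string,
    and estimated pose string. Assumes each field contains no underscores.
    """
    stem = os.path.splitext(os.path.basename(path))[0]
    object_name, gt_pose, est_pose = stem.split("_")
    return object_name, gt_pose, est_pose

print(parse_pose_filename("pose_estimation/box/box_gt1_est2.jpg"))
# ('box', 'gt1', 'est2')
```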
- Output: `.json` log with timestamps and ground-truth poses:
  `pose_estimation/{object_name}/{object_name}.json`
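Such a log can be produced and consumed with the standard `json` module. The entry fields below are assumptions for illustration, not the repository's actual schema:

```python
import json
import tempfile
import time
from pathlib import Path

# Write one illustrative log entry (field names are assumptions).
log_path = Path(tempfile.gettempdir()) / "demo_object.json"
entries = [{
    "timestamp": time.time(),
    "pose_index": 1,
    "ground_truth_pose": [0.1, 0.2, 0.3, 0.0, 3.14, 0.0],  # x, y, z, rx, ry, rz
}]
log_path.write_text(json.dumps(entries, indent=2))

# Read it back, as an offline evaluation step would
loaded = json.loads(log_path.read_text())
print(loaded[0]["pose_index"])  # 1
```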
Main orchestrator that:
- Launches UI
- Calls calibration scripts
- Starts robot control and pose estimation
- Captures images on robot triggers
- Estimates 6D pose with OpenCV ArUco
- Saves frames with ground truth and estimated data in filename
- Logs timestamp and ground truth pose from robot
- Saves to `.json` for offline FoundationPose evaluation
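Offline evaluation then reduces to comparing logged ground-truth poses against the estimates. A sketch of one possible metric pair, assuming UR-style `(x, y, z, rx, ry, rz)` poses with axis-angle rotation; the thesis may define its metrics differently:

```python
import numpy as np

def pose_errors(gt, est):
    """Translation error (Euclidean, input units) and rotation error
    (angle in radians) between two (x, y, z, rx, ry, rz) poses."""
    gt, est = np.asarray(gt, float), np.asarray(est, float)
    t_err = np.linalg.norm(gt[:3] - est[:3])

    def to_matrix(rvec):
        # Rodrigues' formula: axis-angle vector -> rotation matrix
        theta = np.linalg.norm(rvec)
        if theta < 1e-12:
            return np.eye(3)
        k = rvec / theta
        K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
        return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

    # Relative rotation between the two poses; its angle is the error
    R_rel = to_matrix(gt[3:]) @ to_matrix(est[3:]).T
    angle = np.arccos(np.clip((np.trace(R_rel) - 1) / 2, -1.0, 1.0))
    return t_err, angle

t_err, r_err = pose_errors([0, 0, 0.3, 0, 0, 0], [0.01, 0, 0.3, 0, 0, 0.1])
print(round(t_err, 3), round(r_err, 3))  # 0.01 0.1
```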
The results and conclusions of the study, along with the limitations outlined below, can be reviewed at: 6D Object Pose Estimation Using Classical and Deep Learning Approaches @ CPS @ Montanuniversität Leoben
- There is likely an issue with the calculated rotated positions at the reference points. These may not accurately reflect the intended orientations and could affect evaluation metrics for both methods.