This repository contains the official implementation for the paper:
Formal Analysis on Environment Design for Multi-Robot Navigation using Reinforcement Learning
We develop a scalable MDP-based simulation framework for multi-robot navigation and path planning under continuous control. The framework supports:
- Multi-UAV systems
- Multi-wheeled robot systems
- Parallel training with vectorized environments
- Multiple deep reinforcement learning algorithms
We recommend running this project on Linux.
All required dependencies are provided in environment.yaml.
conda env create -f environment.yaml
conda activate <environment_name>
If you modify the environment:
conda env export --no-builds > environment.yaml
Experiment configurations are stored in:
exp_sets/
├── uav/
│   └── cont_sets.json
└── wheeled/
    └── envX.ini
- UAV environments are loaded from JSON files.
- Wheeled robot environments are loaded from .ini files.
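The two loading paths above can be sketched as a single helper that picks a parser from the file extension. This is only an illustration: the name `load_experiment_config` is hypothetical, the keys inside each file are not documented here, and the repository's own loader may work differently, so the sketch returns the parsed contents unchanged.

```python
import configparser
import json
from pathlib import Path


def load_experiment_config(path):
    """Parse an experiment file by extension (hypothetical helper, not the repo's API)."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".json":
        # UAV sets: return whatever JSON structure the file holds.
        with path.open(encoding="utf-8") as fh:
            return json.load(fh)
    if suffix == ".ini":
        # Wheeled-robot sets: return {section: {key: value}} with string values.
        parser = configparser.ConfigParser()
        with path.open(encoding="utf-8") as fh:
            parser.read_file(fh)
        return {section: dict(parser[section]) for section in parser.sections()}
    raise ValueError(f"Unsupported config format: {path.suffix!r}")
```

Note that `configparser` keeps every value as a string, so numeric fields in an .ini file would still need explicit conversion.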
Training supports both UAV and wheeled robot environments.
Supported algorithms:
- A2C
- PPO
- TRPO
- ARS
- CrossQ
- TQC
python train.py --algorithm CrossQ --robot_type uav --set 3 --num_robots 3
python train.py --algorithm PPO --robot_type wheeled_robot --set 1
python train.py \
--algorithm {A2C,PPO,TRPO,ARS,CrossQ,TQC} \
--robot_type {uav,wheeled_robot} \
--set [set number] \
--num_robots [int, UAV only] \
--verbose {0,1,2} \
--steps [training steps] \
--num_envs [parallel envs] \
--seed [seed] \
--log_steps [logging interval] \
--resume {True,False} \
--use_tuned_params {True,False} \
--device {cpu,cuda}
Training logs are saved in:
logs/
├── training_default_logs/
└── training_best_logs/
Each run creates:
logs/.../<robot>_<algorithm>_setX_seedY_v0/
├── tensorboard/
├── checkpoints/trained_model.zip
├── log.txt
├── progress.csv
└── progress.json
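To inspect a run without opening TensorBoard, the layout above can be read directly. The sketch below makes several assumptions: both helper names are hypothetical, the trailing `_v0` is treated as a version counter, the exact strings used for `<robot>` are not confirmed, and the columns of progress.csv are unknown, so the last row is returned as a plain dictionary.

```python
import csv
from pathlib import Path


def run_dir_name(robot, algorithm, set_id, seed, version=0):
    """Compose a folder name following the <robot>_<algorithm>_setX_seedY_v0 pattern."""
    # Assumption: the trailing number is a version counter that starts at 0.
    return f"{robot}_{algorithm}_set{set_id}_seed{seed}_v{version}"


def last_progress_row(run_dir):
    """Return the final row of progress.csv as a dict, or None if it has no data rows."""
    last = None
    with (Path(run_dir) / "progress.csv").open(newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            last = row
    return last
```

Values come back as strings because `csv.DictReader` does no type conversion.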
To monitor training with TensorBoard:
tensorboard --logdir=logs
To resume a previous training run:
python train.py \
--algorithm CrossQ \
--robot_type uav \
--set 3 \
--resume True
For UAVs:
python3 tune.py --algorithm A2C --robot_type uav --set 1 --num_robots 3
For wheeled robots:
python3 tune.py --algorithm A2C --robot_type wheeled_robot --set 1
The currently implemented algorithms are A2C, PPO, TRPO, TQC, ARS, and CrossQ.
The valid values for --set depend on the sets defined in the exp_sets directory.
The --num_robots parameter is only used for UAVs, as wheeled robot experiments have a fixed number of robots in the configuration files.
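The rule that --num_robots applies only to UAVs can be enforced at argument-parsing time. The following is a minimal sketch using `argparse`, not the actual parser in train.py or tune.py: the function name `parse_common_args`, the choice to make the first three flags required, and the decision to reject (rather than ignore) --num_robots for wheeled robots are all assumptions.

```python
import argparse

ALGORITHMS = ["A2C", "PPO", "TRPO", "TQC", "ARS", "CrossQ"]


def parse_common_args(argv=None):
    """Parse the flags shared by the documented commands (illustrative only)."""
    parser = argparse.ArgumentParser()
    # Assumption: these three flags are required; the real scripts may define defaults.
    parser.add_argument("--algorithm", choices=ALGORITHMS, required=True)
    parser.add_argument("--robot_type", choices=["uav", "wheeled_robot"], required=True)
    parser.add_argument("--set", type=int, required=True, dest="set_id")
    parser.add_argument("--num_robots", type=int, default=None)
    args = parser.parse_args(argv)
    # Assumed policy: reject --num_robots for wheeled robots instead of ignoring it.
    if args.robot_type == "wheeled_robot" and args.num_robots is not None:
        parser.error("--num_robots applies only to --robot_type uav")
    return args
```

Failing early in this way surfaces a mismatched flag before any training time is spent, since `parser.error` prints a usage message and exits.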
Tuning can be further configured using the following command format:
python3 tune.py \
--algorithm {A2C, PPO, TRPO, TQC, ARS, CrossQ} \
--robot_type {uav, wheeled_robot} \
--set [set number] \
--num_robots [number of UAVs, if robot_type is uav] \
--trials [number of tuning trials] \
--steps [number of training steps per trial] \
--num_envs [number of parallel environments] \
--num_eval_eps [number of episodes for evaluation] \
--seed [random seed] \
--log_steps [logging interval] \
--device {cpu, cuda}
Train a model before running simulation, or use the pretrained models.
Running python run.py on its own simulates a trained model with 3 robots in Pygame. To simulate a custom trained model, use the following command format:
python run.py --path "./trained_models/uav/cont_env1_2robots_CrossQ.zip" --algorithm CrossQ --robot_type uav --set 1 --num_robots 2
We use CoppeliaSim for realistic 3D simulation.
Download from:
https://coppeliarobotics.com/
Open the following file in the CoppeliaSim Simulator:
coppeliasim_envs/uav_common_env.ttt
Important:
- Reopen the scene before every run.
- Do NOT save changes when closing.
Running python run.py --simulate True simulates a trained model with 3 robots in CoppeliaSim. To simulate a custom trained model, use the following command format:
python run.py --path "./trained_models/uav/cont_env1_2robots_CrossQ.zip" --algorithm CrossQ --robot_type uav --set 1 --num_robots 2 --simulate True
Slurm scripts are provided in the directory:
slurm_scripts/
Run:
sbatch slurm_scripts/train_all.sh
To train a single environment, edit and run:
slurm_scripts/train_one_env.sh
To train a single algorithm, edit and run:
slurm_scripts/train_one_alg.sh
To visualize experiment layouts, use the notebooks in the plotting folder.
To simplify running simulations, we provide notebooks for both UAVs and wheeled robots in the notebooks folder.

