[Networking] TurtleBot4 on university network (multicast blocked) — odom→base_link TF discovery too slow over Tailscale VPN causing Nav2 activation failure #689

@SharmiliRudraraju98

Description

Robot Model

Turtlebot4 Standard

ROS distro

Jazzy

Networking Configuration

Simple Discovery

OS

Ubuntu 24.04

Built from source or installed?

Installed

Package version

ros-jazzy-turtlebot4-navigation: 2.0.1-1noble.20241028.132320
ros-jazzy-turtlebot4-bringup: 2.0.1-2noble.20241028.133441
ros-jazzy-turtlebot4-node: 2.0.1-1noble.20241018.082137

Type of issue

Networking

Expected behaviour

Requirement: Need to run TurtleBot4 SLAM and Nav2 from a laptop
on a university network where multicast is blocked. The robot and
laptop must communicate over the network without multicast.

Current approach: Using Tailscale VPN with FastDDS ROS_STATIC_PEERS
and initialPeersList XML for unicast DDS discovery.

Expected: With this setup, Nav2 should successfully activate all
lifecycle nodes including local_costmap, with the full TF chain
(map→odom→base_link) available to all ROS 2 nodes on the laptop.

Question: Is Tailscale the recommended approach for this type of
network setup? Are there better alternatives that work reliably
with TurtleBot4 Jazzy without breaking Create3 ↔ RPi communication?

Actual behaviour

Most topics work correctly over Tailscale:

  • /scan: ~7.7Hz ✓
  • /odom: ~20Hz ✓
  • /tf: ~27Hz ✓
  • /map: ~1Hz ✓ (slam_toolbox running fine)

However, Nav2 fails to activate every time:

  • The odom→base_link transform (published by Create3) takes 15-20
    seconds to be discovered by new DDS participants on the laptop
  • Nav2's local_costmap waits 60 seconds for the base_link→odom
    transform, times out, and the lifecycle manager aborts bringup
  • Verified: ros2 topic echo /tf | grep "child_frame_id: base_link"
    takes 15-20 seconds before any output appears on laptop
  • Bandwidth loss observed: robot publishes /tf at 12.4 KB/s but
    laptop only receives 8.7 KB/s (~30% loss across Tailscale)
  • odom→base_link buffer_length on laptop only ~0.86 seconds
    (confirmed via ros2 run tf2_tools view_frames)
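
For reference, the ~30% figure follows directly from the two `ros2 topic bw /tf` readings above. A minimal sketch of the arithmetic (the variable names are illustrative, and the two readings were taken on different hosts at different moments, so the result is only approximate):

```python
# Values reported by `ros2 topic bw /tf` on each side (taken from this issue).
sent_kbps = 12.4      # measured on the robot
received_kbps = 8.7   # measured on the laptop

# Fraction of /tf traffic that does not appear to reach the laptop.
loss_fraction = (sent_kbps - received_kbps) / sent_kbps
print(f"apparent loss: {loss_fraction:.1%}")
```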

Error messages

[local_costmap.local_costmap]: Timed out waiting for transform from base_link to odom to become available, tf error: Invalid frame ID "base_link" passed to canTransform argument source_frame - frame does not exist
[lifecycle_manager_navigation]: Failed to change state for node: controller_server
[lifecycle_manager_navigation]: Failed to bring up all requested nodes. Aborting bringup.
[local_costmap.local_costmap]: Failed to activate local_costmap because transform from base_link to odom did not become available before timeout

To Reproduce

  1. Set up TurtleBot4 Standard on a university network where multicast is blocked
  2. Install Tailscale on both robot (RPi4) and laptop
  3. Configure FastDDS with ROS_STATIC_PEERS and initialPeersList XML
    pointing to each device's Tailscale IP
  4. On laptop: source ROS 2, launch slam.launch.py, wait for
    "Registering sensor: [Custom Described Lidar]"
  5. On laptop: launch nav2.launch.py
  6. Observe Nav2 repeatedly timing out on base_link→odom transform
    and aborting after 60 seconds
  7. Verify issue: run ros2 topic echo /tf | grep "child_frame_id: base_link"
    on laptop — output takes 15-20 seconds to appear
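
Step 7 can be made repeatable with a small timing script instead of watching the terminal. This is only a sketch: the helper names `time_to_first_match` and `measure_tf_discovery` are hypothetical, and setting `PYTHONUNBUFFERED` is a precaution I am assuming helps avoid output buffering when `ros2 topic echo` writes to a pipe. The measured time also includes the startup time of the `ros2` CLI itself.

```python
import os
import subprocess
import time
from typing import Callable, Iterable, Optional


def time_to_first_match(
    lines: Iterable[str],
    needle: str,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[float]:
    """Return seconds elapsed until a line containing `needle` is seen, or None."""
    start = clock()
    for line in lines:
        if needle in line:
            return clock() - start
    return None


def measure_tf_discovery(child_frame: str = "base_link") -> Optional[float]:
    """Run `ros2 topic echo /tf` and time the first message for `child_frame`."""
    proc = subprocess.Popen(
        ["ros2", "topic", "echo", "/tf"],
        stdout=subprocess.PIPE,
        text=True,
        # Assumption: unbuffered output so lines arrive as soon as they are printed.
        env={**os.environ, "PYTHONUNBUFFERED": "1"},
    )
    try:
        return time_to_first_match(proc.stdout, f"child_frame_id: {child_frame}")
    finally:
        # Stop the echo process once the first match is found (or on Ctrl-C).
        proc.terminate()
        proc.wait()
```

Calling `print(measure_tf_discovery())` on the laptop in a sourced ROS 2 shell would print the delay in seconds. The call blocks until the frame appears, so interrupt it with Ctrl-C if nothing arrives.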

Other notes

Additional context

What I tried:

  • Started with ROS_DISCOVERY_SERVER — broke Create3 ↔ RPi communication
  • Tried Zenoh bridge (rmw_zenoh_cpp) — also risks breaking Create3 local multicast
  • Settled on Tailscale VPN + FastDDS ROS_STATIC_PEERS + initialPeersList XML
    as it preserves Create3 ↔ RPi local multicast while enabling unicast WAN discovery

Key diagnostics:

  • ros2 topic bw /tf: robot sends 12.4 KB/s, laptop receives only 8.7 KB/s (~30% loss)
  • ros2 run tf2_tools view_frames: odom→base_link buffer_length only ~0.86s on laptop
  • ros2 topic echo /tf | grep "child_frame_id: base_link": takes 15-20s to appear

Questions for maintainers:

  1. Is Tailscale the recommended VPN for TurtleBot4 on restricted networks?
  2. Is there a better FastDDS configuration for WAN/unicast setups?
  3. Can Nav2's local_costmap activation timeout be increased via parameters?
  4. Is there any known fix for slow DDS participant discovery over VPN/WAN?
  5. Are there better approaches for a network where multicast is blocked?

Configuration files attached:

laptop_fastdds_tailscale.xml
robot_fastdds_tailscale.xml

  • Robot /etc/turtlebot4/setup.bash additions:
    export ROS_STATIC_PEERS="<LAPTOP_TAILSCALE_IP>"
    export FASTRTPS_DEFAULT_PROFILES_FILE="/home/ubuntu/fastdds_tailscale.xml"

  • Laptop ~/.bashrc additions:
    export ROS_DOMAIN_ID=0
    export RMW_IMPLEMENTATION=rmw_fastrtps_cpp
    export ROS_STATIC_PEERS="<ROBOT_TAILSCALE_IP>"
    export FASTRTPS_DEFAULT_PROFILES_FILE=~/fastdds_tailscale.xml
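
To rule out simple environment mistakes on either machine (a placeholder left unexpanded, a profile path that does not exist), a quick sanity check along these lines may help. It is a sketch under stated assumptions: `check_static_peer_env` is a hypothetical helper, `ROS_STATIC_PEERS` is assumed to be a semicolon-separated list, and peers are assumed to be literal IP addresses as in this setup (hostnames would be flagged).

```python
import ipaddress
import os
from typing import List, Mapping


def check_static_peer_env(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems found in the DDS-related environment variables."""
    problems = []

    peers = env.get("ROS_STATIC_PEERS", "")
    if not peers:
        problems.append("ROS_STATIC_PEERS is not set")
    # Assumption: entries are separated by ';' and are literal IP addresses.
    for peer in filter(None, (p.strip() for p in peers.split(";"))):
        try:
            ipaddress.ip_address(peer)
        except ValueError:
            problems.append(f"ROS_STATIC_PEERS entry is not an IP address: {peer!r}")

    profile = env.get("FASTRTPS_DEFAULT_PROFILES_FILE", "")
    if not profile:
        problems.append("FASTRTPS_DEFAULT_PROFILES_FILE is not set")
    elif not os.path.isfile(os.path.expanduser(profile)):
        problems.append(f"profile file not found: {profile}")

    return problems


if __name__ == "__main__":
    # Check the environment of the current shell and list anything suspicious.
    for problem in check_static_peer_env(os.environ):
        print(problem)
```

Running it in the same shell that launches the ROS 2 nodes prints nothing when both variables look sane.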

Labels

troubleshooting: System not working as expected, may be user error.
