Enhance warmup service with lifetime limit and error handling by Dimotai · Pull Request #58 · EliteScouter/EliteEssentials

Dimotai · 2026-03-29T17:06:55Z

Fix: WarmupService poller silently dies from uncaught exception, permanently breaking all teleports

Summary

A single uncaught exception in WarmupService.pollWarmups() permanently kills the warmup poller thread, causing all warmup-based teleports (/home, /warp, /tpa, /spawn, /back, /rtp) to silently stop working for every player on the server until restart. Admin teleports with 0 warmup are unaffected, masking the issue.

Bug Description

WarmupService uses ScheduledExecutorService.scheduleAtFixedRate() to poll pending warmups every 100ms. The Javadoc for scheduleAtFixedRate() documents the failure mode: if any execution of the scheduled Runnable throws an uncaught exception, all subsequent executions are suppressed. The executor itself logs nothing; the exception is only captured in the returned ScheduledFuture, and the task simply stops running.
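The suppression behavior can be seen in isolation with a small standalone sketch. It does not use any EliteEssentials code: the class name, the 10ms period, and the simulated failure on the third run are all assumptions made for illustration.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class SilentPollerDeathDemo {

    /** Schedules a fixed-rate task that throws on its third run and returns how many times it ran. */
    static int runsBeforeSuppression() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        try {
            scheduler.scheduleAtFixedRate(() -> {
                if (runs.incrementAndGet() == 3) {
                    // Stand-in for world.execute() failing inside pollWarmups().
                    throw new IllegalStateException("simulated failure");
                }
            }, 0, 10, TimeUnit.MILLISECONDS);
            // Long enough for dozens of executions if the task were still alive.
            Thread.sleep(500);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
        return runs.get();
    }

    public static void main(String[] args) {
        // Nothing is printed by the executor when the task dies; only this line appears.
        System.out.println("runs before the executor gave up: " + runsBeforeSuppression());
    }
}
```

The counter stops at 3 even though the scheduler stays alive for half a second, and no stack trace reaches the console.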

pollWarmups() currently has no try/catch protection. If world.execute() throws for any warmup entry (e.g., the world was destroyed between the null check and the execute call during dungeon instance teardown), the exception propagates out of pollWarmups(), and the executor kills the task. The warmup system is now dead.

To make it worse, ensurePollerRunning() only checks pollTask.isCancelled() to decide whether to restart the poller. When a scheduled task dies from an exception, isCancelled() returns false — it wasn't cancelled, it crashed. isDone() would return true, but nobody checks it. So even when new players trigger startWarmup(), the method thinks the poller is still running and doesn't restart it.
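A minimal sketch of the two Future states after a crash, using only java.util.concurrent. The class and method names are hypothetical, and the task here throws on its very first run to keep the example short.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class CrashedTaskStateDemo {

    /** Returns {isCancelled, isDone} for a fixed-rate task that threw on its first run. */
    static boolean[] stateAfterCrash() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            ScheduledFuture<?> pollTask = scheduler.scheduleAtFixedRate(() -> {
                throw new IllegalStateException("simulated failure");
            }, 0, 100, TimeUnit.MILLISECONDS);
            try {
                // For a periodic task, get() only returns once the task has terminated.
                pollTask.get(5, TimeUnit.SECONDS);
            } catch (ExecutionException crashed) {
                // Expected: the original exception is available via crashed.getCause().
            } catch (InterruptedException | TimeoutException e) {
                throw new IllegalStateException("task did not terminate as expected", e);
            }
            return new boolean[] { pollTask.isCancelled(), pollTask.isDone() };
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        boolean[] state = stateAfterCrash();
        System.out.println("isCancelled=" + state[0] + ", isDone=" + state[1]);
    }
}
```

This prints `isCancelled=false, isDone=true`, which is why a guard that looks only at isCancelled() treats a crashed poller as healthy.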

The result: a single exception permanently bricks the entire teleport warmup system for all players until server restart.

What Players Experience

Players with warmup > 0: They see the warmup countdown message ("Teleporting in 5 seconds... stand still!") because startWarmup() still adds entries to the pending map. But tickWarmup() never runs because the poller is dead, so the teleport never completes. The entry stays in pending forever.
Subsequent attempts: Commands that check hasActiveWarmup() before starting (/home, /warp, /back, /tpa, /rtp) reject the player with "You already have a teleport in progress!" because the stale entry from the first attempt is still in the map.
/spawn: Doesn't check hasActiveWarmup(), so it replaces the pending entry each time — but the new warmup also never completes because the poller is still dead.
Admins with warmup bypass (warmup = 0): Unaffected. Zero-warmup teleports execute immediately in startWarmup() via world.execute(onComplete) without going through the poller at all. This makes the bug appear player-specific when it's actually global.
Restarting the game client: Does not help. The bug is server-side — the poller thread is dead and the pending map state is on the server.
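The player-facing symptoms follow directly from a pending map that nothing drains. The sketch below is a toy model, not the real WarmupService: the class, the two teleport methods, and the map's value type are hypothetical, and only the message strings and the hasActiveWarmup() name come from the description above.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

final class StalePendingDemo {
    // Hypothetical stand-in for the pending map; the value is the warmup length in seconds.
    private final Map<UUID, Integer> pending = new ConcurrentHashMap<>();

    boolean hasActiveWarmup(UUID player) {
        return pending.containsKey(player);
    }

    /** Models commands such as /home or /warp that refuse to start a second warmup. */
    String guardedTeleport(UUID player, int warmupSeconds) {
        if (hasActiveWarmup(player)) {
            return "You already have a teleport in progress!";
        }
        pending.put(player, warmupSeconds);
        return "Teleporting in " + warmupSeconds + " seconds... stand still!";
    }

    /** Models /spawn, which replaces any existing entry without checking. */
    String unguardedTeleport(UUID player, int warmupSeconds) {
        pending.put(player, warmupSeconds);
        return "Teleporting in " + warmupSeconds + " seconds... stand still!";
    }

    public static void main(String[] args) {
        StalePendingDemo service = new StalePendingDemo();
        UUID player = UUID.randomUUID();
        // No poller exists in this model, so entries are never completed or removed.
        System.out.println(service.guardedTeleport(player, 5));
        System.out.println(service.guardedTeleport(player, 5));
        System.out.println(service.unguardedTeleport(player, 5));
    }
}
```

With no poller draining the map, the first guarded call shows the countdown message, the second is rejected, and the unguarded call shows the countdown again without ever completing.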

Evidence From Production

This bug was observed on a 50+ player Hytale server (build 2026.03.26) running EliteEssentials 2.0.1. The server had heavy dungeon instance churn (instances being created and destroyed every 30-60 seconds) with frequent cross-world player transfers.

At approximately 16:25 UTC, all player teleports stopped working simultaneously. Server logs show:

World.consumeTaskQueue throwing IllegalStateException: Window id 1 is invalid! — the world task queue was actively failing tasks during this period
Frequent cross-world thread mismatches during dungeon instance transfers — world.execute() calls racing with world destruction
Multiple players reporting the issue simultaneously — confirming it's a global system failure, not per-player
Admin with instant-tp perms could teleport fine — confirming the warmup path specifically is broken
Player reports matching exactly: "it says 'teleporting to spawn in 5 seconds... stand still!' and thats it" (warmup starts, never completes) and "i cant teleport to spawn lol i have you already have a teleport in proGress" (stale pending entry blocking)

No [Warmup] error was logged because the exception was swallowed silently by ScheduledExecutorService.

The Fix

Three changes, all in WarmupService.java:

1. `pollWarmups()` — Two layers of try/catch

An inner per-warmup try/catch ensures one bad warmup (e.g., a destroyed world) can't kill processing for other players. The failed entry is removed and logged. An outer try/catch acts as an absolute last resort so the poller can never die.
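The two layers can be sketched roughly as follows. This is not the code from the PR: the Warmup interface, the in-memory log list, and the choice of exception types caught at each layer are assumptions made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

final class GuardedPollSketch {
    /** Hypothetical per-player step; returns true once the warmup has finished. */
    interface Warmup {
        boolean tick();
    }

    final Map<UUID, Warmup> pending = new ConcurrentHashMap<>();
    final List<String> log = new ArrayList<>(); // stand-in for the plugin logger

    void pollWarmups() {
        try {
            Iterator<Map.Entry<UUID, Warmup>> it = pending.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<UUID, Warmup> entry = it.next();
                try {
                    if (entry.getValue().tick()) {
                        it.remove(); // finished normally
                    }
                } catch (RuntimeException e) {
                    // Inner layer: drop only the failing entry and keep processing the rest.
                    it.remove();
                    log.add("[Warmup] removed failed warmup for " + entry.getKey() + ": " + e);
                }
            }
        } catch (Throwable t) {
            // Outer layer: nothing may escape, or the executor suppresses all future runs.
            log.add("[Warmup] unexpected poller error: " + t);
        }
    }

    public static void main(String[] args) {
        GuardedPollSketch service = new GuardedPollSketch();
        service.pending.put(UUID.randomUUID(), () -> {
            throw new IllegalStateException("world destroyed");
        });
        service.pending.put(UUID.randomUUID(), () -> false); // still counting down
        service.pollWarmups();
        System.out.println(service.pending.size() + " warmup(s) still pending; log: " + service.log);
    }
}
```

Catching Throwable in the outer layer is a deliberate trade-off in this sketch: it also swallows Errors, which is usually discouraged, but it matches the stated goal that the poller can never die.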

2. `ensurePollerRunning()` — Check `isDone()` in addition to `isCancelled()`

// Before (broken):
if (pollTask != null && !pollTask.isCancelled()) { return; }

// After (fixed):
if (pollTask != null && !pollTask.isCancelled() && !pollTask.isDone()) { return; }

This is a defense-in-depth measure. With the try/catch fix, the poller should never die. But if it somehow does, the next startWarmup() call will now correctly detect the dead task and restart it.
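A self-contained sketch of the restart path, assuming a single-threaded scheduler and a 100ms period. The class, the starts counter, and the helper methods are hypothetical; only the guard condition is taken from the snippet above.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class PollerRestartSketch {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private ScheduledFuture<?> pollTask;
    private int starts; // how many times the poller was (re)scheduled

    synchronized void ensurePollerRunning(Runnable poll) {
        if (pollTask != null && !pollTask.isCancelled() && !pollTask.isDone()) {
            return; // poller is alive
        }
        pollTask = scheduler.scheduleAtFixedRate(poll, 0, 100, TimeUnit.MILLISECONDS);
        starts++;
    }

    /** Blocks until the current poll task has terminated with an exception. */
    private void awaitCrash() {
        try {
            pollTask.get(5, TimeUnit.SECONDS);
        } catch (ExecutionException crashed) {
            // Expected: the task threw and will not run again.
        } catch (InterruptedException | TimeoutException e) {
            throw new IllegalStateException("poll task did not crash as expected", e);
        }
    }

    static int startsAfterCrashAndRecovery() {
        PollerRestartSketch service = new PollerRestartSketch();
        try {
            service.ensurePollerRunning(() -> {
                throw new IllegalStateException("simulated failure");
            });
            service.awaitCrash();
            service.ensurePollerRunning(() -> { }); // detects the dead task and restarts it
            service.ensurePollerRunning(() -> { }); // healthy poller: no-op
            return service.starts;
        } finally {
            service.scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("poller was scheduled " + startsAfterCrashAndRecovery() + " times");
    }
}
```

Future.isDone() also returns true for cancelled tasks, so the isCancelled() check is technically redundant once isDone() is present; keeping both is harmless.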

3. Stale warmup cleanup — 60-second maximum lifetime

A MAX_WARMUP_LIFETIME_NANOS constant (60 seconds) is added. During each poll cycle, any warmup entry older than 60 seconds is force-removed with a warning log. No legitimate warmup should ever take 60 seconds. This catches edge cases where world.execute() silently drops a task without throwing, leaving the entry stuck in pending forever.

A createdAtNanos field is added to PendingWarmup to support this check.
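The lifetime check might look roughly like the sketch below. MAX_WARMUP_LIFETIME_NANOS, PendingWarmup, and createdAtNanos are named in this PR; the removeStale() method, its injected nowNanos parameter, and the strict greater-than comparison are assumptions for illustration.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

final class StaleWarmupSweepSketch {
    static final long MAX_WARMUP_LIFETIME_NANOS = TimeUnit.SECONDS.toNanos(60);

    static final class PendingWarmup {
        final long createdAtNanos;

        PendingWarmup(long createdAtNanos) {
            this.createdAtNanos = createdAtNanos;
        }
    }

    final Map<UUID, PendingWarmup> pending = new ConcurrentHashMap<>();

    /** Removes entries older than the maximum lifetime and returns how many were dropped. */
    int removeStale(long nowNanos) {
        int removed = 0;
        Iterator<PendingWarmup> it = pending.values().iterator();
        while (it.hasNext()) {
            PendingWarmup warmup = it.next();
            // Compare elapsed time, not raw values: System.nanoTime() readings are only
            // meaningful as differences.
            if (nowNanos - warmup.createdAtNanos > MAX_WARMUP_LIFETIME_NANOS) {
                it.remove();
                removed++; // the real service would also log a warning here
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        StaleWarmupSweepSketch service = new StaleWarmupSweepSketch();
        long start = System.nanoTime();
        service.pending.put(UUID.randomUUID(), new PendingWarmup(start));
        service.pending.put(UUID.randomUUID(), new PendingWarmup(start + TimeUnit.SECONDS.toNanos(50)));
        // Pretend 61 seconds have passed since the first entry was created.
        int removed = service.removeStale(start + TimeUnit.SECONDS.toNanos(61));
        System.out.println("removed " + removed + ", still pending " + service.pending.size());
    }
}
```

Passing the current time in as a parameter keeps the sweep testable without sleeping; in the poller it would be called with System.nanoTime().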

Impact

Risk: Very low. The fix only adds defensive error handling around existing code paths. No behavioral changes for the happy path.
Backwards compatible: No config changes, no API changes, no message changes.
Files changed: WarmupService.java only.

How to Reproduce

The bug requires world.execute() to throw inside pollWarmups(). The most reliable trigger is heavy dungeon instance churn with cross-world player transfers — a world being destroyed between the world == null check and the world.execute() call. On a busy server with 40+ players running instances, this is a matter of when, not if.

Added a maximum warmup lifetime constant and improved error handling in the polling mechanism to prevent task cancellation due to uncaught exceptions.

Enhance warmup service with lifetime limit and error handling

8673f63


EliteScouter merged commit aa07904 into EliteScouter:main Mar 30, 2026

EliteScouter merged 1 commit into EliteScouter:main from Dimotai:patch-1
