fix(mail): boot postfix self-heal used short hostname → self-relay loop could return #27

Merged
nechodom merged 1 commit into main from fix/postfix-boot-selfrelay-fqdn
Jul 2, 2026
@nechodom nechodom commented Jul 2, 2026

Regression found by the Fable-5 audit (mail area)

The boot-time postfix self-heal picked smart-host vs direct-MX with:

host_is_local(&cfg.smtp_host, &agent_hostname)   // agent_hostname = `hostname` (SHORT)

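while the runtime mta_reconfigure path uses hostname -f (the full FQDN). Trace on host s4, with the relay set to the node's own FQDN s4.digitalka.cz: at boot the relay does not match the short name s4, so host_is_local treats it as external and postfix ends up with relayhost=[s4.digitalka.cz], the exact "mail for localhost loops back to myself" self-relay loop that #23 fixed.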

Worse: EmailConfigSet self-restarts the agent, so this buggy boot path re-ran after every mail save, silently undoing the correct decision mta_reconfigure had just made.
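
The mismatch can be reproduced in isolation. The sketch below is not the project's code: `host_is_local_exact`, `choose_mode` and `MtaMode` are hypothetical names, and it assumes the pre-fix check was an exact, case-insensitive comparison against the node's own name plus the usual loopback names.

```rust
/// Hypothetical stand-in for the pre-fix check (an assumption, not the
/// project's actual implementation): exact match against our own name
/// or a loopback name, ignoring case and a trailing dot.
fn host_is_local_exact(relay: &str, own_hostname: &str) -> bool {
    let relay = relay.trim().trim_end_matches('.').to_ascii_lowercase();
    let own = own_hostname.trim().trim_end_matches('.').to_ascii_lowercase();
    matches!(relay.as_str(), "localhost" | "127.0.0.1" | "::1") || relay == own
}

/// Hypothetical model of the two outcomes the boot self-heal chooses between.
#[derive(Debug, PartialEq)]
enum MtaMode {
    /// Deliver directly via MX lookups; no relayhost is set.
    DirectMx,
    /// Relay everything through the given relayhost value.
    SmartHost(String),
}

/// A local relay means "no smart-host"; anything else becomes relayhost=[host].
fn choose_mode(smtp_host: &str, own_hostname: &str) -> MtaMode {
    if host_is_local_exact(smtp_host, own_hostname) {
        MtaMode::DirectMx
    } else {
        MtaMode::SmartHost(format!("[{}]", smtp_host))
    }
}
```

With the short name, `choose_mode("s4.digitalka.cz", "s4")` picks the smart-host arm and points postfix at itself. With the FQDN, `choose_mode("s4.digitalka.cz", "s4.digitalka.cz")` picks direct-MX. The same relay setting therefore yields different decisions depending on which hostname form the caller passes in.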

Fix

  1. Boot resolves the FQDN once (hostname -f, with a short-name fallback) before the smart-host decision and passes it to host_is_local, the same yardstick the runtime path uses. This also removes the now-duplicate hostname -f block from the direct-MX arm.
  2. Defense in depth in host_is_local: it also catches a relay typed as our full FQDN when we only know our short name (degraded hostname -f), via a short-label compare scoped to that case only. A legitimate external relay that merely shares our short label (we are mail.acme.com, the relay is mail.sendgrid.net) is therefore still treated as a smart-host once we know our real FQDN.
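
The two points above can be sketched roughly as follows. This is an illustration under assumptions, not the merged code: `run_hostname`, `resolve_own_hostname` and `normalize` are hypothetical helpers, and the real `host_is_local` signature and loopback list may differ.

```rust
use std::process::Command;

/// Run `hostname` with the given args and return its trimmed, non-empty stdout.
/// (Hypothetical helper; returns None if the command is missing or fails.)
fn run_hostname(args: &[&str]) -> Option<String> {
    let out = Command::new("hostname").args(args).output().ok()?;
    if !out.status.success() {
        return None;
    }
    let name = String::from_utf8_lossy(&out.stdout).trim().to_string();
    if name.is_empty() { None } else { Some(name) }
}

/// Resolve our own name once at boot: prefer `hostname -f`,
/// fall back to the short `hostname` if the FQDN lookup fails.
fn resolve_own_hostname() -> Option<String> {
    run_hostname(&["-f"]).or_else(|| run_hostname(&[]))
}

/// Lowercase and drop surrounding whitespace and a trailing root dot.
fn normalize(host: &str) -> String {
    host.trim().trim_end_matches('.').to_ascii_lowercase()
}

/// Sketch of the hardened check (assumed shape, not the project's code).
fn host_is_local(relay: &str, own_hostname: &str) -> bool {
    let relay = normalize(relay);
    let own = normalize(own_hostname);

    // Loopback names are always local.
    if matches!(relay.as_str(), "localhost" | "127.0.0.1" | "::1") {
        return true;
    }
    if own.is_empty() {
        return false;
    }
    // Normal case: exact match against whatever name we know.
    if relay == own {
        return true;
    }
    // Degraded case only: we know just a short name (no dot), so compare
    // it against the relay's first label. Skipped once we hold a real FQDN.
    if !own.contains('.') {
        if let Some((first_label, _domain)) = relay.split_once('.') {
            return first_label == own;
        }
    }
    false
}
```

In this sketch the short-label branch only runs when the known name has no dot, so `host_is_local("mail.sendgrid.net", "mail.acme.com")` stays false, while `host_is_local("s4.digitalka.cz", "s4")` is true. The cost is that a degraded node knowing only `mail` would also treat `mail.sendgrid.net` as local. That errs toward direct-MX rather than a self-relay loop.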

Test

clippy -D warnings clean; host_is_local self/loopback + new degraded-short-fqdn unit tests green. This was flagged by the audit but its verify stage hit a session limit, so I traced + confirmed it against the code by hand before fixing.

🤖 Generated with Claude Code

…op could return

The boot-time postfix self-heal decided smart-host vs direct-MX by calling
host_is_local(smtp_host, agent_hostname) with the SHORT hostname (`hostname`),
while the runtime mta_reconfigure path correctly uses `hostname -f`. So a relay
set to the node's OWN fqdn (e.g. `s4.example.com` on host `s4`) slipped past
host_is_local at boot into relayhost=[s4.example.com] — the exact
"mail for localhost loops back to myself" self-relay loop #23 fixed. And since
EmailConfigSet self-restarts the agent, that buggy boot path re-ran after every
mail save, silently undoing the correct decision mta_reconfigure had just made.

Fix:
- Boot resolves the FQDN once (via `hostname -f`, short-name fallback) BEFORE
  the smart-host decision and passes it to host_is_local — matching the runtime
  path. Also removes the now-duplicate `hostname -f` block in the direct-MX arm.
- Defense in depth: host_is_local also catches a relay typed as our full fqdn
  when we only know our SHORT name (degraded `hostname -f`), via a short-label
  compare scoped to that case only — a legit external relay sharing our short
  label (our mail.acme.com vs relay mail.sendgrid.net) is still a smart-host
  once we know our real fqdn.

clippy -D warnings clean; host_is_local self/loopback + degraded-short-fqdn
unit tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@nechodom nechodom merged commit c563fd8 into main Jul 2, 2026
1 check passed
@nechodom nechodom deleted the fix/postfix-boot-selfrelay-fqdn branch July 2, 2026 16:56