fix(domain): point _ans-badge DNS record URL at transparency log#23
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds support for publishing _ans-badge DNS TXT records that point to an externally reachable Transparency Log (TL) URL (useful when TL is behind a reverse proxy/CDN).
Changes:
- Extends
domain.ComputeRequiredDNSRecordsto accept atlPublicBaseURLand uses it to build badge URLs. - Wires TL public URL from config → service → handlers and revoke/verify flows.
- Adds config + docs for
tl-client.public-base-urland tests for TL badge URL behavior.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/ra/service/v1event.go | Passes TL public URL into DNS record computation for revoke events |
| internal/ra/service/registration.go | Stores TL public URL on service; adds setter/getter |
| internal/ra/service/lifecycle.go | Uses TL public URL when computing expected DNS records for verify/revoke |
| internal/ra/handler/v1registration.go | Threads TL public URL into v1 pending-block DNS record computation |
| internal/ra/handler/lifecycle.go | Threads TL public URL into v2 detail DTO mapping |
| internal/ra/handler/dto.go | Threads TL public URL into v2 pending-block DNS record computation |
| internal/domain/dnsrecords_test.go | Updates existing tests for new signature; adds TL badge URL tests |
| internal/domain/dnsrecords.go | Implements TL-based badge URL generation |
| internal/config/config.go | Adds public-base-url config and defaults it to base-url |
| config/ra-local.yaml | Documents tl-client.public-base-url option |
| config/ra-docker.yaml | Documents tl-client.public-base-url option |
| cmd/ans-ra/main.go | Wires config TL public URL into service initialization |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| badgeURL := reg.Endpoints[0].AgentURL | ||
| if tlPublicBaseURL != "" && reg.AgentID != "" { | ||
| badgeURL = strings.TrimRight(tlPublicBaseURL, "/") + "/v1/agents/" + reg.AgentID | ||
| } |
| func (s *RegistrationService) WithTLPublicBaseURL(url string) *RegistrationService { | ||
| s.tlPublicBaseURL = url |
| if c.TLClient.PublicBaseURL == "" { | ||
| c.TLClient.PublicBaseURL = c.TLClient.BaseURL | ||
| } |
|
Thanks for the contribution! The changes look good to me. One blocker before merge: this repo requires GPG-signed commits on main, and the two commits on this branch aren't verified yet. Could you re-sign and force-push? The Signed-off-by: trailers you already have are fine — it's just the GPG signature that's missing. |
There was a problem hiding this comment.
Agree with @csnitker-godaddy - thanks for contributing.
Three things worth addressing before merge — inline comments below.
| BaseURL string `koanf:"base-url"` | ||
| // PublicBaseURL is the TL's externally-reachable URL used in | ||
| // _ans-badge DNS TXT records. When empty, defaults to BaseURL. | ||
| PublicBaseURL string `koanf:"public-base-url"` |
There was a problem hiding this comment.
Doc claims defaulting that doesn't exist in code.
The comment above this field says "When empty, defaults to BaseURL," but RAConfig.Validate() (lines 309-352) has no such defaulting. Operators who pull this PR without setting public-base-url in YAML keep emitting _ans-badge records pointing at the agent's own URL — the exact bug this PR is fixing.
Two clean options:
- Required + validated (recommended): in
Validate(), return an error whenPublicBaseURLis empty, and reject anything that isn'thttps://with no userinfo/query/fragment. - Default to
BaseURL: assignc.TLClient.PublicBaseURL = c.TLClient.BaseURLwhen empty, so behavior matches the doc.
Either way, the field doc and the validator need to agree. The PR description's "Falls back to the agent endpoint URL when public-base-url is unset, preserving backwards compatibility" line should also be revised — the prior behavior was a bug, not a stable contract.
| if len(reg.Endpoints) > 0 { | ||
| badgeURL := reg.Endpoints[0].AgentURL | ||
| if tlPublicBaseURL != "" && reg.AgentID != "" { | ||
| if joined, err := url.JoinPath(tlPublicBaseURL, "v1", "agents", reg.AgentID); err == nil { |
There was a problem hiding this comment.
Silent url.JoinPath error swallow.
The err == nil guard silently falls back to reg.Endpoints[0].AgentURL (the buggy URL this PR is fixing). A typo in public-base-url — missing scheme, embedded query string, stray whitespace — is undetectable until external badge verification fails in production.
Pair this with config-load validation: parse PublicBaseURL once in Validate() and reject anything that isn't a clean https://host[/path]. Once Validate() enforces parseability, this JoinPath becomes a guaranteed-success invariant — either drop the error guard, or convert it to a panic so an internal regression is caught loudly rather than papered over by emitting the wrong URL into the immutable TL.
| }).WithDNSVerifier(dnsVerifier). | ||
| WithServerCertificateAuthority(serverCA) | ||
| WithServerCertificateAuthority(serverCA). | ||
| WithTLPublicBaseURL(cfg.TLClient.PublicBaseURL) |
There was a problem hiding this comment.
No startup log of effective PublicBaseURL.
The TL outbox-worker startup log at lines 276-279 emits tlBaseURL (internal) but not PublicBaseURL. At 3am when oncall is debugging "badges point to the wrong place," they cannot tell pre-PR-23 from post-PR-23-but-misconfigured without SSH-ing the box and reading YAML.
Suggested addition near config load:
if cfg.TLClient.PublicBaseURL == "" {
logger.Warn().Msg("tl-client.public-base-url is unset; _ans-badge DNS records will fall back to agent URL")
} else {
logger.Info().
Str("tlPublicBaseURL", cfg.TLClient.PublicBaseURL).
Str("tlBaseURL", cfg.TLClient.BaseURL).
Msg("transparency log endpoints configured")
}Collapses MTTR for badge-verification incidents from 15-30 minutes (config inspection) to a 30-second log scan.
c826fc8 to
4aafe81
Compare
|
@fobispo-tc - looks good here. It looks like the linter is failing. Once we get a clean build. We will get this merged. Thanks for fixing things up so far. |
The _ans-badge TXT record's url= field was set to the agent's own
endpoint URL instead of the Transparency Log badge endpoint.
Add tl-client.public-base-url config (required, validated as https).
When set, badge records point to <public-base-url>/v1/agents/<agentId>.
Falls back to the agent endpoint when unset.
Signed-off-by: Francisco Obispo <fobispo@tucows.com>
Signed-off-by: Francisco Obispo <fobispo@tucows.com>
4aafe81 to
4f70e18
Compare
|
Thanks for the contribution! 🎉 |
Summary
The
_ans-badgeTXT record'surl=field was incorrectly set to the agent's own endpoint URL instead of the Transparency Log badge endpoint.Before (broken):
After (correct):
Problem
Badge-verifying clients (the SDK's
verifypackage) parse the_ans-badgeTXT record and fetch the badge from theurl=field. The SDK documents this field as "the URL to fetch the badge from the transparency log" (ans-sdk-go/verify/badge_record.go:34), but the RA was populating it with the agent's own endpoint URL fromreg.Endpoints[0].AgentURL. This meant verifiers would try to fetch a badge from the agent itself — which doesn't serve badges — instead of from the TL.Fix
tl-client.public-base-urlconfig field to the RA. This is the externally-reachable TL URL (e.g.https://tl.example.org), which may differ from the internaltl-client.base-url(e.g.http://ans-tl:18081in Docker Compose).ComputeRequiredDNSRecordsnow accepts the TL public URL and constructs the badge URL as<public-base-url>/v1/agents/<agentId>.public-base-urlis unset, preserving backwards compatibility for deployments that haven't configured it yet.Changes
internal/config/config.goPublicBaseURLtoTLClientstruct; default toBaseURLinValidate()internal/domain/dnsrecords.gotlPublicBaseURLparam; use it for badge URL when setinternal/domain/dnsrecords_test.gointernal/ra/service/registration.gotlPublicBaseURLfield +WithTLPublicBaseURL()builder + getterinternal/ra/service/lifecycle.gos.tlPublicBaseURLat 3 call sitesinternal/ra/service/v1event.gos.tlPublicBaseURLat 1 call siteinternal/ra/handler/dto.gomapAgentDetailsandbuildRegistrationPendingBlockinternal/ra/handler/lifecycle.gointernal/ra/handler/v1registration.gocmd/ans-ra/main.gocfg.TLClient.PublicBaseURLvia.WithTLPublicBaseURL()config/ra-local.yamlconfig/ra-docker.yamlConfiguration
When
public-base-urlis omitted, it defaults tobase-url.Test plan
go test ./...— 23 packages, 0 failures)TestComputeRequiredDNSRecords_BadgeURLPointsToTL— verifies badge URL uses TL endpointTestComputeRequiredDNSRecords_BadgeFallbackWithoutTLURL— verifies fallback to agent URLpublic-base-urlset and verify new registrations produce correct_ans-badgerecords