Skip to content

Aa Processor fixes#225

Merged
jacomago merged 5 commits into
ChannelFinder:masterfrom
jacomago:aa-fixes
Jun 2, 2026
Merged

Aa Processor fixes#225
jacomago merged 5 commits into
ChannelFinder:masterfrom
jacomago:aa-fixes

Conversation

@jacomago
Copy link
Copy Markdown
Contributor

  • Add a connection timeout
  • Skip archivers that are not available
  • log a warning when pvs are not resumed

…urious ARCHIVE

Previously, a failed HTTP status batch silently returned an empty list.
Those PVs were then treated as "not archived" and submitted for archiving
on every subsequent run, spamming the archiver with duplicate requests.

ArchiverService now throws ArchiverServiceException on status fetch failure.
AAChannelProcessor catches it in getArchiveActions() and returns null,
causing submitToArchivers() to skip that archiver entirely for the run.

Also renames getStatusesFromPvListQuery/Body → getStatusesViaGet/Post
to reflect their transport semantics

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@anderslindho
Copy link
Copy Markdown
Contributor

Can you please fix the 3 new sonar issues?

georgweiss
georgweiss previously approved these changes May 29, 2026
…fail fast

Without a connect timeout, TCP SYN drops (firewall, no route) stall
for the OS default (~2 min) before returning. Setting connect timeout
equal to the configured read timeout ensures both slow and unreachable
archivers fail at the same bounded deadline.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jacomago
Copy link
Copy Markdown
Contributor Author

Can you please fix the 3 new sonar issues?

done

Copy link
Copy Markdown
Contributor

@anderslindho anderslindho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only thing I didn't like was the log of failed fetch N channels. But claude found more issues that I want you to look into:

  1. Null guard returns wrong Optional (AAChannelProcessor:271) — The archiverInfo == null check inside getArchiveActions returns Optional.of(Map.of()) (non-empty) instead of Optional.empty(). Currently
    dead code due to the outer guard in submitToArchivers, but if that guard is removed, the caller proceeds to archiverInfo.url() and NPEs silently instead of triggering the skip/warn path.
  2. Connect-timeout has no test (ArchiverServiceTest:374) — The new test accepts the TCP connection before hanging, so it exercises setReadTimeout, not setConnectTimeout. Deleting the new
    setConnectTimeout(...) line from production would leave all tests green.
  3. Double WARNING per failure (AAChannelProcessor:302) — A single ArchiverServiceException produces two separate Level.WARNING log entries (the getArchiveActions catch + warnSkippedActivePvs). Alerting
    systems counting by severity will double-fire.
  4. Misleading "RESUME was not sent" for non-paused PVs (AAChannelProcessor:292) — The warning includes all Active PVs, not just the ones that were paused; first-time-archive PVs get a misleading
    "remain paused" message.
  5. Deprecated setConnectTimeout(int) (ArchiverService:67) — PR adds a second call using the deprecated int form instead of fixing both to Duration.ofSeconds(timeoutSeconds).

jacomago and others added 2 commits June 1, 2026 13:06
Makes clearer the response, and avoids jumping into the empty map
…outs

Point the connect-timeout test at TEST-NET-1 (192.0.2.1) so the SYN is
dropped and the connect timeout actually fires, rather than a closed local
port that returns connection-refused instantly. Give the connect and read
timeouts distinct values so each test's elapsed time proves which timeout
fired.

Add aa.connect_timeout_seconds (default 5s) for fail-fast on unreachable
archivers; aa.timeout_seconds remains the read timeout, unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jacomago
Copy link
Copy Markdown
Contributor Author

jacomago commented Jun 1, 2026

Only thing I didn't like was the log of failed fetch N channels. But claude found more issues that I want you to look into:

1. Null guard returns wrong Optional (AAChannelProcessor:271) — The archiverInfo == null check inside getArchiveActions returns Optional.of(Map.of()) (non-empty) instead of Optional.empty(). Currently
   dead code due to the outer guard in submitToArchivers, but if that guard is removed, the caller proceeds to archiverInfo.url() and NPEs silently instead of triggering the skip/warn path.

Fixed

2. Connect-timeout has no test (ArchiverServiceTest:374) — The new test accepts the TCP connection before hanging, so it exercises setReadTimeout, not setConnectTimeout. Deleting the new
   setConnectTimeout(...) line from production would leave all tests green.

Added a test and made connection timeout separate, but not sure about the test since it uses a test IP which looks flaky. Will see if passes in CI.

3. Double WARNING per failure (AAChannelProcessor:302) — A single ArchiverServiceException produces two separate Level.WARNING log entries (the getArchiveActions catch + warnSkippedActivePvs). Alerting
   systems counting by severity will double-fire.

Dropped that commit.

4. Misleading "RESUME was not sent" for non-paused PVs (AAChannelProcessor:292) — The warning includes all Active PVs, not just the ones that were paused; first-time-archive PVs get a misleading
   "remain paused" message.

In commit that's dropped.

5. Deprecated setConnectTimeout(int) (ArchiverService:67) — PR adds a second call using the deprecated int form instead of fixing both to Duration.ofSeconds(timeoutSeconds).

I don't know why it thinks its deprecated: https://docs.spring.io/spring-framework/docs/current/javadoc-api/org/springframework/http/client/SimpleClientHttpRequestFactory.html

@jacomago jacomago requested a review from anderslindho June 1, 2026 11:42
@anderslindho
Copy link
Copy Markdown
Contributor

Please fix introduced sonar issue 🙃

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented Jun 2, 2026

@jacomago jacomago merged commit efc3fec into ChannelFinder:master Jun 2, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants