Ephemeral container expiration improvements #782

Open
pditommaso wants to merge 17 commits into master from extend-container-expire

@pditommaso
Collaborator

@pditommaso pditommaso commented Dec 26, 2024

Summary

Ephemeral containers remain accessible for a fixed period of time (a few hours) after they are requested.

However, some use cases (e.g. long workflow runs) need access to such containers for a longer period.

This PR extends the current logic so that, when a container is requested by a workflow run submitted via Platform, the container's expiration time keeps being extended for as long as the corresponding workflow execution is running.

  • Adds automatic extension of ephemeral container token TTL for long-running workflows. A background watcher periodically checks if a container request's associated workflow is still active on Tower, and if so, extends the token's expiration up to a configurable maximum duration.
  • Introduces ContainerRequestConfig to centralize token cache and watcher configuration, ContainerRequestRange as a sorted range store for scheduling refresh checks, and Workflow/DescribeWorkflowResponse models for Tower API integration.
  • Adds expiration field to WaveContainerRecord and displays it in the container view.
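
To make the "sorted range store" idea concrete, here is a minimal in-memory sketch of scheduling refresh checks by time. It is not the PR's `ContainerRequestRange`: the class name `RefreshSchedule`, its methods, and the use of a `TreeMap` are all assumptions made for illustration, and the real store may be backed by something else entirely.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a sorted range store (not Wave's ContainerRequestRange).
class RefreshSchedule {
    // Check time -> ids of the requests due at that instant, kept sorted by key.
    private final TreeMap<Instant, List<String>> byTime = new TreeMap<>();

    // Register a request to be checked at the given instant.
    void add(String requestId, Instant checkAt) {
        byTime.computeIfAbsent(checkAt, k -> new ArrayList<>()).add(requestId);
    }

    // Remove and return every request whose check time is at or before `now`.
    List<String> pollDue(Instant now) {
        Map<Instant, List<String>> due = byTime.headMap(now, true);
        List<String> result = new ArrayList<>();
        for (List<String> ids : due.values()) {
            result.addAll(ids);
        }
        due.clear(); // clearing the view also removes the entries from the backing map
        return result;
    }

    // Number of requests still waiting for a check.
    int size() {
        return byTime.values().stream().mapToInt(List::size).sum();
    }
}
```

A watcher built on such a store would call `pollDue(Instant.now())` on every tick and re-`add` each request it decides to keep alive, which is roughly the "Scheduling container request" pattern one would expect to see in the logs.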

Test plan

  • ContainerRequestRangeTest — verifies entries are stored and retrieved by time range
  • ContainerRequestServiceImplTest — covers eviction logic
  • DescribeWorkflowResponseTest — verifies serialization
  • TowerClientTest — verifies workflowDescribeEndpoint URL construction
  • ClientCacheTest — verifies new DescribeWorkflowResponse/Workflow subtypes in polymorphic factory
  • Verify watcher extends token TTL when workflow is RUNNING/SUBMITTED
  • Verify watcher stops extending when workflow completes or max duration is reached

@munishchouhan
Member

@pditommaso Can we move forward with this PR, or is it blocked by some other changes?

# Conflicts:
#	src/main/groovy/io/seqera/wave/configuration/SsrfConfig.groovy
#	src/main/groovy/io/seqera/wave/encoder/DateTimeAdapter.groovy
#	src/main/groovy/io/seqera/wave/service/license/LicenseManValidator.groovy
#	src/main/groovy/io/seqera/wave/service/request/ContainerRequestStoreImpl.groovy
#	src/main/groovy/io/seqera/wave/tower/client/DescribeWorkflowResponse.groovy
@munishchouhan munishchouhan self-assigned this Apr 7, 2026
munishchouhan and others added 4 commits April 7, 2026 17:19
Signed-off-by: munishchouhan <hrma017@gmail.com>
Signed-off-by: munishchouhan <hrma017@gmail.com>
Signed-off-by: munishchouhan <hrma017@gmail.com>
@munishchouhan
Member

relevance:

Without this feature: Container tokens have a fixed TTL (36h on master). If a workflow runs longer than that, the token expires and the container becomes inaccessible, so the workflow fails or can't pull its container.

With this feature: A background watcher checks if the workflow is still running on Tower. If it is, the token TTL gets extended automatically (up to a max of 2 days). Once the workflow finishes, the token expires normally.

It's essentially a keep-alive mechanism — tokens stay valid as long as the workflow needs them, but don't linger forever.
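
The keep-alive decision described above could be sketched as follows. This is an illustration under stated assumptions, not the PR's code: the class `TokenKeepAlive`, the method `nextExpiration`, its parameters, and in particular the rule that the cap is measured from the request's creation time are all hypothetical. Only the two "active" states, RUNNING and SUBMITTED, come from the test plan.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Set;

// Hypothetical sketch of the keep-alive decision; names and cap semantics are assumptions.
class TokenKeepAlive {
    // Workflow states treated as "still active" (the two named in the PR test plan).
    static final Set<String> ACTIVE = Set.of("RUNNING", "SUBMITTED");

    // Returns the expiration to store after one watcher check.
    static Instant nextExpiration(String workflowStatus, Instant createdAt, Instant expiration,
                                  Duration extension, Duration maxDuration) {
        // Unknown or finished workflow: leave the token to expire normally.
        if (workflowStatus == null || !ACTIVE.contains(workflowStatus)) {
            return expiration;
        }
        // Assumption: the token may never live past createdAt + maxDuration.
        Instant hardLimit = createdAt.plus(maxDuration);
        if (!expiration.isBefore(hardLimit)) {
            return expiration; // already at or beyond the cap, nothing to extend
        }
        Instant extended = expiration.plus(extension);
        return extended.isAfter(hardLimit) ? hardLimit : extended;
    }
}
```

Returning the unchanged expiration for a finished workflow is what keeps tokens from lingering: the watcher simply stops pushing the deadline forward.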

Signed-off-by: munishchouhan <hrma017@gmail.com>
Signed-off-by: munishchouhan <hrma017@gmail.com>
Signed-off-by: munishchouhan <hrma017@gmail.com>
@munishchouhan
Member

Tested locally:
config

  tokens:
    cache:
      duration: '10s'
      check-interval: '5s'
      max-duration: '60s'
    watcher:
      interval: '5s'
      delay: '2s'

workflow
Screenshot 2026-04-08 at 16 11 49

wave logs

16:08:52.547 [scheduled-executor-thread-17] DEBUG i.s.w.s.r.ContainerRequestServiceImpl - Container request '07e49374f8ba' does not require refresh - deadline=2026-04-08T14:08:56.056349Z; expiration=2026-04-08T14:09:01.056349Z >> wt=
16:08:52.547 [scheduled-executor-thread-17] TRACE i.s.w.s.r.ContainerRequestServiceImpl - Scheduling container request io.seqera.wave.service.request.ContainerRequestRange$Entry(07e49374f8ba, 1FMdRb3swdoqTh, 2026-04-08T14:09:01.056349Z) - event ts=2026-04-08T14:08:57.547582Z >> wt=
16:08:57.549 [scheduled-executor-thread-16] INFO  i.s.w.s.r.ContainerRequestServiceImpl - Container request '07e49374f8ba' expiration is extended by: PT13.507134S; at: 2026-04-08T14:09:11.056349Z; (was: 2026-04-08T14:09:01.056349Z) >> wt=
16:08:57.550 [scheduled-executor-thread-16] TRACE i.s.w.s.r.ContainerRequestServiceImpl - Scheduling container request io.seqera.wave.service.request.ContainerRequestRange$Entry(07e49374f8ba, 1FMdRb3swdoqTh, 2026-04-08T14:09:11.056349Z) - event ts=2026-04-08T14:09:02.550394Z >> wt=
16:09:02.549 [scheduled-executor-thread-20] DEBUG i.s.w.s.r.ContainerRequestServiceImpl - Container request '07e49374f8ba' does not require refresh - deadline=2026-04-08T14:09:06.056349Z; expiration=2026-04-08T14:09:11.056349Z >> wt=
16:09:02.549 [scheduled-executor-thread-20] TRACE i.s.w.s.r.ContainerRequestServiceImpl - Scheduling container request io.seqera.wave.service.request.ContainerRequestRange$Entry(07e49374f8ba, 1FMdRb3swdoqTh, 2026-04-08T14:09:11.056349Z) - event ts=2026-04-08T14:09:07.549954Z >> wt=
16:09:07.545 [scheduled-executor-thread-19] INFO  i.s.w.s.r.ContainerRequestServiceImpl - Container request '07e49374f8ba' expiration is extended by: PT13.511018S; at: 2026-04-08T14:09:21.056349Z; (was: 2026-04-08T14:09:11.056349Z) >> wt=
16:09:07.546 [scheduled-executor-thread-19] TRACE i.s.w.s.r.ContainerRequestServiceImpl - Scheduling container request io.seqera.wave.service.request.ContainerRequestRange$Entry(07e49374f8ba, 1FMdRb3swdoqTh, 2026-04-08T14:09:21.056349Z) - event ts=2026-04-08T14:09:12.546107Z >> wt=
16:09:12.550 [scheduled-executor-thread-4] DEBUG i.s.w.s.r.ContainerRequestServiceImpl - Container request '07e49374f8ba' does not require refresh - deadline=2026-04-08T14:09:16.056349Z; expiration=2026-04-08T14:09:21.056349Z >> wt=
16:09:12.550 [scheduled-executor-thread-4] TRACE i.s.w.s.r.ContainerRequestServiceImpl - Scheduling container request io.seqera.wave.service.request.ContainerRequestRange$Entry(07e49374f8ba, 1FMdRb3swdoqTh, 2026-04-08T14:09:21.056349Z) - event ts=2026-04-08T14:09:17.550524Z >> wt=
16:09:17.550 [scheduled-executor-thread-5] INFO  i.s.w.s.r.ContainerRequestServiceImpl - Container request '07e49374f8ba' expiration is extended by: PT13.508901S; at: 2026-04-08T14:09:31.056349Z; (was: 2026-04-08T14:09:21.056349Z) >> wt=
16:09:17.550 [scheduled-executor-thread-5] TRACE i.s.w.s.r.ContainerRequestServiceImpl - Scheduling container request io.seqera.wave.service.request.ContainerRequestRange$Entry(07e49374f8ba, 1FMdRb3swdoqTh, 2026-04-08T14:09:31.056349Z) - event ts=2026-04-08T14:09:22.550888Z >> wt=
16:09:22.550 [scheduled-executor-thread-12] DEBUG i.s.w.s.r.ContainerRequestServiceImpl - Container request '07e49374f8ba' does not require refresh - deadline=2026-04-08T14:09:26.056349Z; expiration=2026-04-08T14:09:31.056349Z >> wt=
16:09:22.550 [scheduled-executor-thread-12] TRACE i.s.w.s.r.ContainerRequestServiceImpl - Scheduling container request io.seqera.wave.service.request.ContainerRequestRange$Entry(07e49374f8ba, 1FMdRb3swdoqTh, 2026-04-08T14:09:31.056349Z) - event ts=2026-04-08T14:09:27.550450Z >> wt=
16:09:27.552 [scheduled-executor-thread-7] INFO  i.s.w.s.r.ContainerRequestServiceImpl - Container request '07e49374f8ba' reached max allowed duration - expiration=2026-04-08T14:09:31.056349Z; new expiration=2026-04-08T14:09:41.056349Z; workflow=1FMdRb3swdoqTh >> wt=
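
The timestamps in the first extension above can be cross-checked with a few lines of `java.time` arithmetic. The values are copied from the log; the class and method names are made up for this check, and the interpretation is only an inference from the numbers: the refresh deadline appears to sit 5 seconds before the expiration, each extension appears to move the expiration forward by 10 seconds, and "extended by" appears to be the new expiration minus the time of the check.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper that re-derives figures from the log excerpt; not part of the PR.
class LogArithmetic {
    static final Instant DEADLINE = Instant.parse("2026-04-08T14:08:56.056349Z");
    static final Instant OLD_EXPIRATION = Instant.parse("2026-04-08T14:09:01.056349Z");
    static final Instant NEW_EXPIRATION = Instant.parse("2026-04-08T14:09:11.056349Z");
    static final Duration EXTENDED_BY = Duration.parse("PT13.507134S");

    // Gap between the refresh deadline and the expiration it guards.
    static Duration leadTime() {
        return Duration.between(DEADLINE, OLD_EXPIRATION);
    }

    // How far one extension moved the expiration.
    static Duration increment() {
        return Duration.between(OLD_EXPIRATION, NEW_EXPIRATION);
    }

    // Check time implied by "extended by", if that figure means new expiration minus now.
    static Instant impliedNow() {
        return NEW_EXPIRATION.minus(EXTENDED_BY);
    }

    public static void main(String[] args) {
        System.out.println(leadTime());   // PT5S
        System.out.println(increment());  // PT10S
        System.out.println(impliedNow()); // 2026-04-08T14:08:57.549215Z
    }
}
```

The implied check time, 14:08:57.549215Z, lines up with the 16:08:57.549 local timestamp on the corresponding INFO line if the test machine was running at UTC+2.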

@munishchouhan
Member

@pditommaso should we merge this pr?

@pditommaso
Collaborator Author

let me review

@munishchouhan
Member

@pditommaso please review
