Add the local runtime backend #134

Open
AlexJones0 wants to merge 2 commits into lowRISC:master from AlexJones0:local_runtime_backend

@AlexJones0 AlexJones0 commented Apr 1, 2026

This PR is the twelfth in a series of PRs to rewrite DVSim's core scheduling functionality (Scheduler, status display, launchers / runtime backends) to use an async design, with the key goals of long-term maintainability and extensibility.

This PR contains the local implementation of the RuntimeBackend interface, intended to replace the LocalLauncher. This launcher executes jobs as subprocesses on the user's local machine. The base RuntimeBackend is also expanded with more backend-agnostic code that will be shared between most backends that are implemented and used.

See the commit messages for more information.

This commit fleshes out the abstract `RuntimeBackend` base class with a
lot of core functionality that will be needed to implement new runtime
backends (which aren't just the legacy launcher adapter). This mostly
takes the form of protected methods that backends can optionally call,
built from logic taken from the `Launcher` base class or previously
duplicated across its various subclasses.

Some key changes to note from the launchers:
- A new `DVSIM_RUN_INTERACTIVE` env var is introduced, intended to
  replace the `RUN_INTERACTIVE` env var long term and avoid potential
  name collisions.
- Errors are raised if an interactive job tries to run on a backend that
  doesn't support running jobs interactively.
- Log parsing functionality is extracted to a separate object; logs
  are always lazily loaded, so no time is wasted on jobs that don't
  need them (passing jobs without any fail or pass patterns).
- The pass/fail regex parsing of log contents is more efficient: fail
  patterns are combined into a single regex check, and all regexes
  are compiled once instead of per line.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>

This is the async `RuntimeBackend` replacement of the `LocalLauncher`,
which will eventually be removed in favour of this new backend.

Some behavioural differences to note:
- We now try to await the process's exit after a SIGKILL to be sure it
  has ended, bounded by a short timeout in case it is blocked at the
  kernel level.
- We now use psutil to enumerate and kill descendant processes in
  addition to the created subprocess. This won't catch orphaned
  processes (needs e.g. cgroups), but should cover most sane usage.
- The backend does _not_ link the output directories based on status
  (the `JobSpec.links`, e.g. "passing/", "failed/", "killed/"). The
  intention is that this detail is not core functionality for either
  the scheduler or the backends - instead, it will be implemented as
  an observer on the new async scheduler callbacks when introduced.

By using async subprocesses and launching/killing jobs in batches, we
can launch jobs in parallel more efficiently via async coroutines.
We likewise avoid the need to poll jobs: instead, an async task
awaits the subprocess's completion, which we then forward to notify the
(to be added) scheduler of the job's completion.
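
A minimal sketch of this pattern, assuming a plain callback in place of the scheduler (which is not yet part of this PR). The names `launch_batch`, `_watch`, `on_complete` and `run_demo` are hypothetical.

```python
import asyncio
import sys


async def _watch(job_id, proc, on_complete):
    # One task per job awaits process exit; no polling loop is needed.
    returncode = await proc.wait()
    on_complete(job_id, returncode)


async def launch_batch(jobs, on_complete):
    """Start every job in `jobs` ({job_id: argv}); return the watcher tasks."""
    ids = list(jobs)
    # Start all subprocesses concurrently rather than one after another.
    procs = await asyncio.gather(
        *(asyncio.create_subprocess_exec(*jobs[i]) for i in ids)
    )
    return [
        asyncio.create_task(_watch(i, p, on_complete))
        for i, p in zip(ids, procs)
    ]


async def run_demo():
    finished = {}

    def on_complete(job_id, returncode):
        # Stand-in for notifying the scheduler of a job's completion.
        finished[job_id] = returncode

    jobs = {
        "ok": [sys.executable, "-c", "pass"],
        "fail": [sys.executable, "-c", "import sys; sys.exit(3)"],
    }
    tasks = await launch_batch(jobs, on_complete)
    await asyncio.gather(*tasks)
    return finished
```

Calling `asyncio.run(run_demo())` runs both jobs and returns a mapping from job id to exit code, filled in by the callback as each subprocess finishes.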

Note that interactive jobs are still basically handled synchronously, as
before; it is assumed that only one interactive job runs at a time.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
@AlexJones0 AlexJones0 force-pushed the local_runtime_backend branch from a155626 to d14abc6 April 1, 2026 16:08
@AlexJones0 AlexJones0 marked this pull request as ready for review April 1, 2026 16:08

@machshev machshev left a comment

Nice work! @AlexJones0
Huge improvement...