Convert scheduler tests to use the new scheduler by AlexJones0 · Pull Request #136 · lowRISC/dvsim

AlexJones0 · 2026-04-01T15:25:19Z

Note: this PR is currently a draft because it depends on #135, which has not yet been merged. The first 3 commits come from that PR and can be safely ignored; only the last 4 commits are relevant. It is otherwise ready for review.

This PR is the fourteenth of a series of PRs to rewrite DVSim's core scheduling functionality (Scheduler, status display, launchers / runtime backends) to use an async design, with key goals of long term maintainability and extensibility.

This PR converts the scheduler tests to use the new async scheduler so that we can test the new design. As CI shows, the new scheduler passes all of the existing scheduler tests, plus a couple of new tests added in this PR. The intention is not to merge this PR yet: it should probably wait until the rest of the scheduler integration is completed and merged, and should only be merged right before switching from the old scheduler to the new one. Until then, this PR is intended to demonstrate the correctness and functionality of the new scheduler via the existing tests.

See the commit messages for more information.

See the explanatory comments added to JobStatus. The intention is that the new async scheduler will distinguish between jobs that are blocked due to unfinished dependencies (`SCHEDULED`), and those that are pending because there is no availability to run them, despite their dependencies being fulfilled (`QUEUED`). This new state is currently unused. Also add a short test to prevent potential future bugs from status shorthand name collisions. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
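As a rough illustration of the `SCHEDULED` / `QUEUED` split and the shorthand collision check described above, here is a minimal sketch. The member list, the single-letter shorthands, and the `find_shorthand_collisions` helper are all assumptions made for this example; they are not taken from DVSim's actual `JobStatus`.

```python
from __future__ import annotations

from enum import Enum


class JobStatus(Enum):
    """Hypothetical job states; the shorthand strings are assumptions."""

    SCHEDULED = "S"  # blocked: at least one dependency has not finished
    QUEUED = "Q"  # dependencies fulfilled, waiting for a free slot
    RUNNING = "R"
    PASSED = "P"
    FAILED = "F"
    KILLED = "K"

    @property
    def shorthand(self) -> str:
        return self.value


def find_shorthand_collisions(statuses: type[Enum]) -> dict[str, list[str]]:
    """Map each shorthand used more than once to the names that share it.

    A plain Enum silently turns a duplicated value into an alias, so we walk
    __members__ (which includes aliases) rather than iterating the class.
    """
    seen: dict[str, list[str]] = {}
    for name, member in statuses.__members__.items():
        seen.setdefault(member.value, []).append(name)
    return {short: names for short, names in seen.items() if len(names) > 1}
```

A test along these lines would simply assert that `find_shorthand_collisions(JobStatus)` is empty, so that adding a new state with an already-used shorthand fails loudly instead of aliasing an existing member.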

This field will be used to inform the new scheduler of which backend it should use to execute a job. Though the plumbing is not there in the rest of DVSim, the intent is to make the scheduler such that it could feasibly be run with multiple backends (e.g. some jobs faked, some jobs on the local machine, some dispatched to various remote clusters). To support this design, each job spec can now specify that it should be run on a certain backend, with some designated string name. To instead just use the configured default backend (which is the current behaviour, as the current scheduler only supports one backend / `launcher_cls`), this can be set to `None`. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
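The fallback to the default backend could look roughly like the sketch below. `JobSpec` here is a trimmed-down stand-in with only two fields, and `resolve_backend` and the backend registry are hypothetical names chosen for illustration; the real job spec and scheduler plumbing may differ.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class JobSpec:
    """Hypothetical, trimmed-down job spec with only the fields needed here."""

    name: str
    backend: str | None = None  # None means "use the configured default"


def resolve_backend(spec: JobSpec, backends: dict[str, object], default: str) -> object:
    """Pick the backend a job should run on, falling back to the default."""
    key = default if spec.backend is None else spec.backend
    try:
        return backends[key]
    except KeyError:
        raise ValueError(
            f"job {spec.name!r} requests unknown backend {key!r}"
        ) from None
```

Under these assumptions, a spec that leaves `backend` unset behaves exactly as with a single configured backend, while a spec that names one (for example a faked backend) is routed there.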

For now, this is kept separate in `async_core.py`; the intention is that it will eventually replace the scheduler in `core.py` once all the components it needs are integrated. This commit contains the fully async scheduler design. Some notes:

- Everything is now async. The scheduler is no longer tied to a Timer object, nor does it have to manage its print interval and poll frequency. It takes advantage of parallelism via cooperative multitasking as much as possible.
- The scheduler is designed to support multiple different backends (new async versions of launchers). Jobs are dispatched according to their specifications and scheduler parameters.
- The scheduler implements the Observer pattern for various events (start, end, job status change, kill signal), allowing consumers that want this functionality (e.g. instrumentation, status printer) to hook into the scheduler instead of unnecessarily coupling code.
- The previous scheduler only recognized killed jobs when they were reached in the queue and their status was updated. The new design propagates status updates transitively to all affected jobs as soon as the information is known.
- Since the scheduler knows _why_ it is killing the jobs, we attach JobStatusInfo to give more detail in the failure buckets.
- The job DAG is indexed and validated during initialization; dependency cycles are detected and cause an error to be raised.
- Job info is encapsulated by records, keeping state centralized (outside of indexes).
- The scheduler now accepts a prioritization function. It keeps schedulable jobs in a heap and runs the highest-priority job first. The default prioritization is by weight, but this can be customized.
- The scheduler now has its own modifiable parallelism limit, separate from each individual backend's parallelism limit.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
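Three of the notes above (DAG validation, heap-based prioritization, and a scheduler-level parallelism limit) can be sketched together in a few lines of asyncio. This is an illustrative toy, not the code in `async_core.py`: `validate_dag`, `run_all`, and their parameters are hypothetical, per-backend limits, observers, and kill handling are left out, and it assumes every job appears as a key in `deps`.

```python
import asyncio
import heapq
from graphlib import CycleError, TopologicalSorter


def validate_dag(deps):
    """Raise ValueError if the job -> dependencies mapping contains a cycle."""
    try:
        TopologicalSorter(deps).prepare()
    except CycleError as err:
        raise ValueError(f"dependency cycle: {err.args[1]}") from err


async def run_all(deps, weights, max_parallel, run_job):
    """Run each job once its dependencies finish, highest weight first.

    Returns the order in which jobs were started.
    """
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    validate_dag(deps)
    remaining = {job: set(d) for job, d in deps.items()}
    ready = []  # min-heap of (-weight, name), so the largest weight pops first
    running = {}  # asyncio.Task -> job name
    order = []

    def push_ready():
        # Move every job whose dependencies are all done onto the heap.
        for job in [j for j, d in remaining.items() if not d]:
            del remaining[job]
            heapq.heappush(ready, (-weights.get(job, 0), job))

    push_ready()
    while ready or running:
        # Start as many ready jobs as the scheduler-level limit allows.
        while ready and len(running) < max_parallel:
            _, job = heapq.heappop(ready)
            order.append(job)
            running[asyncio.create_task(run_job(job))] = job
        done, _ = await asyncio.wait(
            set(running), return_when=asyncio.FIRST_COMPLETED
        )
        for task in done:
            finished = running.pop(task)
            task.result()  # re-raise any exception from the job
            for d in remaining.values():
                d.discard(finished)
        push_ready()
    return order
```

With `max_parallel=1`, jobs that become ready at the same time start strictly in descending weight order, which makes the effect of the priority function easy to observe in a test. Swapping `weights.get` for an arbitrary callable would give the customizable prioritization the commit message describes.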

The new scheduler uses an async model, so it's helpful for testing to pull in the asyncio pytest plugin. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>

machshev

LGTM! Thanks @AlexJones0

To be sure Nix users can pull in the new Python dependency. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>

This commit makes the changes necessary to port the scheduler tests to the new async scheduler. This involves:

- Creating a mock RuntimeBackend. For now, to keep changes minimal and simple, we just use the LauncherAdapter with the MockLauncher. In the future it would be nice to write a dedicated mock RuntimeBackend as well.
- Marking all the tests as asyncio tests with `async def` and using the new scheduler interface.
- Updating a couple of tests that were awkwardly constructed (e.g. in terms of targets/ordering) due to constraints of the old scheduler.

With these changes, _all_ scheduler tests now pass with the new async scheduler across multiple iterations.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
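For readers unfamiliar with pytest-asyncio, a converted test has roughly the shape below. `FakeBackend` and the test itself are hypothetical and only show the `async def` plus `@pytest.mark.asyncio` pattern; they do not use the real LauncherAdapter, MockLauncher, or scheduler interface.

```python
import asyncio

import pytest


class FakeBackend:
    """Hypothetical stand-in for a mock runtime backend; not DVSim's API."""

    def __init__(self) -> None:
        self.started: list[str] = []

    async def run(self, job: str) -> str:
        self.started.append(job)
        await asyncio.sleep(0)  # yield to the event loop once
        return "passed"


@pytest.mark.asyncio
async def test_runs_every_job() -> None:
    backend = FakeBackend()
    # Launch both jobs concurrently on the loop provided by pytest-asyncio.
    results = await asyncio.gather(*(backend.run(j) for j in ("build", "sim")))
    assert results == ["passed", "passed"]
    assert backend.started == ["build", "sim"]
```

The marker tells the plugin to run the coroutine on an event loop; without the plugin installed, pytest would not await the test body. If the project configures the plugin's auto mode, the explicit marker may be unnecessary.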

Extend an existing test case for launcher / runtime backend parallelism to be able to also consider global scheduler-level parallelism. Introduce a new test to check that we can provide a custom prioritization function, and that jobs are indeed scheduled according to the priorities assigned by it if not blocked. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>

AlexJones0 mentioned this pull request Apr 1, 2026

Add new async scheduler #135

Open

AlexJones0 added 4 commits April 1, 2026 17:18

test: add pytest-asyncio dependency

4e28235


AlexJones0 force-pushed the async_scheduler_tests branch from a5e10db to 93d1b87 on April 1, 2026 16:18

machshev approved these changes Apr 1, 2026


AlexJones0 force-pushed the async_scheduler_tests branch from 93d1b87 to 1a9686d on April 1, 2026 18:08

AlexJones0 added 3 commits April 1, 2026 19:09

chore: update the Nix flake

3774e10


AlexJones0 force-pushed the async_scheduler_tests branch from 1a9686d to ccf3abe on April 1, 2026 18:09

AlexJones0 wants to merge 7 commits into lowRISC:master from AlexJones0:async_scheduler_tests
