Port the status printers to use the new async scheduler #137

Draft
AlexJones0 wants to merge 5 commits into lowRISC:master from AlexJones0:async_status_printers

@AlexJones0
Contributor

Note: this PR is currently a draft because it depends on #135, which has not yet been merged. The first commit is from that PR and can be safely ignored; only the last 4 commits are relevant. It is otherwise ready to review.

This PR is the fifteenth in a series of PRs to rewrite DVSim's core scheduling functionality (Scheduler, status display, launchers / runtime backends) to use an async design, with key goals of long-term maintainability and extensibility.

This PR ports the existing StatusPrinter objects to the new async scheduler interface, making some small improvements along the way. These include making print intervals work properly (i.e. independent of the scheduler, in a separate loop), adding the ability to print every status change (configured by setting the print interval to 0), fixing a bug in the EnlightenStatusPrinter, and making some implicit behaviour more explicit.

See the commit messages for more information.

See the explanatory comments added to JobStatus. The intention is that
the new async scheduler will distinguish between jobs that are blocked
due to unfinished dependencies (`SCHEDULED`), and those that are pending
because there is no availability to run them, despite their dependencies
being fulfilled (`QUEUED`). This new state is currently unused.

Also add a short test to prevent potential future bugs from status
shorthand name collisions.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
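
The `SCHEDULED` / `QUEUED` split and the collision test described above can be sketched roughly as follows. This is a minimal illustration, not DVSim's actual code: the member list, the `shorthand` property (assumed here to be the first letter of the name), and the `shorthand_collisions` helper are all hypothetical.

```python
from enum import Enum


class JobStatus(Enum):
    """Stand-in for DVSim's JobStatus; members and values are assumed."""

    SCHEDULED = "scheduled"  # blocked on unfinished dependencies
    QUEUED = "queued"        # dependencies met, waiting for capacity
    RUNNING = "running"
    PASSED = "passed"
    FAILED = "failed"
    KILLED = "killed"

    @property
    def shorthand(self) -> str:
        """Single-letter tag for compact tables (assumed: first letter)."""
        return self.name[0]


def shorthand_collisions(statuses) -> dict:
    """Map each shorthand used by more than one status to those statuses."""
    seen = {}
    for status in statuses:
        seen.setdefault(status.shorthand, []).append(status.name)
    return {key: names for key, names in seen.items() if len(names) > 1}
```

A test in this style would simply assert that `shorthand_collisions(JobStatus)` is empty, so that adding a status whose shorthand clashes with an existing one fails early.
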
Add the new async `StatusPrinter` abstract base class, intended to
replace the original for use with the new async scheduler. The original
will not be removed until the scheduler has been switched.

This is now an abstract base class, rather than an empty class used for
interactive sessions - since the status printer will live outside the
new scheduler, it becomes much easier to just _not connect_ any status
printer hooks during interactive mode.

Some notable changes and overhauls:
- Status printing now runs entirely independently of the scheduler. If
  a print interval > 0 is configured, then the status printer now runs
  as a loop with async awaits such that the timing logic is entirely
  separate from the scheduler, maintained by cooperative multitasking.
- As new functionality, if a print interval of 0 is configured, the
  printer instead runs in a synchronous "event/update-driven mode" where
  every single status update is printed. This might be useful for e.g.
  the TTY printer where you may want to capture exact times of all
  updates.
- As a result of observing the scheduler, the status printer maintains
  its own stateful tracking of job information.
- Field alignment is calculated from the initial job information and
  data is appropriately justified to clean up the output tables.
- The ability to pause the status bar is introduced to help (later)
  deal with issues in the EnlightenStatusBar, where its terminal
  interactivity can be broken and cause hangs under heavy load.
- General refactoring: the status header and fields are no longer
  hardcoded and are instead derived from the JobStatus enum.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
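
To illustrate the two modes described in this commit, here is a minimal sketch using `asyncio`. It is not the PR's implementation: `StatusPrinterSketch`, `ListPrinter`, `on_status_change`, `run`, and `render` are hypothetical names, and statuses are plain strings for brevity. It shows a printer that keeps its own job state, prints on a timer when the interval is positive, and prints on every update when the interval is 0.

```python
import asyncio
from abc import ABC, abstractmethod


class StatusPrinterSketch(ABC):
    """Hypothetical base class; names are illustrative, not DVSim's API."""

    def __init__(self, interval: float) -> None:
        self.interval = interval
        self.job_status = {}  # job name -> latest status name
        self.counts = {}      # status name -> number of jobs in that status

    def on_status_change(self, job: str, status: str) -> None:
        """Hook the scheduler would call; updates the printer's own state."""
        old = self.job_status.get(job)
        if old is not None:
            self.counts[old] -= 1
        self.job_status[job] = status
        self.counts[status] = self.counts.get(status, 0) + 1
        if self.interval == 0:
            # Event-driven mode: print every single update synchronously.
            self.render()

    async def run(self, stop: asyncio.Event) -> None:
        """Periodic mode: print on a timer until `stop` is set."""
        if self.interval <= 0:
            return  # event-driven mode needs no loop
        while not stop.is_set():
            self.render()
            try:
                # Sleep for one interval, waking early if asked to stop.
                await asyncio.wait_for(stop.wait(), timeout=self.interval)
            except asyncio.TimeoutError:
                pass
        self.render()  # final print so the last state is not lost

    @abstractmethod
    def render(self) -> None:
        """Emit the current status; concrete printers decide how."""


class ListPrinter(StatusPrinterSketch):
    """Toy printer that records snapshots instead of writing to a terminal."""

    def __init__(self, interval: float) -> None:
        super().__init__(interval)
        self.lines = []

    def render(self) -> None:
        self.lines.append(dict(self.counts))


async def periodic_demo() -> ListPrinter:
    """Run the periodic loop briefly alongside a simulated status update."""
    printer = ListPrinter(interval=0.01)
    stop = asyncio.Event()
    task = asyncio.create_task(printer.run(stop))
    printer.on_status_change("job_a", "RUNNING")
    await asyncio.sleep(0.05)
    stop.set()
    await task
    return printer
```

Because the loop only awaits, it never blocks the scheduler; the two share the event loop cooperatively, and the scheduler only needs to call the hook.
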
Extract time-related utilities (the `hms` functionality of the `Timer`
and the two timestamp formats from `fs.py`) into a new `time.py` utility
module. The intention is to use the `hms` functionality inside the
new async status printers and to eventually remove `timer.py` completely
when the old scheduler is removed.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
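
For readers unfamiliar with the `hms` helper, a standalone version might look like the sketch below. The signature and the choice to truncate fractional seconds are assumptions made for illustration; the extracted function in `time.py` may round or format differently.

```python
def hms(seconds: float) -> str:
    """Format a duration in seconds as HH:MM:SS.

    Fractional seconds are truncated (an assumption of this sketch), and
    hours are not wrapped, so long runs can show more than two digits.
    """
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"
```

Keeping this as a free function, rather than a method on a `Timer` object, lets the status printers format elapsed times without depending on the old scheduler's timer.
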
This commit ports the `TtyStatusPrinter` to use the new async interface,
extending the `StatusPrinter` interface introduced previously. The
extended logic remains mostly the same as the original code, with some
small refactors and tweaks for aesthetics.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
This commit ports the original `EnlightenStatusPrinter` to the new async
interface, extending the `StatusPrinter` abstract base class.

The logic is mostly the same, with a few important caveats to note:
- Since the new interface allows float (and hence sub-second) print
  intervals, a warning is introduced for intervals below Enlighten's
  internal minimum delta value. Such intervals cause updates to be
  coalesced, and potentially lost at points, if the bar is not
  refreshed. We could also lower the `min_delta` to match the print
  interval, but experimentation shows that this introduces performance
  concerns, so it is best left as is.
- Because of the above, logic is added to refresh (flush) the status bar
  when a target is done, to ensure it locks the final time correctly.
- An occasional bug was encountered on using Ctrl-C to gracefully exit
  where Enlighten's `StatusBar.update` would hang indefinitely. This
  occurred during the terminal protocol used by the underlying Blessed
  library, which queried the terminal for its size and expected a
  response. Under heavy loads, particularly when a large number of
  processes are killed due to an exit signal, the terminal response
  might not be received, causing Blessed to hang on a `getch` call.
  To prevent this, the `pause` interface was introduced to the base
  `StatusPrinter` which is used for that purpose here.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
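
The interval warning from the first caveat could be sketched as below. This does not import Enlighten and is not the PR's code: `check_print_interval` and `ASSUMED_MIN_DELTA` are hypothetical names, and the 0.1 s value is assumed from Enlighten's documented default for `min_delta`.

```python
import logging

log = logging.getLogger(__name__)

# Assumed value: Enlighten's documented default minimum time between
# refreshes, in seconds. A real printer would read this from its bar.
ASSUMED_MIN_DELTA = 0.1


def check_print_interval(interval: float, min_delta: float = ASSUMED_MIN_DELTA) -> bool:
    """Warn if updates at this interval may be coalesced.

    Returns True when a warning was logged, False otherwise.
    """
    if interval < min_delta:
        log.warning(
            "Print interval %.3fs is below the status bar's minimum refresh "
            "delta of %.3fs; some updates may be coalesced.",
            interval,
            min_delta,
        )
        return True
    return False
```

Under this sketch an interval of 0 (event-driven mode) also triggers the warning, since updates can then arrive arbitrarily close together.
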