Skip to content

Reduce sh postRun overhead by combining 3 shSync calls into 1#265

Merged
willr3 merged 1 commit intoHyperfoil:masterfrom
stalep:perf_improvements
Mar 23, 2026
Merged

Reduce sh postRun overhead by combining 3 shSync calls into 1#265
willr3 merged 1 commit intoHyperfoil:masterfrom
stalep:perf_improvements

Conversation

@stalep
Copy link
Copy Markdown
Member

@stalep stalep commented Mar 17, 2026

The postRun() method previously made 3 sequential shSync calls after every sh command (exit code capture, pwd, exit code restore), each blocked by the 100ms SuffixStream prompt detection delay, adding ~300ms of overhead per command.

Combine them into a single shSync call that captures exit code and pwd in one output delimited by ":::", reducing overhead to ~100ms (~3x faster). Fallback retry logic for noisy output is preserved.

Also adds benchmark tests to measure and track shell overhead.

Copy link
Copy Markdown
Collaborator

@willr3 willr3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think combining multiple shSync would be helpful, where possible, but this PR introduces several functional edge cases that I suspect would break current user functionality and introduce uncertainty into the execution state for user scripts.

context.getShell().isPromptShell(getPreviousPrompt()) &&
context.getShell().getHost().isShell())
context.getShell().getHost().isShell() &&
shouldCheckExit(context))
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pwd is needed even if exit code is not being checked.

* With the default 100ms SuffixStream delay, this adds ~300ms+ per sh command.
* These tests quantify that overhead in realistic scenarios.
*/
public class ShCommandOverheadTest extends SshTestBase {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like another class that was useful when considering the change but I do not understand how it would be helpful to maintain this as part of the qDup test suite.

while (!parsed && retry < 5 && combined != null && !combined.isBlank() && context.getShell().isReady() && !context.isAborted()) {
combined = context.getShell().shSync("echo \"${__qdup_ec}:::$(pwd)\"");
if (combined != null && combined.contains(":::")) {
String[] parts = combined.trim().split(":::", 2);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using split makes the assumption that the deliminator is not part of the pwd output. That is likely true but using indexOf would avoid that potential issue.


// Warmup: discard first measurement to avoid SSH channel setup overhead
calibrating = true;
shSync("echo qdup_ping");
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes echo is a valid command on the remote host. That is likely valid for the majority of systems but have automation where that is not correct. We have dealt with this in qDup in two ways:

  1. have a configuration option on the host definition to override the default command.
  2. Only run this command if host.isShell()=true and otherwise use a default value.

stalep added a commit to stalep/qDup that referenced this pull request Mar 18, 2026
- Always track pwd in postRun even when exit code checking is disabled,
  so queue-download relative paths work correctly in all contexts
- Revert lazy pwd fetch in QueueDownload to preserve watch/observer
  context compatibility (shSync cannot be called from observers)
- Use indexOf instead of split for parsing "exitCode:::pwd" delimiter
  to handle the unlikely case of ::: appearing in directory names
- Handle !parsed case explicitly by logging error and aborting to avoid
  running in an unknown execution state
- Remove benchmark test classes (ShCommandOverheadTest,
  ShellOverheadBenchmarkTest) as they are profiling tools not suitable
  for the main test suite
@stalep
Copy link
Copy Markdown
Member Author

stalep commented Mar 18, 2026

@willr3 I added another commit which addresses your comments I hope :)

@stalep stalep force-pushed the perf_improvements branch from 23452bd to c496ee4 Compare March 19, 2026 15:38
@stalep
Copy link
Copy Markdown
Member Author

stalep commented Mar 20, 2026

@willr3 what do you think about this?

Previously, after each sh command, postRun() made 3 sequential shSync
calls: exit code capture, pwd, and exit code restore. Each shSync blocks
until prompt detection fires, so this added 3 round-trips of overhead
per command.

Combine all three into a single command:
  export __qdup_ec=$?; echo "${__qdup_ec}:::$(pwd)"; (exit $__qdup_ec)

This captures exit code, pwd, and restores exit code in one round-trip.
The response is parsed using indexOf to handle the unlikely case of :::
appearing in directory names. If parsing fails (e.g. noisy output from
concurrent processes), retries re-echo the saved variable. If all
retries fail, the run is aborted with a clear error message rather than
continuing in an unknown state.
@stalep stalep force-pushed the perf_improvements branch from c496ee4 to ba97ba5 Compare March 20, 2026 08:22
@willr3 willr3 merged commit 0c5f680 into Hyperfoil:master Mar 23, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants