Fix aspire run graceful shtudown behavior and unify with aspire start/aspire stop.#17814
Conversation
On Windows, the TypeScript guest apphost runs under tsx, which swallows Ctrl+Break. Our graceful-shutdown paths never succeed and the process is eventually force-terminated without a chance to drain DCP-managed resources. This change introduces a unified termination ladder (cooperative cancel -> graceful expire -> final drain -> force terminate) shared by every aspire run child process, and routes guest-apphost shutdown through DCP stop-process-tree on Windows instead of relying on Ctrl+Break propagation. Key components: - ConsoleCancellationManager: process-level signal handler with a three-stage escalation (Ctrl+C, second Ctrl+C, third Ctrl+C / drain timeout). RequestShutdown lets internal teardown drive the same ladder without consuming a user signal. - GracefulShutdownService: shared graceful-window token, expired by the cancellation manager when the user's grace period elapses or escalates. - IsolatedConsoleSpawner / IsolatedProcess: console-isolated child spawn used for aspire run, so Ctrl+C in the CLI does not propagate uncontrolled into the apphost system. - WindowsConsoleProcessJob: Windows-only job-object safety net (KILL_ON_JOB_CLOSE + BREAKAWAY_OK so DCP can fork its own descendants). - ProcessTerminator + ProcessLifetimeAdapter + ProcessPump: shared shutdown primitives consumed by AppHostServerSession and ProcessGuestLauncher. - DetachedAppHostShutdownService (renamed from ProcessShutdownService) now implements IProcessTreeGracefulShutdownSignaler, the abstraction that AppHostServerSession + ProcessGuestLauncher call to ask DCP to drive graceful resource teardown. aspire stop is left untouched - the new machinery is scoped to the run path. Short-lived RPC sessions (publish, sdk generate, sdk dump, scaffolding) keep the inherited-console default so Ctrl+C continues to exit immediately. Also fixed three issues caught in pre-PR review: - ConsoleCancellationManager separated _ladderStarted (CAS gate) from _userSignalCount so internal RequestShutdown no longer makes the user's first Ctrl+C collapse the graceful window. - GuestAppHostProject backchannel-fault continuation now uses CompareExchange on runEnded to race safely with the finally block. - ProcessTerminator.ShutdownAsync(int pid, ...) is now async so the underlying Process handle outlives the inner WaitForExitAsync. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
🚀 Dogfood this PR with:
curl -fsSL https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 17814Or
iex "& { $(irm https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 17814" |
The original PR fixed graceful shutdown for the TypeScript guest path. Mac smoke testing surfaced that pure .NET AppHosts (the direct-launch path through DotNetAppHostProject.RunAsync -> ProcessExecutionFactory -> ProcessExecution) bypassed the new ladder entirely because their ProcessInvocationOptions don't opt into it. On Unix this collapsed graceful shutdown to microseconds (the OCE catch in ProcessExecution.WaitForExitAsync passed the already-cancelled token as the graceful budget to ProcessTerminator); on Windows graceful was skipped outright. Changes: - Extract the four-phase ladder into a shared ProcessGracefulShutdownLadder so AppHostServerSession, ProcessGuestLauncher, and the new IsolatedProcessExecution share the same escalation shape. - Extend ProcessInvocationOptions with four opt-in fields (IsolateConsole, ConsoleProcessJob, GracefulShutdownSignaler, ShutdownService). All default null/false so the 11 non-Run callers (build, restore, package add, layout, etc.) preserve their existing ProcessTerminator force-terminate behavior. - Add IsolatedProcessExecution to host an IsolatedProcess inside the existing IProcessExecution shape so the direct-launch path can opt into console isolation on Windows. - Branch ProcessExecutionFactory on IsolateConsole. The Windows-only console isolation requires WindowsConsoleProcessJob to be wired through; fail fast if it isn't (defense-in-depth against the silent-fallthrough footgun the isolated path is meant to prevent). - Make DotNetAppHostProject's ctor require GracefulShutdownService and IProcessTreeGracefulShutdownSignaler. ConsoleProcessJob stays optional (Windows-only DI registration). Only the run-path runOptions populate the four new fields; build/restore/publish callsites leave them unset. - Forward the four new fields through DotNetCliRunner.CreateInstrumentedProcessOptions. Without this, the wiring at DotNetAppHostProject.RunAsync line 318 silently reverts when piped through CLI instrumentation -- the smoke run on Mac initially appeared green only because DCP did the work the CLI ladder thought it was doing. Verified on macOS via a throwaway smoke harness: - Single SIGINT against a graceful-respecting AppHost: clean exit in <1s. - Single SIGINT against an AppHost stalling 15s in ApplicationStopping: process-tree escalation fires at exactly T+5s (matches the configured graceful budget), all descendants gone, exit 0. - Double SIGINT (1s apart) against the same stalled AppHost: graceful window collapses to T+1s, AppHost forcibly terminated, exit 0. Documented tradeoff: an abrupt termination leaves DCP unable to SIGKILL its TERM-ignoring children, so the slow-script orphan in this scenario matches existing semantics. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Re-running the failed jobs in the CI workflow for this pull request because 1 job was identified as retry-safe transient failures in the CI run attempt.
|
The isolated Windows spawn path uses CreateProcessW + Process.GetProcessById
to obtain a managed Process for the child. Process objects derived from
GetProcessById cannot reliably surface ExitCode on Windows — the BCL state
machine depends on the Process instance itself having started the child,
and throws InvalidOperationException("Process was not started by this
object") otherwise (see dotnet/runtime#45003).
This silently broke 'aspire run' on Windows on every code path that reads
the AppHost server / guest exit code: AppHostServerSession.Exited handler,
AppHostServerSession's post-Run HasExited probe, the backchannel
SocketException filter in GuestAppHostProject, IsolatedProcessExecution,
and ProcessGuestLauncher's telemetry / return value.
Fix it by holding the original CreateProcess HANDLE in a SafeProcessHandle
and exposing it via overrides on IsolatedProcess.ExitCode / HasExited that
call GetExitCodeProcess + WaitForSingleObject(0) directly. The
WaitForSingleObject step disambiguates the 'process exited with code 259'
case from the 'still running' STILL_ACTIVE sentinel.
Plumb the overrides through:
- AppHostServerRunResult gains ExitCodeOverride / HasExitedOverride
callbacks plus ReadExitCode() / ReadHasExited() accessors. Non-isolated
callers leave the overrides null and the accessors fall back to
Process.ExitCode / Process.HasExited, which work correctly for processes
the BCL itself started.
- DotNetBasedAppHostServerProject and PrebuiltAppHostServer's isolated
branches wire the overrides to the IsolatedProcess wrapper.
- AppHostServerSession captures the readers into private fields and routes
every status check through them. New TryGetServerExitCode() / existing
HasServerExited accessors are the recommended public API.
- ProcessGuestLauncher captures readExitCode / readHasExited locals on both
the isolated and inherited branches and uses them for telemetry +
return-value reads.
- GuestAppHostProject's StartBackchannelConnectionAsync now takes an
AppHostServerSession and uses HasServerExited / TryGetServerExitCode in
the SocketException filter instead of the raw Process.
The failing IsolatedProcessTests.Start_EchoesLine test that originally
exposed this on Windows CI now passes via the IsolatedProcess override
path; production paths that previously consumed Process.ExitCode directly
on the isolated spawn now go through the wrapper-backed readers.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Three stages of Ctrl+C seems like overkill. You're going to get some merge conflicts from #17652 |
DetachedProcessLauncher.StartWindows points both Stdout and Stderr at the same NUL handle (child writes go nowhere). When the unified WindowsProcessInterop.SpawnConsoleIsolatedProcess started building the PROC_THREAD_ATTRIBUTE_HANDLE_LIST from the (Stdin, Stdout, Stderr) slots, that NUL handle appeared twice. PROC_THREAD_ATTRIBUTE_HANDLE_LIST does not allow duplicate handle entries — CreateProcessW returns ERROR_INVALID_PARAMETER (87), which surfaced as 'aspire start' failing immediately on Windows with exit code 2 in the dogfood starter validation job. See https://devblogs.microsoft.com/oldnewthing/20111216-00/?p=8873 for the documented Win32 constraint. The old standalone DetachedProcessLauncher.StartWindows whitelisted only the single NUL handle (one entry), so this regressed when the launcher was switched over to the unified helper. Dedupe by value before populating the attribute list, and add a Windows- only regression test that spawns a detached process via the production DetachedProcessLauncher path. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Ah, for some reason I thought that had already been merged; I was playing around with the idea of a second cancellation token to communicate the graceful shutdown timeout, but it's not strictly necessary since we still have disposable process closure, so I'll likely back that part out. Goal is to have it work with your two stage model, but actually make our Windows child processes able to respond to Ctrl+C as well. |
…e-driven cleanup Internal failure paths in GuestAppHostProject.RunAsync (backchannel fault, dependency install failure, guest non-zero exit, normal completion) no longer route through ConsoleCancellationManager. They return their exit code directly and rely on the existing 'await using' disposables on the AppHost session and guest launcher to force-kill children as the run scope unwinds. DCP runs in a separate process tree and handles its own resource cleanup. The dedicated mutator API (CCM.RequestShutdown / CCM.RequestedExitCode) and the ladder-started/user-signal-count split it required are removed. Cancel is now a single 3-stage signal counter (cancel + watcher, expire graceful, force exit) driven exclusively by user signals from PosixSignalRegistration / Console.CancelKeyPress. The backchannel-fault continuation cancels a local CTS linked to the outer cancellation token, surfacing FailedToDotnetRunAppHost via a local 'internalFaultCode' captured by the outer OCE catch. RunCommand and Program.Main OCE catches now unconditionally map cancellation to CliExitCodes.Cancelled / Success — the only thing that reaches them is user-initiated cancellation, since internal failures return their exit code directly without throwing OCE. Net: -193 lines / +88 lines, with one new regression test removed (the four tests covering RequestShutdown semantics are gone; the FirstSignal test drops its RequestedExitCode assertions). Behavior preserved for user Ctrl+C: still gets the full 5s graceful + 5s drain ladder. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…shutdown # Conflicts: # src/Aspire.Cli/Commands/RunCommand.cs # src/Aspire.Cli/ConsoleCancellationManager.cs # src/Aspire.Cli/Projects/IAppHostServerSession.cs
…shutdown # Conflicts: # src/Aspire.Cli/Projects/GuestAppHostProject.cs # src/Aspire.Cli/Projects/GuestRuntime.cs # tests/Aspire.Cli.Tests/CliBootstrapTests.cs # tests/Aspire.Cli.Tests/Telemetry/TelemetryConfigurationTests.cs
…shutdown # Conflicts: # src/Aspire.Cli/Projects/AppHostServerSession.cs # src/Aspire.Cli/Projects/GuestAppHostProject.cs # tests/Aspire.Cli.Tests/Projects/AppHostServerSessionTests.cs
There was a problem hiding this comment.
Pull request overview
This PR introduces a unified graceful shutdown ladder for aspire run child processes on Windows (and strengthens Unix behavior). Previously, Ctrl+C propagation relied on OS-level mechanisms like Ctrl+Break, which TypeScript guests under tsx swallow, leading to force-kills without proper resource drain. The new architecture replaces ad-hoc per-component teardown with a shared four-phase escalation: cooperative cancel → graceful signal (via DCP stop-process-tree) → bounded wait → force tree-kill → drain.
Changes:
- New shutdown infrastructure:
GracefulShutdownService(central CTS facade),ProcessGracefulShutdownLadder(four-phase static helper),IProcessTreeGracefulShutdownSignalerinterface,ConsoleCancellationManagerrefactored to three-stage user-signal ladder with configurable graceful budget. - Console-isolated child spawn:
IsolatedProcess(cross-platform,CREATE_NEW_CONSOLE+ anonymous pipes on Windows),IsolatedConsoleSpawner,WindowsConsoleProcessJob(kill-on-close + breakaway job object),IsolatedProcessExecution(IProcessExecution wrapper for the direct-launch path). Shared P/Invoke declarations consolidated inWindowsProcessInterop. - Session and launcher refactors:
AppHostServerSessionreplaced static factory with constructor+StartAsync()pattern (deletingIAppHostServerSession/IAppHostServerSessionFactoryinterfaces);ProcessGuestLauncherandGuestAppHostProjectwired throughGuestLaunchOptions;DotNetAppHostProjectextended with optional shutdown fields for the direct-launch path.
Note: The PR description references design identifiers (RequestShutdown, _ladderStarted, _userSignalCount, RequestShutdown_ThenSingleUserCancel_DoesNotCollapseGracefulWindow) that don't appear in the final code. The implementation evolved after the description was written.
Reviewed changes
Copilot reviewed 55 out of 55 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
src/Aspire.Cli/ConsoleCancellationManager.cs |
Refactored to three-stage user signal ladder with GracefulShutdownService dependency and configurable graceful budget |
src/Aspire.Cli/GracefulShutdownService.cs |
New — thin CTS facade for the central graceful-shutdown token |
src/Aspire.Cli/Processes/ProcessGracefulShutdownLadder.cs |
New — shared four-phase shutdown escalation (signal → wait → kill → drain) |
src/Aspire.Cli/Processes/IProcessTreeGracefulShutdownSignaler.cs |
New — interface for graceful process-tree shutdown signaling |
src/Aspire.Cli/Processes/DetachedAppHostShutdownService.cs |
Renamed from ProcessShutdownService, implements IProcessTreeGracefulShutdownSignaler |
src/Aspire.Cli/Processes/WindowsProcessInterop.cs |
New — consolidated P/Invoke declarations, BuildCommandLine, BuildEnvironmentBlock, SpawnConsoleIsolatedProcess |
src/Aspire.Cli/Processes/WindowsConsoleProcessJob.cs |
New — Windows job object wrapper (KILL_ON_JOB_CLOSE + BREAKAWAY_OK) |
src/Aspire.Cli/Processes/IsolatedProcess.cs |
New — cross-platform spawn abstraction with pumps and exit-code overrides |
src/Aspire.Cli/Processes/IsolatedProcess.Windows.cs |
New — Windows CreateProcessW with CREATE_NEW_CONSOLE + anonymous pipes |
src/Aspire.Cli/Processes/IsolatedProcess.Unix.cs |
New — Unix thin Process.Start wrapper |
src/Aspire.Cli/Processes/IsolatedConsoleSpawner.cs |
New — ProcessStartInfo → IsolatedProcessStartInfo translation |
src/Aspire.Cli/Processes/ProcessTerminator.cs |
Made ShutdownAsync(int pid, ...) properly async; shared termination primitives |
src/Aspire.Cli/Processes/ProcessPump.cs |
New — line-by-line pipe pump with exception recording |
src/Aspire.Cli/Processes/ProcessLifetimeAdapter.cs |
New — IAsyncDisposable wrapper for Process |
src/Aspire.Cli/Processes/DetachedProcessLauncher.Windows.cs |
Refactored to use shared WindowsProcessInterop |
src/Aspire.Cli/Projects/ProcessGuestLauncher.cs |
Major refactor: two spawn paths (isolated/inherited), GuestLaunchOptions, shared ladder |
src/Aspire.Cli/Projects/IGuestProcessLauncher.cs |
New GuestLaunchOptions record with optional shutdown services |
src/Aspire.Cli/Projects/AppHostServerSession.cs |
Refactored: constructor + StartAsync() replacing static factory; owns process lifetime |
src/Aspire.Cli/Projects/IAppHostServerSession.cs |
Deleted — interface removed |
src/Aspire.Cli/Projects/IAppHostServerProject.cs |
New AppHostServerRunResult record; Run method updated with isolation params |
src/Aspire.Cli/Projects/DotNetBasedAppHostServerProject.cs |
Updated Run signature |
src/Aspire.Cli/Projects/PrebuiltAppHostServer.cs |
Updated Run signature |
src/Aspire.Cli/Projects/DotNetAppHostProject.cs |
New optional shutdown fields; run path opts into ladder |
src/Aspire.Cli/Projects/GuestAppHostProject.cs |
Wires shutdown services; CAS-guarded backchannel fault escalation |
src/Aspire.Cli/Projects/GuestRuntime.cs |
Forwards GuestLaunchOptions |
src/Aspire.Cli/Projects/ExtensionGuestLauncher.cs |
Ignores options (intentional — extension hosts have their own lifecycle) |
src/Aspire.Cli/Program.cs |
DI wiring for new services; OCE catch in Main reads CCM exit code |
src/Aspire.Cli/Commands/RunCommand.cs |
Configures 5s graceful budget at command entry |
src/Aspire.Cli/DotNet/ProcessExecutionFactory.cs |
New isolated execution path via IsolatedProcessExecution |
src/Aspire.Cli/DotNet/ProcessExecution.cs |
OCE catch routes to ProcessGracefulShutdownLadder when signaler+service present |
src/Aspire.Cli/DotNet/IsolatedProcessExecution.cs |
New — IProcessExecution wrapper for IsolatedProcess |
src/Aspire.Cli/DotNet/DotNetCliRunner.cs |
CreateInstrumentedProcessOptions forwards all 4 new opt-in fields |
tests/Aspire.Cli.Tests/ConsoleCancellationManagerTests.cs |
Updated for three-stage ladder; new budget/drain tests |
tests/Aspire.Cli.Tests/GracefulShutdownServiceTests.cs |
New — token/expire/dispose coverage |
tests/Aspire.Cli.Tests/Projects/AppHostServerSessionTests.cs |
Significant new coverage for session lifecycle and graceful shutdown |
tests/Aspire.Cli.Tests/Projects/ProcessGuestLauncherTests.cs |
New — four test scenarios: force-kill, graceful success, escalation, signaler fault |
tests/Aspire.Cli.Tests/Projects/GuestAppHostProjectTests.cs |
Updated construction with shutdown services |
tests/Aspire.Cli.Tests/Processes/IsolatedProcessTests.cs |
New — spawn, pump, exit-code, kill coverage |
tests/Aspire.Cli.Tests/Processes/WindowsConsoleProcessJobTests.cs |
New — job object lifecycle tests |
tests/Aspire.Cli.Tests/Processes/DetachedProcessLauncherTests.cs |
New — detached launch tests |
tests/Aspire.Cli.Tests/Processes/DetachedAppHostShutdownServiceTests.cs |
Renamed from ProcessShutdownServiceTests |
tests/Aspire.Cli.Tests/Utils/CliTestHelper.cs |
DI registrations updated to mirror production wiring |
tests/Aspire.Cli.Tests/TestServices/TestAppHostServerSessionFactory.cs |
Deleted — no longer needed after interface removal |
tests/Aspire.Cli.Tests/TestServices/FakeFailingAppHostServerProject.cs |
Updated Run signature |
tests/Aspire.Cli.Tests/Scaffolding/ChannelReseedTests.cs |
Updated Run signature in test double |
tests/Aspire.Cli.Tests/Telemetry/TelemetryConfigurationTests.cs |
Updated CCM construction |
The merge commit captured an unstaged version of GuestAppHostProject.cs that still referenced main's old 'appHostServerProcess' variable and kept a duplicate eager backchannel start. Adopt main's deferred backchannel pattern for the publish path: remove the eager start and use serverSession in the after-launch callback. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
|
Re-running the failed jobs in the CI workflow for this pull request because 1 job was identified as retry-safe transient failures in the CI run attempt.
|
- IsolatedConsoleSpawner: replace overlay-without-clear with full Clear()+ copy so callers that did startInfo.Environment.Remove(key) see the removal honored on the isolated path. Without this, a caller like PrebuiltAppHostServer.CreateStartInfo that defensively removes IntegrationLibsPath/IntegrationProbeManifestPath would silently re- inherit those vars from the parent CLI's env block. Matches the Unix partial's Environment.Clear()+add semantics. - AppHostServerSession.StartAsync: move the failed-lifetime DisposeAsync out of the lock(_startGate) block. The catch path was running a sync- over-async disposal while holding the start gate, which pinned the gate for whatever IsolatedProcess.DisposeAsync took (pipe pump drain etc.) and queued any concurrent DisposeAsync behind that wait. Capture the failed lifetime and dispose after lock release; rethrow via ExceptionDispatchInfo. - ProcessGracefulShutdownLadder: drop the TryGetStartTime() helper. The sole call site hardcodes includeStartTimeForDcp: false, so the value was always computed and discarded. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
❓ CLI E2E Tests unknown — 113 passed, 0 failed, 2 unknown (commit View all recordings
📹 Recordings uploaded automatically from CI run #27316663922 |
…shutdown # Conflicts: # src/Aspire.Cli/Projects/DotNetAppHostProject.cs # src/Aspire.Cli/Projects/GuestAppHostProject.cs # src/Aspire.Cli/Projects/IAppHostServerSession.cs # tests/Aspire.Cli.Tests/Projects/AppHostServerSessionTests.cs # tests/Aspire.Cli.Tests/TestServices/TestAppHostServerSessionFactory.cs
…nto danegsta/windows-tsx-shutdown
adamint
left a comment
There was a problem hiding this comment.
(automated review)
I think there are a few shutdown edge cases that need to be fixed before this can merge:
-
StartBackchannelConnectionAsynccatchesOperationCanceledExceptionfromConnectAsync/Task.Delaywhen Ctrl+C cancelsappHostSystemToken, then faults the backchannel TCS. That continuation setsinternalFaultCode = FailedToDotnetRunAppHost, so the outer cancellation path returns a run failure instead of the normal cancelled exit code. Could you handle cancellation separately and avoid turning user cancellation into a backchannel failure? -
ProcessTreeGracefulShutdownService.ForceKillRemainingProcessesuseskillEntireProcessTree = !OperatingSystem.IsWindows(). Ifaspire stop's graceful DCP stop fails or times out on Windows, the fallback only kills the root AppHost/CLI process and can leave descendants like DCP/tsx/node running. Once graceful shutdown has failed, I think the escalation should tree-kill on Windows too. -
AppHostServerSession.DisposeAsyncdisposes_startGate, but a second dispose calls_startGate.WaitAsync()before checking_disposed, which will throw after the first disposal. Can we make the idempotent check happen before touching the semaphore, or just not dispose the semaphore?
DisposeAsync is not invoked by the finalizer, so update the stale 'let the GC handle cleanup' note to explain that the Process native handles are reclaimed by SafeHandle finalizers instead. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ProcessExecutionFactory's replace path called Environment.Clear(), which lazily seeded ~50-100 parent env entries only to discard them immediately. Add IsolatedProcessStartInfo.UseEmptyEnvironment() to initialize the env block empty (skipping the parent snapshot) and mark HasCustomEnvironment, and route the replace path through it. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace 'if (Debugger.IsAttached) return;' early exits with Assert.Skip so the tests surface as skipped (not silently passed) when their CancelAfter-based timing is disabled under a debugger. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The run and publish paths each slept a fixed 500ms after starting the AppHost server, then checked serverCompletion.IsCompleted to detect an early crash. This was both wasteful (every healthy startup paid the full delay) and fragile (a slow-spawning server could still race the check). The delay+precheck is redundant: GetRpcClientAsync already races the RPC connect against the server-exit signal and surfaces an early crash immediately, and the surrounding catch blocks display the same output + "App host exited unexpectedly." error and return the same exit code. On the publish path the catch additionally faults the backchannel waiter, which the precheck never did, so relying on it is strictly better. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The ProcessGuestLauncher constructor previously defaulted fileLoggerProvider, commandResolver, and processExecutionFactory purely to avoid updating call sites. Make them required and push the defaults (the NullLogger-backed execution factory and the PATH command resolver) to the callers that actually own those choices. Likewise, the GuestLaunchOptions argument added to IGuestProcessLauncher.LaunchAsync in this PR (and the pre-existing afterLaunchAsync) were optional only for call-site brevity. Make them non-optional and move cancellationToken to the end, matching the convention used elsewhere in the PR. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Audit pass over arguments added in this PR: make a new optional param required when no caller relies on its default. - IAppHostServerProject.RunAsync: env/additionalArgs/debug/runControl all required (sole caller passes them); update doc + 2 impls + 2 fakes. - GuestAppHostProject.ExecuteGuestAppHostAsync: appHostLaunchOptions required and cancellationToken moved last (sole caller always passes it). - IsolatedProcess.Kill(bool): drop the default to match the non-optional sibling ProcessExecution.Kill(bool) (zero callers relied on it). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…spose idempotency Address three review findings: - GuestAppHostProject: Ctrl+C during backchannel connect faulted the completion source, tripping the IsFaulted continuation that sets FailedToDotnetRunAppHost. Guard cancellation so user cancellation surfaces as Cancelled (130) instead of a run failure. - ProcessTreeGracefulShutdownService: the force-kill fallback only tree- killed on Unix. Once graceful shutdown has failed/timed out, tree-kill on Windows too so orphaned descendants (DCP, tsx/node, guest) are not left running. The success-path mop-up still avoids tree-kill. - AppHostServerSession: DisposeAsync disposed _startGate, so a second DisposeAsync threw ObjectDisposedException from WaitAsync before the idempotency guard could observe _disposed. Stop disposing the semaphore (AvailableWaitHandle is never used). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…shutdown # Conflicts: # src/Aspire.Cli/Commands/BaseCommand.cs # src/Aspire.Cli/Projects/GuestAppHostProject.cs # src/Aspire.Cli/Projects/GuestRuntime.cs # src/Aspire.Cli/Projects/ProcessGuestLauncher.cs # tests/Aspire.Cli.Tests/Projects/GuestAppHostProjectTests.cs # tests/Aspire.Cli.Tests/Projects/GuestRuntimeTests.cs # tests/Aspire.Cli.Tests/Scaffolding/ChannelReseedTests.cs # tests/Aspire.Cli.Tests/Utils/CliTestHelper.cs
Tests selector (audit mode)The full test matrix and all jobs still run in audit mode. The tests and jobs below are what selective CI would run under enforcement. 42 / 97 test projects · 5 jobs, from 72 changed files. Selected test projects (42 / 97)
Selected jobs (5)
How these were chosen — grouped by what changed
🧪 show 40
📦 affected project 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 🧪 Job reasons
Selection computed for commit |
Pushed fixes for all these. |
PR Testing ReportPR Information
Artifact Version Verification
Changes AnalyzedFiles Changed
Change Categories
Test Scenarios ExecutedScenario 1: Dogfood artifact verificationObjective: Verify the tested CLI artifact contains the PR's latest changes. Steps:
Evidence:
Observations:
Scenario 2: .NET AppHost aspire run Ctrl+C graceful shutdownObjective: Validate a fresh .NET starter AppHost launched by aspire run exits gracefully when Ctrl+C is delivered to the CLI's console group. Steps:
Evidence:
Observations:
Scenario 3: TypeScript/polyglot AppHost aspire run Ctrl+C graceful shutdownObjective: Validate a fresh TypeScript + C# starter AppHost launched by aspire run exits gracefully after Ctrl+C, including the Vite/frontend resource path affected by the process-tree shutdown changes. Steps:
Evidence:
Observations:
Scenario 4: Detached aspire start / aspire stop lifecycleObjective: Validate detached AppHost launch and shutdown still work through the shared process execution and graceful shutdown path. Steps:
Evidence:
Observations:
Scenario 5: Standalone aspire dashboard run launch and Ctrl+C shutdownObjective: Validate dashboard run still launches through the updated process execution path and exits cleanly on Ctrl+C. Steps:
Evidence:
Observations:
Scenario 6: Invalid --apphost pathObjective: Verify a user-facing CLI command fails safely when given a non-existent AppHost path. Steps:
Evidence:
Expected Unhappy-Path Outcome: A clear validation error, non-zero exit code, and no orphaned processes. Observations:
Summary
Overall ResultPR VERIFIED The PR dogfood CLI at 13.5.0-pr.17814.ga221fa94 successfully handled Ctrl+C graceful shutdown for .NET, TypeScript/polyglot, and standalone dashboard-run scenarios on Windows, and detached aspire start / aspire stop remained functional. Artifacts
|
|
Also verified working on MacOS. |
- Expand aspire-run.mdx 'Stopping the AppHost' section with the full three-step shutdown ladder (cooperative cancellation → graceful wait → automatic force-kill) and clarify the second Ctrl+C behavior. - Add a Windows note about isolated console session for tsx/npm AppHosts. - Correct aspire-stop.mdx description: signal targets the AppHost process directly, not an intermediary CLI process. Source: microsoft/aspire#17814 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Pull request created: #1291
|
Description
This PR unifies graceful shutdown for long-running
aspire runchild processes so AppHosts and guest/polyglot AppHosts follow one shutdown ladder instead of each owner having ad-hoc cancellation and teardown behavior.The shared ladder is:
The main motivation is Windows: TypeScript/JavaScript guest AppHosts often run through wrappers such as
tsx/npm.cmd, and relying on inherited console Ctrl+C/Ctrl+Break propagation is unreliable. The PR now launchesaspire runchildren in an isolated console on Windows and requests graceful shutdown through DCPstop-process-tree, which performs the AttachConsole +GenerateConsoleCtrlEvent(CTRL_C_EVENT)dance against the AppHost/process tree without also signaling the CLI.The final Windows issue found during smoke testing was inherited
CTRL+Csuppression. If the CLI is launched as a descendant of aCREATE_NEW_PROCESS_GROUProot, Windows propagates the process-level "ignore CTRL+C" attribute to child processes, causing DCP's generatedCTRL_C_EVENTto succeed but be silently dropped before it reaches the AppHost.Program.Mainnow callsSetConsoleCtrlHandler(NULL, FALSE)on Windows before registering signal handlers or launching children, so the AppHost and DCP-launched services inherit CTRL+C-enabled state.On macOS/Linux the same ladder is used, but the graceful signal is SIGTERM via
ProcessSignalerrather than DCP console-control events.aspire stopalso targets the AppHost PID directly with SIGTERM on Unix to avoid launcher/dotnet runprocess-tree races.Approach
The shutdown machinery is split into focused services:
ConsoleCancellationManager— process-level signal handler. First signal requests cooperative cancellation and starts the graceful/drain watcher; second signal expires the graceful window; third signal or drain timeout completes the final termination source. Handles SIGINT, SIGTERM, Windows SIGQUIT/CTRL_BREAK,Console.CancelKeyPress, andProcessExitthrough the same path.GracefulShutdownService— owns the shared graceful-window token consumed by process ladders.IsolatedConsoleSpawner/IsolatedProcess/IsolatedProcessExecution— WindowsCREATE_NEW_CONSOLEchild launch foraspire run, with stdout/stderr pumping and integration into the existingIProcessExecutionshape. Unix stays a thin process wrapper.WindowsConsoleProcessJob— Windows kill-on-close safety net for isolated children, withBREAKAWAY_OKso DCP can outlive the CLI long enough to perform cleanup.ProcessGracefulShutdownLadder— shared graceful signal -> bounded wait -> force tree-kill -> drain implementation used byAppHostServerSession,ProcessGuestLauncher, and direct .NET AppHost execution.DetachedAppHostShutdownService— renamed fromProcessShutdownService; implementsIProcessTreeGracefulShutdownSignaler. On Windows it shells out to DCPstop-process-tree; on Unix it sends SIGTERM.ProcessTerminator,ProcessLifetimeAdapter, andProcessPump— shared lower-level process lifetime helpers.AppHostServerSession,ProcessGuestLauncher, and the direct.NETAppHost path now all consume the same graceful ladder. Short-lived RPC-style paths (publish,sdk generate,sdk dump, scaffolding, build/restore/package-add calls) intentionally remain on the inherited-console/default cancellation behavior.Things worth a careful look
stop-process-treecan deliver the signal quickly but then block until the target exits. The ladder invokes the signaler in parallel withWaitForExitAsync(gracefulToken)so DCP's wait cannot consume the entire graceful budget before the child gets its own exit wait.SetConsoleCtrlHandler(NULL, FALSE)is called once at CLI startup to clear any inherited "ignore CTRL+C" state before child processes are spawned.DotNetAppHostProject.RunAsyncnow opts into the same graceful infrastructure viaProcessInvocationOptions(IsolateConsole,ConsoleProcessJob,GracefulShutdownSignaler,ShutdownService). Non-rundotnetinvocations leave those unset.GuestAppHostProject.RunAsyncuses an interlocked gate so the backchannel-fault continuation and normalfinallycleanup cannot both own shutdown.Kill(entireProcessTree: true)for the root process to avoid wrapper/orphan cases.Validation
Windows smoke testing with a local build of the CLI and default
aspire-starterAppHost:CREATE_NEW_CONSOLE) — Signaler attaches to the CLI console and sendsCTRL_C_EVENT; CLI exits gracefully in ~650-700ms, all AppHost sentinels (started,run-returned,finally) are written, and there are zero survivor descendants.CREATE_NEW_PROCESS_GROUPgrandparent regression — before theSetConsoleCtrlHandler(NULL, FALSE)fix, DCP fell back to its ~6s kill path and only thestartedsentinel was written. After the fix, the same case exits gracefully in ~650-700ms with all sentinels and zero survivors.Earlier macOS/Linux smoke testing confirmed the Unix SIGTERM path exercises the same ladder: graceful-respecting AppHosts exit cleanly, stalled AppHosts escalate at the graceful deadline, and double interrupt collapses the graceful window early.
Test impact
GracefulShutdownService,ConsoleCancellationManager,ProcessGracefulShutdownLadderbehavior throughProcessGuestLauncher,AppHostServerSession,ProcessExecution,IsolatedProcess,WindowsConsoleProcessJob, andDetachedAppHostShutdownService.ConsoleCancellationManagerTests,ProcessGuestLauncherTests,ProcessExecutionTests, andAppHostServerSessionTests— 45/45 passed.Fixes #17638
Checklist
<remarks />and<code />elements on your triple slash comments?