diff --git a/AGENTS.md b/AGENTS.md index 6338b9a..8beddf3 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -50,7 +50,7 @@ 5. **Mocking Interactivity**: When testing commands that branch on `stdin.isatty()`, use the `is_stdin_tty` helper in `execution.py` and mock it via `mocker.patch("colab_cli.commands.execution.is_stdin_tty", return_value=...)`. This ensures tests don't hang in CI/agent environments. 6. **State Isolation**: Always patch the `colab_cli.common.state` singleton in tests to control session persistence and client behavior. Refer to `tests/conftest.py` for the standard global fixture. 7. **Fire-and-Forget Architecture**: The Colab CLI is a "fire-and-forget" tool. Avoid using background threads for long-running tasks within the main command flows. For persistent needs such as keep-alive, utilize detached background daemon processes (with PID tracking in the session state). -8. **Verify the Local Install**: A globally-installed `colab` may exist on `PATH` (e.g. at `~/.local/bin/colab`) and can shadow the project's editable install when `uv run` is invoked from outside the repo. ALWAYS run shell commands with the repo as the working directory (e.g. via the `workdir` parameter, never `cd && cmd`) so `uv run colab ...` resolves to `.venv/bin/colab`. Confirm with `which colab` and `uv run which colab` if a CLI test produces unexpected results (e.g. flag-not-recognized errors for flags you just added). +8. **Verify the Local Install**: A globally-installed `colab` may exist on `PATH` (e.g. at `~/.local/bin/colab`) and can shadow the project's editable install when `uv run` is invoked from outside the repo. ALWAYS run shell commands with the repo as the working directory (e.g. via the `workdir` parameter, never `cd && cmd`) so `uv run colab ...` resolves to `.venv/bin/colab`. Confirm with `which colab` and `uv run which colab` if a CLI test produces unexpected results (e.g. flag-not-recognized errors for flags you just added). 
**Shebang invocations always resolve via `$PATH`**, so a script like `#!/usr/bin/env -S colab run ...` will pick up the stale global tool even when the editable install is current — when testing shebang-based behavior after a code change, always run `uv tool install --reinstall --force --from . colab` first, then verify with `colab version` (the version string includes the git short SHA). Encoded 2026-05-12 after the SystemExit-suppression fix appeared not to work in `examples/hello_colab.py` because the shebang resolved to a uv-tool install pinned to the prior commit. 9. **Isolate the Regression First**: When a user reports an error in code you just touched, do NOT assume your change caused it. First, reproduce the failure on `main` (or the branch point) to determine whether the bug is pre-existing. Only after confirming the regression is yours should you start debugging the new code. Encoded after spending a turn debugging "ADC broke `colab new`" only to discover `colab new --gpu A100` was already failing on `main` due to an A100-quota-vs-default issue unrelated to ADC. 10. **Live Probes Allocate Real Resources**: Probing the Colab API to debug an issue creates real, billable assignments — every successful POST `/tun/m/assign` reserves a VM. Prefer GET-only (read) probes whenever possible. For any state-mutating call, (a) record every endpoint you create as you go, and (b) clean up via `client.unassign(endpoint)` (or `colab stop`) before declaring the investigation done. Then verify with `colab sessions` that nothing was orphaned. 11. **Push Freshness**: The remote may have advanced during a session (other contributors land commits while you work). ALWAYS `git fetch ` immediately before pushing or merging. If `git log main../main` is non-empty, reset local `main` to the remote, rebase feature branches onto it, retest, then push. NEVER force-push `main` to recover from divergence. @@ -65,6 +65,7 @@ 20. 
**Suggest the branch-diff review command after committing**: The user reviews changes with `git diff main..` (full cumulative diff against `main`, not just the latest commit). After landing one or more commits on a feature branch, ALWAYS suggest the exact command — e.g. "Review with `git diff main..sort-help-commands`" — instead of `git show ` (which only shows a single commit and misses context when a branch has multiple commits). Encoded 2026-05-05 after suggesting `git show 9d9c7da` for a branch the user wanted to review holistically. 21. **Verify research-tool claims with primary sources**: Research tools and AI assistants can be confidently wrong, especially about edge cases or features outside their training corpus. When such a tool says "X is not used / not parsed / doesn't exist", treat it as a hypothesis to verify, not a fact. Always cross-check against the primary source (the actual code or config) — and when a tool names files, check whether the indirection chain it describes actually exists. The cost of believing the tool when it's wrong is shipping a non-functional feature; the cost of double-checking is small. Encoded 2026-05-05 after a `colab url` first-cut shipped the wrong URL format because of unverified output. 22. **Clean up orphaned assignments before finishing live tests**: After running a live integration test, `colab sessions` may show server-side assignments that the local `colab stop` couldn't see (e.g. assignments leaked from earlier in the session, or from crashed prior runs). Always run `colab sessions` as the final cleanup step, and for any `[?]`-marked orphan, run `python -c "from colab_cli.common import state; state.client.unassign('')"` (the CLI doesn't expose a direct unassign-by-endpoint command). Re-verify with `colab sessions` returning "No active sessions". Encoded 2026-05-05 after the first `colab url` live test left an orphan from a prior conversation turn that would have idled-out and billed compute units. +23. 
**Forward unknown args through Typer with `context_settings`**: Typer/Click consumes any token starting with `-`/`--` as a flag of the parent command unless told otherwise. For a subcommand like `colab run script.py --some-script-flag` (where `--some-script-flag` belongs to the user's script, not to `colab`), declare it with `app.command(name="run", context_settings={"allow_extra_args": True, "ignore_unknown_options": True})` and accept the positional with `Annotated[Optional[List[str]], typer.Argument(...)] = None`. Also use `repr()` (not f-string interpolation) when embedding those forwarded strings into kernel-side Python source — `repr()` produces a safe round-trippable literal regardless of the user's shell-passed quotes, backslashes, or non-ASCII bytes. Encoded 2026-05-12 while implementing `colab run`. ## Agent Execution Limitations (What I Can vs Cannot Run) As an AI agent operating via non-interactive shell tools (`run_shell_command`), there are strict limits on what I can test autonomously without human intervention: diff --git a/docs/05_run_command.md b/docs/05_run_command.md new file mode 100644 index 0000000..8e987e8 --- /dev/null +++ b/docs/05_run_command.md @@ -0,0 +1,90 @@ +--- +log: +2026-05-12: Initial design and implementation of `colab run [args...]`. Combines `colab new` + `colab exec` + `colab stop` into a single fire-and-forget invocation so a Python file can use `#!/usr/bin/env -S colab run` as a shebang line and execute on a freshly-allocated Colab VM. Adds `--keep` (skip auto-stop), `--gpu` / `--tpu` (passthrough to session creation), `-s/--session` (name the ephemeral session), and propagates the script's exit status (non-zero on any uncaught exception in the kernel). The script's `sys.argv` is re-set inside the kernel to mirror native `python script.py arg1 arg2` semantics, and `__name__` is set to `"__main__"`. +2026-05-12: Native CPython exit-code semantics for `sys.exit()` / `raise SystemExit(...)` from the script body. 
The Colab kernel reports a `SystemExit` as `output_type=='error'`, which under the previous logic would have (a) printed the IPython traceback (`An exception has occurred, use %tb...`) and (b) flagged the run as a failure regardless of the integer exit code. Now: `sys.exit()` / `sys.exit(0)` exit 0 silently; `sys.exit(N)` exits N; `sys.exit('msg')` exits 1 (matching CPython). The IPython "To exit: use 'exit', 'quit', or Ctrl-D." UserWarning is filtered via the prelude. Encoded after running `examples/gpu_hello.py` end-to-end and seeing the noisy `SystemExit: 0` traceback at the end of an otherwise-successful GPU run. +--- + +# Design: `colab run` — Shebang-Compatible One-Shot Execution + +## Motivation +Inspired by the `llm` shebang pattern (https://til.simonwillison.net/llms/llm-shebang), users should be able to write a single self-contained Python file with a shebang line that: + +1. Allocates a Colab VM according to user-supplied flags (CPU / GPU / TPU). +2. Executes the body of the file on that VM. +3. Tears the VM down when execution finishes — UNLESS told otherwise. + +This is the natural ergonomic top-end of `colab-cli`: no boilerplate, no stale sessions, a single file is the unit of work. + +## User Surface + +``` +colab run [OPTIONS] SCRIPT [SCRIPT_ARGS]... +``` + +| Flag | Type | Default | Purpose | +|---|---|---|---| +| `SCRIPT` | positional | — | Local path to a `.py` file. Required. | +| `SCRIPT_ARGS` | variadic | — | Extra args forwarded to the script as `sys.argv[1:]`. | +| `-s`, `--session` | str | auto | Name the ephemeral session (helpful with `--keep`). Auto-generated as `run-<6 hex>` if omitted. | +| `--gpu` | str | None | Same set as `colab new --gpu` (T4, L4, G4, H100, A100). | +| `--tpu` | str | None | Same set as `colab new --tpu` (v5e1, v6e1). | +| `--keep` | bool | False | Do **not** stop the session after the script finishes. 
| + +### Shebang usage +With `--gpu` baked into the shebang line, an entire one-file workload becomes: + +```python +#!/usr/bin/env -S colab run --gpu T4 +import torch +print(torch.cuda.get_device_name(0)) +``` + +After `chmod +x`, running `./script.py` is a single-step "rent a GPU, run, return". + +> The `-S` flag of `env` is necessary on Linux/macOS to allow multiple words after `colab run` in a shebang line; without it the kernel passes the whole tail as one argument. + +## Behavior + +1. **Allocate**: Creates a fresh session (mirrors `colab new` end-to-end: `assign` → keep-alive pre-flight → persist `SessionState` → spawn keep-alive daemon). Session name defaults to `run-<6 hex>`. +2. **Execute**: Reads the script file. Prepends a deterministic prelude that re-sets `sys.argv` and `__name__` so the script body sees the same execution context as `python script.py arg1 arg2`: + ```python + import sys + sys.argv = ['', 'arg1', 'arg2', ...] + __name__ = '__main__' + ``` + Then executes the script body in the same kernel cell so any `if __name__ == "__main__":` guard fires. +3. **Detect failure**: If the kernel returns any output of `output_type == "error"` (uncaught exception, syntax error, etc.) the CLI exits non-zero. The one exception is `SystemExit`, which is mapped to its CPython exit code: `sys.exit()` / `sys.exit(0)` exit 0, `sys.exit(N)` exits N, and `sys.exit('msg')` exits 1. +4. **Tear down**: In a `finally` block, unless `--keep` was passed, the CLI: + - Sends `runtime.stop(shutdown_kernel=True)` (best-effort). + - Calls `state.client.unassign(endpoint)` to free the billable VM. + - Removes the session from `StateStore`. + - Kills the keep-alive daemon (`kill_process(s.keep_alive_pid)`). + - Logs `session_terminated` with `reason="run_completed"` (or `"run_failed"`). + +If `--keep` is set, the session remains visible in `colab sessions` and `colab status` and can be reused with `colab exec -s `, `colab repl -s `, etc., until the user runs `colab stop` (or the keep-alive daemon hits its 24h cap). 
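The prelude and exit-code handling described in the Behavior section can be sketched as below. This is a simplified illustration under stated assumptions, not the CLI's code: `build_prelude` and `exit_code_for` are hypothetical names, and kernel outputs are assumed to be dictionaries carrying `output_type`, `ename`, and `evalue` keys. The `SystemExit` mapping follows the second log entry in the front matter.

```python
import ast


def build_prelude(script_basename, script_args):
    """Return Python source that sets sys.argv and __name__ (hypothetical helper)."""
    # repr() yields a valid Python literal for any str, so quotes and
    # backslashes in forwarded arguments cannot break the generated source.
    argv_literal = "[" + ", ".join(repr(a) for a in [script_basename, *script_args]) + "]"
    return (
        "import sys\n"
        f"sys.argv = {argv_literal}\n"
        "__name__ = '__main__'\n"
    )


def exit_code_for(outputs):
    """Map kernel outputs to a process exit code (hypothetical helper)."""
    code = 0
    for out in outputs:
        if out.get("output_type") != "error":
            continue
        if out.get("ename") != "SystemExit":
            return 1  # any other uncaught exception is a plain failure
        evalue = (out.get("evalue") or "").strip()
        if evalue in ("", "None", "0"):
            continue  # sys.exit() / sys.exit(0): success
        try:
            code = int(evalue)
        except ValueError:
            code = 1  # sys.exit('msg') exits 1 under CPython
    return code


if __name__ == "__main__":
    prelude = build_prelude("script.py", ["it's", 'say "hi"', "a\\b"])
    print(prelude)
    # Round-trip check: the argv line parses back to the original strings.
    argv_line = prelude.splitlines()[1].split(" = ", 1)[1]
    print(ast.literal_eval(argv_line))
```

Building the list with `repr()` rather than string interpolation means a forwarded argument containing a quote or backslash still parses back to the same string on the kernel side.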
+ +## AGENTS.md Constraints Honoured +- **Item 7 (no background threads)**: The keep-alive daemon is the existing detached process from `colab new`; this command introduces no new threads. +- **Item 10 (live probes allocate real resources)**: The teardown is in a `try/finally` so an exception during execution still releases the VM. Tests assert `unassign` is called even when the script errors. +- **Item 16 (daemon flag propagation)**: Reuses `spawn_keep_alive(...)` which already propagates `--auth` and `--config`. +- **Item 17 (persist-before-spawn)**: Uses the same persist-before-spawn pattern as `colab new`. + +## Testing Strategy (TDD) + +### Unit tests (`tests/test_run.py`) +1. **`test_run_basic_flow`** — Happy path: create session, execute script, unassign on exit. Mocks `client.assign`, `client.unassign`, `ColabRuntime`. Asserts unassign is called. +2. **`test_run_keep_skips_unassign`** — With `--keep`, `unassign` is NOT called and the session remains in the store. +3. **`test_run_passes_argv`** — `colab run script.py a b c` results in a kernel `execute_code` call whose payload contains `sys.argv = ['script.py', 'a', 'b', 'c']`. +4. **`test_run_sets_dunder_main`** — The execute payload contains `__name__ = '__main__'`. +5. **`test_run_propagates_error_exit_code`** — When `runtime.execute_code` returns an output of `output_type == "error"`, the CLI exits non-zero AND still calls `unassign`. +6. **`test_run_with_gpu_flag`** — `colab run --gpu T4 script.py` calls `client.assign(..., variant=GPU, accelerator=T4)`. +7. **`test_run_missing_script_errors`** — `colab run` with no script path errors out (Typer-level). +8. **`test_run_nonexistent_script_errors_before_assign`** — `colab run does-not-exist.py` MUST exit non-zero **without** calling `client.assign` so users don't burn a VM on a typo. +9. **`test_run_unassign_called_on_exception_during_execute`** — If `runtime.execute_code` raises, unassign is still called (try/finally guarantee). 
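The try/finally guarantee that tests 2 and 9 assert can be illustrated in isolation. The sketch below is not taken from `tests/test_run.py`: `run_once` is a hypothetical stand-in for the run flow, and the mocks play the roles of `runtime.execute_code` and `client.unassign`.

```python
from unittest.mock import MagicMock


def run_once(execute, unassign, keep=False):
    """Stand-in for the run flow: execute, then always release unless keep."""
    try:
        return execute()
    finally:
        # Runs on success and on exception alike, so the VM is never leaked.
        if not keep:
            unassign()


def check_unassign_on_failure():
    execute = MagicMock(side_effect=RuntimeError("transport failure"))
    unassign = MagicMock()
    try:
        run_once(execute, unassign)
    except RuntimeError:
        pass  # the error still propagates; only cleanup is under test
    unassign.assert_called_once()


def check_keep_skips_unassign():
    execute = MagicMock(return_value=[])
    unassign = MagicMock()
    run_once(execute, unassign, keep=True)
    unassign.assert_not_called()


if __name__ == "__main__":
    check_unassign_on_failure()
    check_keep_skips_unassign()
    print("ok")
```

The same shape carries over to the real tests: make the mocked execute call raise, invoke the command, and assert on the mocked `unassign`.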
+ +### Integration test (`integration/repro_run_command/test.sh`) +- Write a tiny script that prints its argv and exits 0. +- Run `colab run /tmp/script.py hello world`. +- Assert stdout contains `argv=['script.py', 'hello', 'world']`. +- Assert `colab sessions` returns "No active sessions" afterward (cleanup happened). +- Repeat with `--keep`: assert the session shows up in `colab sessions`, then call `colab stop` to clean up. diff --git a/integration/repro_run_command/test.sh b/integration/repro_run_command/test.sh new file mode 100644 index 0000000..7cb1c95 --- /dev/null +++ b/integration/repro_run_command/test.sh @@ -0,0 +1,132 @@ +#!/bin/bash +# Copyright 2026 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# Integration Test: `colab run [args...]` +# +# Verifies the shebang-friendly one-shot execution flow: +# 1. `colab run` allocates a CPU VM, runs the script, releases the VM. +# 2. `sys.argv` and `__name__ == "__main__"` are honored. +# 3. After the run finishes, no orphan VMs remain. +# 4. `colab run --keep` leaves the session alive; `colab stop` clears it. + +# Don't `set -e` so we can capture failures and clean up explicitly. 
+ +# ---------- Auth detection (mirrors integration/repro_keep_alive/test.sh) ---- +if [ -f "$HOME/.config/colab-cli/token.json" ]; then + AUTH_FLAGS="--auth=oauth2" +elif command -v gcloud > /dev/null && gcloud auth application-default print-access-token > /dev/null 2>&1; then + ADC_TOKEN=$(gcloud auth application-default print-access-token 2>/dev/null) + ADC_SCOPES=$(curl -s "https://www.googleapis.com/oauth2/v3/tokeninfo?access_token=$ADC_TOKEN" | python3 -c "import json,sys; print(json.load(sys.stdin).get('scope',''))" 2>/dev/null) + if echo "$ADC_SCOPES" | grep -q "colaboratory" && echo "$ADC_SCOPES" | grep -q "userinfo.email"; then + AUTH_FLAGS="--auth=adc" + else + echo "Error: ADC token lacks the required scopes (colaboratory + userinfo.email)." + echo "Re-issue ADC creds with all required scopes:" + echo " gcloud auth application-default login \\" + echo " --scopes=openid,\\" + echo " https://www.googleapis.com/auth/cloud-platform,\\" + echo " https://www.googleapis.com/auth/userinfo.email,\\" + echo " https://www.googleapis.com/auth/colaboratory" + exit 1 + fi +else + echo "Error: No usable auth provider found." + exit 1 +fi +echo "[*] Using $AUTH_FLAGS" + +# ---------- Isolated session state ------------------------------------------- +TMP_DIR=$(mktemp -d) +SESSION_FILE="$TMP_DIR/sessions.json" +SCRIPT_PATH="$TMP_DIR/script.py" +KEEP_SESSION_NAME="repro-run-keep-$(date +%s)" + +cleanup() { + echo "[*] Cleaning up..." + uv run colab $AUTH_FLAGS --config "$SESSION_FILE" stop -s "$KEEP_SESSION_NAME" 2>/dev/null || true + rm -rf "$TMP_DIR" +} +trap cleanup EXIT + +cat > "$SCRIPT_PATH" <<'PYEOF' +import sys +print(f"argv={sys.argv}") +print(f"is_main={__name__ == '__main__'}") +PYEOF +SCRIPT_BASENAME=$(basename "$SCRIPT_PATH") + +# ---------- Phase 1: basic run + auto-cleanup -------------------------------- +echo "[*] Phase 1: colab run hello world" +OUTPUT=$(uv run colab $AUTH_FLAGS --config "$SESSION_FILE" run "$SCRIPT_PATH" hello world 2>&1) +RC=$? 
+echo "$OUTPUT" + +if [ $RC -ne 0 ]; then + echo "[FAILURE] colab run exited $RC" + exit 1 +fi +if ! echo "$OUTPUT" | grep -q "argv=\['$SCRIPT_BASENAME', 'hello', 'world'\]"; then + echo "[FAILURE] argv was not propagated as expected." + echo " Wanted substring: argv=['$SCRIPT_BASENAME', 'hello', 'world']" + exit 1 +fi +if ! echo "$OUTPUT" | grep -q "is_main=True"; then + echo "[FAILURE] __name__ was not set to '__main__'." + exit 1 +fi + +# Verify cleanup actually happened — no orphan assignments remain. +SESSIONS_OUT=$(uv run colab $AUTH_FLAGS --config "$SESSION_FILE" sessions 2>&1) +echo "$SESSIONS_OUT" +if ! echo "$SESSIONS_OUT" | grep -q "No active sessions found on server."; then + echo "[FAILURE] After auto-cleanup, server still reports active sessions." + echo " (Possible orphan VM — investigate.)" + exit 1 +fi +echo "[SUCCESS] Phase 1 passed: argv passthrough, __main__, and auto-cleanup." + +# ---------- Phase 2: --keep leaves the session alive ------------------------- +echo "" +echo "[*] Phase 2: colab run --keep -s $KEEP_SESSION_NAME " +OUTPUT=$(uv run colab $AUTH_FLAGS --config "$SESSION_FILE" run --keep -s "$KEEP_SESSION_NAME" "$SCRIPT_PATH" keep_arg 2>&1) +RC=$? +echo "$OUTPUT" + +if [ $RC -ne 0 ]; then + echo "[FAILURE] colab run --keep exited $RC" + exit 1 +fi +if ! echo "$OUTPUT" | grep -q "argv=\['$SCRIPT_BASENAME', 'keep_arg'\]"; then + echo "[FAILURE] --keep run did not produce the expected argv output." + exit 1 +fi + +SESSIONS_OUT=$(uv run colab $AUTH_FLAGS --config "$SESSION_FILE" sessions 2>&1) +echo "$SESSIONS_OUT" +if ! echo "$SESSIONS_OUT" | grep -q "\[$KEEP_SESSION_NAME\]"; then + echo "[FAILURE] --keep session $KEEP_SESSION_NAME not found in colab sessions." + exit 1 +fi + +uv run colab $AUTH_FLAGS --config "$SESSION_FILE" stop -s "$KEEP_SESSION_NAME" +SESSIONS_OUT=$(uv run colab $AUTH_FLAGS --config "$SESSION_FILE" sessions 2>&1) +if ! 
echo "$SESSIONS_OUT" | grep -q "No active sessions found on server."; then + echo "[FAILURE] After manual stop of $KEEP_SESSION_NAME, sessions remain." + exit 1 +fi + +echo "[SUCCESS] Phase 2 passed: --keep persists the session, manual stop clears it." +echo "[SUCCESS] All phases passed." +exit 0 diff --git a/src/colab_cli/cli.py b/src/colab_cli/cli.py index 918ba77..c01ffb3 100644 --- a/src/colab_cli/cli.py +++ b/src/colab_cli/cli.py @@ -23,7 +23,7 @@ from colab_cli import auto_update from colab_cli.auth import AuthProvider from colab_cli.common import state, setup_logging -from colab_cli.commands import session, execution, files, automation, utility +from colab_cli.commands import session, execution, files, automation, run, utility class AlphabeticalGroup(TyperGroup): @@ -139,6 +139,7 @@ def help_command( execution.register(app) files.register(app) automation.register(app) +run.register(app) utility.register(app) diff --git a/src/colab_cli/commands/run.py b/src/colab_cli/commands/run.py new file mode 100644 index 0000000..2955ce6 --- /dev/null +++ b/src/colab_cli/commands/run.py @@ -0,0 +1,471 @@ +# Copyright 2026 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +`colab run [args...]` — shebang-friendly one-shot execution. + +Combines `colab new` + `colab exec` + `colab stop` into a single fire-and-forget +invocation. 
The Python script's body runs in a freshly-allocated Colab kernel +with `sys.argv` set as if it had been invoked via `python script.py [args...]`, +and the VM is automatically released when the script finishes (unless `--keep` +is passed). + +Designed to support shebangs: + + #!/usr/bin/env -S colab run --gpu T4 + import torch + print(torch.cuda.get_device_name(0)) + +See docs/05_run_command.md for the full design. +""" + +import datetime +import os +import uuid +from typing import List, Optional + +import typer +from typing_extensions import Annotated + +from colab_cli.client import ( + Accelerator, + ColabRequestError, + PostAssignmentResponse, + Variant, +) +from colab_cli.commands.session import ( + _is_scope_error, + _scope_remediation_message, + spawn_keep_alive, +) +from colab_cli.runtime import ColabRuntime +from colab_cli.state import SessionState +from colab_cli.utils import get_status_code, is_terminal_error + + +# TODO(sethtroisi): dedupe this logic with similar in session.py +def _resolve_accelerator(gpu: Optional[str], tpu: Optional[str]): + """Mirror the mapping logic in `commands.session.new`. Centralised so the + two commands stay in lock-step on supported accelerator names. + """ + if tpu: + variant = Variant.TPU + accelerator = Accelerator.V5E1 if tpu.lower() == "v5e1" else Accelerator.V6E1 + return variant, accelerator + if gpu: + mapping = { + "a100": Accelerator.A100, + "h100": Accelerator.H100, + "l4": Accelerator.L4, + "t4": Accelerator.T4, + "g4": Accelerator.G4, + } + return Variant.GPU, mapping.get(gpu.lower(), Accelerator.A100) + return Variant.DEFAULT, Accelerator.NONE + + +def _build_script_payload(script_path: str, script_args: List[str]) -> str: + """Wrap the script body so it executes with native-`python`-like semantics. + + Specifically: + - `sys.argv = [, *script_args]` so `argparse` etc. work. + - `__name__ = '__main__'` so `if __name__ == "__main__":` guards fire. 
+ - Suppress the IPython UserWarning "To exit: use 'exit', 'quit', or + Ctrl-D." which fires whenever the script calls `sys.exit(...)`. This + warning is meaningful in an interactive REPL, but for `colab run` it + is pure noise that doesn't appear when running `python script.py`. + + The script body is appended verbatim; the prelude is short so any + traceback line numbers from user code remain close to the original. + """ + basename = os.path.basename(script_path) + with open(script_path, "r", encoding="utf-8") as f: + body = f.read() + + # `repr()` produces a safe, round-trippable Python literal for arbitrary + # strings (handles quotes, backslashes, non-ASCII). + argv_literal = f"[{', '.join(repr(x) for x in [basename] + script_args)}]" + + return ( + "import sys, warnings\n" + f"sys.argv = {argv_literal}\n" + "__name__ = '__main__'\n" + "warnings.filterwarnings('ignore', message=\"To exit: use\")\n" + + _strip_shebang(body) + ) + + +def _strip_shebang(body: str) -> str: + """Remove a leading `#!...\\n` if present. The remote kernel doesn't need + or understand it (it's a contract between the local kernel and the file's + executable bit), and leaving it in just adds noise. + """ + if body.startswith("#!"): + nl = body.find("\n") + return body[nl + 1 :] if nl != -1 else "" + return body + + +def _is_systemexit(out) -> bool: + """True iff this output is a `raise SystemExit(...)` (a.k.a. `sys.exit`).""" + return out.get("output_type") == "error" and out.get("ename") == "SystemExit" + + +def _systemexit_code(out) -> int: + """Map a SystemExit kernel output back to a CPython-style integer exit code. 
+ + CPython conventions (mirrored): + - `sys.exit()` / `sys.exit(None)` / `sys.exit(0)` -> 0 + - `sys.exit()` -> + - `sys.exit('msg')` (any non-int) -> 1 + """ + evalue = (out.get("evalue") or "").strip() + if evalue in ("", "None", "0"): + return 0 + try: + return int(evalue) + except ValueError: + return 1 + + +def _exit_code_from_outputs(outputs) -> int: + """Derive the CLI's exit code from the kernel's outputs for a single cell. + + A `SystemExit` is treated like CPython would treat the same call from a + plain `python script.py` invocation. Any *other* error (uncaught + exception, NameError, etc.) is exit 1. + """ + code = 0 + for o in outputs: + if o.get("output_type") != "error": + continue + if _is_systemexit(o): + ec = _systemexit_code(o) + # Last SystemExit wins, matching the runtime — and any non-zero + # eclipses any prior zero. + code = ec if ec != 0 else code + else: + return 1 + return code + + +def _make_run_output_hook(output_image=None): + """Build an `output_hook` for `runtime.execute_code` that: + - Routes normal output to `display_output` (stream/image/error). + - Suppresses the `SystemExit` traceback so `sys.exit(0)` is silent (it + wouldn't print anything under `python script.py` either) and + `sys.exit(N)` doesn't dump a noisy IPython-styled traceback when the + intent is "shell exit code N". + + The kernel still RETURNS the SystemExit output to us (so we can derive the + exit code in `_exit_code_from_outputs`); we just don't render it. + """ + # Imported here to avoid a circular import via execution.py at module load. + from colab_cli.commands.execution import display_output + + def hook(out): + if _is_systemexit(out): + return + display_output(out, output_image) + + return hook + + +def run_command( + ctx: typer.Context, + script: Annotated[ + str, + typer.Argument( + help="Path to a local Python file to execute on a fresh Colab VM." 
+ ), + ], + script_args: Annotated[ + Optional[List[str]], + typer.Argument( + help=( + "Arguments forwarded to the script as sys.argv[1:]. " + "Anything after the script path is passed through verbatim." + ), + ), + ] = None, + session: Annotated[ + Optional[str], + typer.Option( + "-s", + "--session", + help=( + "Name for the ephemeral session (auto-generated if omitted). " + "Useful with --keep so you can attach later via `colab exec -s `." + ), + ), + ] = None, + tpu: Annotated[ + Optional[str], + typer.Option(help="TPU accelerator variant. Supported: v5e1, v6e1."), + ] = None, + gpu: Annotated[ + Optional[str], + typer.Option( + help=( + "GPU accelerator variant. Supported: T4, L4, G4, H100, A100. " + "If omitted (along with --tpu), a CPU runtime is created." + ), + ), + ] = None, + keep: Annotated[ + bool, + typer.Option( + "--keep", + help=( + "Do not stop the session after the script finishes. The session " + "remains in `colab sessions` until you run `colab stop`." + ), + ), + ] = False, +): + """Run a Python script on a fresh Colab VM, then release the VM. + + Designed to be used as a shebang interpreter, e.g. + + #!/usr/bin/env -S colab run --gpu T4 + + so a single executable .py file can rent a GPU, run, and clean up after + itself. + """ + from colab_cli.common import state + + script_args = script_args or [] + + # AGENTS.md item 10: validate locally BEFORE allocating a VM. A typo'd + # script path should not cost the user real compute. + if not os.path.isfile(script): + typer.echo(f"[colab] Script not found: {script}", err=True) + raise typer.Exit(2) + + name = session or f"run-{uuid.uuid4().hex[:6]}" + variant, accelerator = _resolve_accelerator(gpu, tpu) + + typer.echo(f"[colab] Creating session '{name}'...", err=True) + try: + res = state.client.assign( + uuid.uuid4(), variant=variant, accelerator=accelerator + ) + except ColabRequestError as e: + # Mirror `colab new`'s friendly accelerator-quota message. 
+ if get_status_code(e) == 400 and accelerator != Accelerator.NONE: + typer.echo( + f"[colab] Backend rejected accelerator '{accelerator.value}'. " + "You may not have quota or entitlement for this accelerator on " + "your account. Try a different one (e.g. --gpu T4) or omit " + "--gpu/--tpu for a CPU runtime.", + err=True, + ) + raise typer.Exit(code=1) + raise + + if isinstance(res, PostAssignmentResponse): + token = res.runtime_proxy_info.token + url = res.runtime_proxy_info.url + endpoint = res.endpoint + else: + token = ( + res.runtime_proxy_info.token + if hasattr(res, "runtime_proxy_info") + else getattr(res, "runtime_proxy_token", "") + ) + url = res.runtime_proxy_info.url if hasattr(res, "runtime_proxy_info") else "" + endpoint = res.endpoint + + s = SessionState( + name=name, + token=token, + url=url, + endpoint=endpoint, + variant=variant.value, + accelerator=accelerator.value, + ) + + # Pre-flight keep-alive: same scope-detection dance as `colab new` so a + # missing OAuth scope doesn't leak a billable assignment. + try: + state.client.keep_alive_assignment(endpoint) + except ColabRequestError as e: + if get_status_code(e) == 403 and _is_scope_error(e): + typer.echo( + "[colab] Keep-alive pre-flight failed: your OAuth " + "credentials are missing the 'colaboratory' scope, which " + "is required by the Colab RuntimeService.\n", + err=True, + ) + typer.echo(_scope_remediation_message(state.auth_provider), err=True) + try: + state.client.unassign(endpoint) + except Exception: + pass + raise typer.Exit(code=1) + # Other failures: don't block — the daemon will retry. + + # AGENTS.md item 17: persist BEFORE spawning the daemon so the daemon's + # initial state.store.get(name) doesn't race the parent. 
+ state.store.add(s) + s.keep_alive_pid = spawn_keep_alive( + endpoint, + name, + auth_provider=state.auth_provider, + config_path=state.config_path, + ) + state.store.add(s) + state.history.log_event( + name, + "session_created", + { + "endpoint": endpoint, + "variant": variant.value, + "accelerator": accelerator.value, + "via": "run", + }, + ) + typer.echo(f"[colab] Session READY ({name}). Executing {script}...", err=True) + + # ----- Execute the script ------------------------------------------------- + exit_code = 0 + cleanup_reason = "run_completed" + + def on_started(kid): + s.kernel_id = kid + state.store.add(s) + + def on_sess_started(sid): + s.session_id = sid + state.store.add(s) + + runtime = ColabRuntime( + s.url, + s.token, + kernel_id=s.kernel_id, + session_id=s.session_id, + on_kernel_started=on_started, + on_session_started=on_sess_started, + ) + + try: + # Same /content prelude as `colab exec` for consistency. + try: + runtime.execute_code( + "import os; os.makedirs('/content', exist_ok=True); " + "os.chdir('/content')" + ) + except Exception as e: + if is_terminal_error(e): + typer.echo( + f"[colab] Session '{name}' appears to be lost (404/401).", + err=True, + ) + state.prune_session(name) + raise typer.Exit(1) + raise + + payload = _build_script_payload(script, script_args) + s.running = f"run({os.path.basename(script)})" + s.last_execution = ( + script, + None, + datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), + ) + state.store.add(s) + + try: + outputs = runtime.execute_code(payload, output_hook=_make_run_output_hook()) + except Exception: + # Genuine transport-level failure. Cleanup still happens via the + # outer finally; surface non-zero exit so callers/CI notice. 
+ exit_code = 1 + cleanup_reason = "run_failed" + raise + else: + exit_code = _exit_code_from_outputs(outputs) + if exit_code != 0: + cleanup_reason = "run_failed" + state.history.log_event( + name, + "execution", + {"code": payload, "outputs": outputs, "via": "run"}, + ) + finally: + s.running = None + state.store.add(s) + # Best-effort runtime close (keeps remote kernel alive for --keep). + try: + runtime.stop() + except Exception: + pass + + if not keep: + _teardown(name, s, reason=cleanup_reason) + + if exit_code != 0: + raise typer.Exit(exit_code) + + +def _teardown(name: str, s: SessionState, *, reason: str) -> None: + """Best-effort full session teardown: kill the keep-alive daemon, ask the + remote kernel to shut down, unassign the VM, and remove local state. + + Mirrors `commands.session.stop` but with a richer history event reason and + swallowing all errors (we don't want a teardown failure to mask the user's + exit code). + """ + from colab_cli.common import kill_process, state + + typer.echo(f"[colab] Stopping session '{name}'...", err=True) + if s.keep_alive_pid: + try: + kill_process(s.keep_alive_pid) + except Exception: + pass + + try: + rt = ColabRuntime(s.url, s.token, kernel_id=s.kernel_id) + rt.stop(shutdown_kernel=True) + except Exception: + pass + + try: + state.client.unassign(s.endpoint) + except Exception: + pass + + try: + state.store.remove(name) + except Exception: + pass + + try: + state.history.log_event(name, "session_terminated", {"reason": reason}) + except Exception: + pass + typer.echo("[colab] Session terminated.", err=True) + + +def register(app: typer.Typer) -> None: + # `context_settings` lets unknown args after the script path flow through + # as positional `script_args` so users can pass `--flags-for-the-script` + # without Typer trying to consume them. 
+ app.command( + name="run", + context_settings={ + "allow_extra_args": True, + "ignore_unknown_options": True, + }, + )(run_command) diff --git a/tests/conftest.py b/tests/conftest.py index 30209b8..9481ef5 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -34,5 +34,6 @@ def mock_common_state(mocker): mocker.patch("colab_cli.commands.session.ColabRuntime") mocker.patch("colab_cli.commands.execution.ColabRuntime") mocker.patch("colab_cli.commands.automation.ColabRuntime") + mocker.patch("colab_cli.commands.run.ColabRuntime") return mock_state diff --git a/tests/test_run.py b/tests/test_run.py new file mode 100644 index 0000000..b52da41 --- /dev/null +++ b/tests/test_run.py @@ -0,0 +1,506 @@ +# Copyright 2026 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Tests for `colab run [args...]` — shebang-friendly one-shot +execution that bundles `colab new` + `colab exec` + `colab stop`. 
+""" + +from unittest.mock import MagicMock + +import pytest +from typer.testing import CliRunner + +from colab_cli.cli import app +from colab_cli.client import ( + Accelerator, + PostAssignmentResponse, + Variant, +) + +runner = CliRunner() + + +@pytest.fixture +def mock_client(mock_common_state): + return mock_common_state.client + + +@pytest.fixture +def mock_store(mock_common_state): + return mock_common_state.store + + +@pytest.fixture +def mock_runtime_class(mocker): + """Patch ColabRuntime in the run module specifically.""" + return mocker.patch("colab_cli.commands.run.ColabRuntime") + + +@pytest.fixture +def mock_spawn_keep_alive(mocker): + """Don't actually spawn a daemon during tests.""" + return mocker.patch("colab_cli.commands.run.spawn_keep_alive", return_value=12345) + + +@pytest.fixture +def assign_response(): + """A minimal PostAssignmentResponse-shaped mock for client.assign.""" + res = MagicMock() + res.__class__ = PostAssignmentResponse + res.runtime_proxy_info.token = "tok" + res.runtime_proxy_info.url = "http://runtime" + res.endpoint = "ep-123" + return res + + +@pytest.fixture +def script_path(tmp_path): + p = tmp_path / "script.py" + p.write_text("print('hello from script')\n") + return p + + +# --------------------------------------------------------------------------- +# Happy path +# --------------------------------------------------------------------------- + + +def test_run_basic_flow( + mock_client, + mock_store, + mock_runtime_class, + mock_spawn_keep_alive, + assign_response, + script_path, +): + """`colab run script.py` should: assign, exec, unassign.""" + mock_client.assign.return_value = assign_response + mock_runtime = mock_runtime_class.return_value + mock_runtime.execute_code.return_value = [] + + # Simulate the persisted SessionState being readable by the run command. 
+ persisted = {} + + def store_add(s): + persisted["s"] = s + + def store_get(name): + return persisted.get("s") + + mock_store.add.side_effect = store_add + mock_store.get.side_effect = store_get + + result = runner.invoke(app, ["run", str(script_path)]) + + assert result.exit_code == 0, result.output + # Allocation happened + mock_client.assign.assert_called_once() + # Script body was executed (the prelude + body is one execute_code call) + code_calls = [c.args[0] for c in mock_runtime.execute_code.call_args_list] + assert any("hello from script" in code for code in code_calls), ( + f"Script body never sent to runtime. Calls: {code_calls}" + ) + # Cleanup happened + mock_client.unassign.assert_called_once_with("ep-123") + + +# --------------------------------------------------------------------------- +# --keep flag +# --------------------------------------------------------------------------- + + +def test_run_keep_skips_unassign( + mock_client, + mock_store, + mock_runtime_class, + mock_spawn_keep_alive, + assign_response, + script_path, +): + """With `--keep`, the session must NOT be unassigned after the script + finishes — the user wants to attach to it later.""" + mock_client.assign.return_value = assign_response + mock_runtime = mock_runtime_class.return_value + mock_runtime.execute_code.return_value = [] + + persisted = {} + mock_store.add.side_effect = lambda s: persisted.setdefault("s", s) + mock_store.get.side_effect = lambda name: persisted.get("s") + + result = runner.invoke(app, ["run", "--keep", str(script_path)]) + + assert result.exit_code == 0, result.output + mock_client.assign.assert_called_once() + mock_client.unassign.assert_not_called() + mock_store.remove.assert_not_called() + + +# --------------------------------------------------------------------------- +# argv passthrough +# --------------------------------------------------------------------------- + + +def test_run_passes_argv( + mock_client, + mock_store, + mock_runtime_class, + 
mock_spawn_keep_alive, + assign_response, + script_path, +): + """Args after the script must be exposed as `sys.argv` inside the kernel.""" + mock_client.assign.return_value = assign_response + mock_runtime = mock_runtime_class.return_value + mock_runtime.execute_code.return_value = [] + + persisted = {} + mock_store.add.side_effect = lambda s: persisted.setdefault("s", s) + mock_store.get.side_effect = lambda name: persisted.get("s") + + result = runner.invoke( + app, ["run", str(script_path), "alpha", "beta", "--flag-for-script"] + ) + + assert result.exit_code == 0, result.output + code_calls = [c.args[0] for c in mock_runtime.execute_code.call_args_list] + # The execute_code call that contains the script body must also set + # sys.argv to mirror native python invocation. + body_calls = [c for c in code_calls if "hello from script" in c] + assert body_calls, f"Body never executed. Calls: {code_calls}" + body = body_calls[0] + assert "sys.argv" in body + assert "'script.py'" in body + assert "'alpha'" in body + assert "'beta'" in body + assert "'--flag-for-script'" in body + + +def test_run_sets_dunder_main( + mock_client, + mock_store, + mock_runtime_class, + mock_spawn_keep_alive, + assign_response, + script_path, +): + """The script must run with __name__ == '__main__'.""" + mock_client.assign.return_value = assign_response + mock_runtime = mock_runtime_class.return_value + mock_runtime.execute_code.return_value = [] + + persisted = {} + mock_store.add.side_effect = lambda s: persisted.setdefault("s", s) + mock_store.get.side_effect = lambda name: persisted.get("s") + + result = runner.invoke(app, ["run", str(script_path)]) + assert result.exit_code == 0, result.output + code_calls = [c.args[0] for c in mock_runtime.execute_code.call_args_list] + body = next(c for c in code_calls if "hello from script" in c) + assert "__name__" in body and "'__main__'" in body + + +# --------------------------------------------------------------------------- +# Error handling 
+# --------------------------------------------------------------------------- + + +def test_run_propagates_error_exit_code( + mock_client, + mock_store, + mock_runtime_class, + mock_spawn_keep_alive, + assign_response, + script_path, +): + """If the kernel reports an error, the CLI must exit non-zero AND still + unassign the VM (try/finally guarantee — AGENTS.md item 10).""" + mock_client.assign.return_value = assign_response + mock_runtime = mock_runtime_class.return_value + + def execute_with_error(code, output_hook=None, **kwargs): + outputs = [ + { + "output_type": "error", + "ename": "ValueError", + "evalue": "boom", + "traceback": ["Traceback...\n", "ValueError: boom\n"], + } + ] + if output_hook: + for o in outputs: + output_hook(o) + return outputs + + mock_runtime.execute_code.side_effect = execute_with_error + + persisted = {} + mock_store.add.side_effect = lambda s: persisted.setdefault("s", s) + mock_store.get.side_effect = lambda name: persisted.get("s") + + result = runner.invoke(app, ["run", str(script_path)]) + assert result.exit_code != 0 + # Cleanup MUST happen even on script failure. + mock_client.unassign.assert_called_once_with("ep-123") + + +def test_run_unassign_called_on_exception_during_execute( + mock_client, + mock_store, + mock_runtime_class, + mock_spawn_keep_alive, + assign_response, + script_path, +): + """Even if `runtime.execute_code` raises (e.g. 
websocket dies), the VM + must be released.""" + mock_client.assign.return_value = assign_response + mock_runtime = mock_runtime_class.return_value + mock_runtime.execute_code.side_effect = RuntimeError("websocket closed") + + persisted = {} + mock_store.add.side_effect = lambda s: persisted.setdefault("s", s) + mock_store.get.side_effect = lambda name: persisted.get("s") + + result = runner.invoke(app, ["run", str(script_path)]) + assert result.exit_code != 0 + mock_client.unassign.assert_called_once_with("ep-123") + + +# --------------------------------------------------------------------------- +# Accelerator passthrough +# --------------------------------------------------------------------------- + + +def test_run_with_gpu_flag( + mock_client, + mock_store, + mock_runtime_class, + mock_spawn_keep_alive, + assign_response, + script_path, +): + """`colab run --gpu T4 script.py` must request a T4 GPU.""" + mock_client.assign.return_value = assign_response + mock_runtime = mock_runtime_class.return_value + mock_runtime.execute_code.return_value = [] + + persisted = {} + mock_store.add.side_effect = lambda s: persisted.setdefault("s", s) + mock_store.get.side_effect = lambda name: persisted.get("s") + + result = runner.invoke(app, ["run", "--gpu", "T4", str(script_path)]) + assert result.exit_code == 0, result.output + + _, kwargs = mock_client.assign.call_args + assert kwargs["variant"] is Variant.GPU + assert kwargs["accelerator"] is Accelerator.T4 + + +def test_run_with_tpu_flag( + mock_client, + mock_store, + mock_runtime_class, + mock_spawn_keep_alive, + assign_response, + script_path, +): + """`colab run --tpu v5e1 script.py` must request a TPU.""" + mock_client.assign.return_value = assign_response + mock_runtime = mock_runtime_class.return_value + mock_runtime.execute_code.return_value = [] + + persisted = {} + mock_store.add.side_effect = lambda s: persisted.setdefault("s", s) + mock_store.get.side_effect = lambda name: persisted.get("s") + + result = 
runner.invoke(app, ["run", "--tpu", "v5e1", str(script_path)]) + assert result.exit_code == 0, result.output + + _, kwargs = mock_client.assign.call_args + assert kwargs["variant"] is Variant.TPU + assert kwargs["accelerator"] is Accelerator.V5E1 + + +# --------------------------------------------------------------------------- +# Argument validation — fail FAST, before allocating a VM +# --------------------------------------------------------------------------- + + +def test_run_missing_script_errors(mock_client): + """Typer should reject the invocation if no script path is given.""" + result = runner.invoke(app, ["run"]) + assert result.exit_code != 0 + mock_client.assign.assert_not_called() + + +def test_run_nonexistent_script_errors_before_assign(mock_client): + """If the script doesn't exist locally, fail BEFORE allocating a VM — + otherwise a typo would burn billable compute.""" + result = runner.invoke(app, ["run", "/no/such/file.py"]) + assert result.exit_code != 0 + mock_client.assign.assert_not_called() + + +# --------------------------------------------------------------------------- +# SystemExit handling — the kernel reports `sys.exit(N)` as an error output of +# `ename=='SystemExit'`. We want native-`python`-like semantics: exit 0 for +# `SystemExit(0)` (no traceback printed), and propagate the integer for +# `SystemExit(N)`. 
+# --------------------------------------------------------------------------- + + +def _systemexit_output(evalue: str): + """Shape of the kernel's error output for `raise SystemExit()`.""" + return { + "output_type": "error", + "ename": "SystemExit", + "evalue": evalue, + "traceback": [ + "An exception has occurred, use %tb to see the full traceback.\n", + f"\x1b[0;31mSystemExit\x1b[0m\x1b[0;31m:\x1b[0m {evalue}\n", + ], + } + + +def test_run_systemexit_zero_treated_as_success( + mock_client, + mock_store, + mock_runtime_class, + mock_spawn_keep_alive, + assign_response, + script_path, + capfd, +): + """`raise SystemExit(0)` from the script body must NOT make the CLI exit + non-zero, AND the SystemExit traceback must NOT be printed (it's noise + that doesn't appear when running `python script.py`).""" + mock_client.assign.return_value = assign_response + mock_runtime = mock_runtime_class.return_value + + def execute_with_systemexit(code, output_hook=None, **kwargs): + outputs = [_systemexit_output("0")] + if output_hook: + for o in outputs: + output_hook(o) + return outputs + + mock_runtime.execute_code.side_effect = execute_with_systemexit + + persisted = {} + mock_store.add.side_effect = lambda s: persisted.setdefault("s", s) + mock_store.get.side_effect = lambda name: persisted.get("s") + + result = runner.invoke(app, ["run", str(script_path)]) + captured = capfd.readouterr() + + assert result.exit_code == 0, result.output + # The IPython "An exception has occurred..." traceback must be suppressed. + assert "An exception has occurred" not in ( + result.output + result.stderr + captured.out + captured.err + ) + # Cleanup still happened. 
+ mock_client.unassign.assert_called_once_with("ep-123") + + +def test_run_systemexit_nonzero_propagates_code( + mock_client, + mock_store, + mock_runtime_class, + mock_spawn_keep_alive, + assign_response, + script_path, +): + """`raise SystemExit(7)` from the script must surface as exit code 7 + (matching `python script.py` semantics).""" + mock_client.assign.return_value = assign_response + mock_runtime = mock_runtime_class.return_value + + def execute_with_systemexit(code, output_hook=None, **kwargs): + outputs = [_systemexit_output("7")] + if output_hook: + for o in outputs: + output_hook(o) + return outputs + + mock_runtime.execute_code.side_effect = execute_with_systemexit + + persisted = {} + mock_store.add.side_effect = lambda s: persisted.setdefault("s", s) + mock_store.get.side_effect = lambda name: persisted.get("s") + + result = runner.invoke(app, ["run", str(script_path)]) + assert result.exit_code == 7 + mock_client.unassign.assert_called_once_with("ep-123") + + +def test_run_systemexit_string_message_exits_one( + mock_client, + mock_store, + mock_runtime_class, + mock_spawn_keep_alive, + assign_response, + script_path, +): + """`sys.exit('boom')` (string arg, like `python -c "import sys; sys.exit(\"x\")"`) + must (a) exit non-zero (CPython uses 1) and (b) print the message so the + user sees what went wrong.""" + mock_client.assign.return_value = assign_response + mock_runtime = mock_runtime_class.return_value + + def execute_with_systemexit(code, output_hook=None, **kwargs): + outputs = [_systemexit_output("boom")] + if output_hook: + for o in outputs: + output_hook(o) + return outputs + + mock_runtime.execute_code.side_effect = execute_with_systemexit + + persisted = {} + mock_store.add.side_effect = lambda s: persisted.setdefault("s", s) + mock_store.get.side_effect = lambda name: persisted.get("s") + + result = runner.invoke(app, ["run", str(script_path)]) + assert result.exit_code == 1 + mock_client.unassign.assert_called_once_with("ep-123") + + 
+def test_run_prelude_suppresses_ipython_exit_warning( + mock_client, + mock_store, + mock_runtime_class, + mock_spawn_keep_alive, + assign_response, + script_path, +): + """The prelude must mute IPython's 'To exit: use exit, quit, or Ctrl-D' + UserWarning, which fires whenever the user calls `sys.exit(...)` (i.e. + every well-formed CLI script).""" + mock_client.assign.return_value = assign_response + mock_runtime = mock_runtime_class.return_value + mock_runtime.execute_code.return_value = [] + + persisted = {} + mock_store.add.side_effect = lambda s: persisted.setdefault("s", s) + mock_store.get.side_effect = lambda name: persisted.get("s") + + result = runner.invoke(app, ["run", str(script_path)]) + assert result.exit_code == 0, result.output + + # Find the body-bearing execute_code call. + code_calls = [c.args[0] for c in mock_runtime.execute_code.call_args_list] + body = next(c for c in code_calls if "hello from script" in c) + # Look for the warnings filter targeting IPython's exit-warning text. + assert "warnings.filterwarnings" in body + assert "To exit: use" in body
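The SystemExit tests in this file pin down how a kernel error output should map to the CLI's exit code. Below is a minimal sketch of that mapping, assuming the output dict shape built by `_systemexit_output`. The name `exit_code_from_outputs_sketch` is hypothetical; the project's real `_exit_code_from_outputs` is not shown in this diff and may differ (for example, in how it treats an empty `evalue`).

```python
from typing import Any


def exit_code_from_outputs_sketch(outputs: list[dict[str, Any]]) -> int:
    """Hypothetical illustration of the exit-code semantics the tests expect."""
    for out in outputs:
        if out.get("output_type") != "error":
            continue  # stream/display outputs never affect the exit code
        if out.get("ename") != "SystemExit":
            return 1  # any other exception is a generic failure
        evalue = str(out.get("evalue", "")).strip()
        if evalue in ("", "None"):
            # Assumption: a bare sys.exit() / sys.exit(None) counts as success.
            return 0
        try:
            return int(evalue)  # sys.exit(N) propagates N
        except ValueError:
            return 1  # sys.exit("message") exits with status 1 under CPython
    return 0
```

With this shape, `SystemExit` with `evalue` `"0"` yields 0, `"7"` yields 7, and a string such as `"boom"` yields 1, matching the three SystemExit tests; a non-SystemExit error such as `ValueError` yields a non-zero code, as `test_run_propagates_error_exit_code` requires.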