Skip to content

[WIP] test: add gemini cli skill content preservation test#401

Open
minzznguyen wants to merge 1 commit into
GoogleCloudPlatform:mainfrom
minzznguyen:unit_test_gemini_cli
Open

[WIP] test: add gemini cli skill content preservation test#401
minzznguyen wants to merge 1 commit into
GoogleCloudPlatform:mainfrom
minzznguyen:unit_test_gemini_cli

Conversation

@minzznguyen
Copy link
Copy Markdown

Add a new integration test to verify that skill contents are preserved when copied to the fake home sandbox.

TAG=agy
CONV=a151643e-4527-4bbe-9908-a1f4f33be7e3

Add a new integration test to verify that skill contents are preserved
when copied to the fake home sandbox.

TAG=agy
CONV=a151643e-4527-4bbe-9908-a1f4f33be7e3
@minzznguyen minzznguyen requested a review from IsmailMehdi as a code owner May 21, 2026 22:07
Copy link
Copy Markdown
Collaborator

@IsmailMehdi IsmailMehdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice instinct adding test coverage for _setup_skills — there's not much for the generator boundary today, and the integration test in particular (test_skill_content_preserved) is a good idea since tmp_path lets you exercise real file ops safely. A few things should be fixed before this lands.

Blocking

  1. Missing sys.path setup — the file likely doesn't import from the repo root. Every other test in evalbench/test/ (e.g. trajectory_matcher_test.py:7, tool_naming_test.py:9) prepends the parent dir to sys.path so from generators.models... resolves:

    sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

    Without this, pytest evalbench/test/gemini_cli_test.py from the repo root will fail with ModuleNotFoundError: No module named 'generators'. Please add it, or run the test yourself from the repo root (uv run pytest evalbench/test/gemini_cli_test.py) to confirm it imports.

  2. print() statements throughout. There are ~40 prints used as narration. pytest captures stdout by default, so they only appear on failure or with -s, and they make the test body 3× longer than it needs to be. Behavior-level asserts already say what's being checked. Please strip them — for the rare case where a future debugger needs them back, pytest -v plus a good assertion message does the same job.

  3. try/except AssertionError: raise e around mock assertions. pytest's introspection already produces a clear diff for assert_called_once_with. Wrapping in try/except adds noise and raise e (vs bare raise) truncates the traceback. Replace with the bare assertion call.

Medium

  1. Log-message coupling is brittle. mock_logging.info.assert_any_call("Syncing skill: my-single-skill") will break the next time someone tweaks that log line. The mock_copytree.assert_called_once_with(...) assertion above it is the behavioral check that matters — drop the log assertion.

  2. Hardcoded /home/jamesamn in test_setup_single_skill_string. Looks like a leftover from local iteration. Use a generic value like /fake/real_home (or str(tmp_path)) — accidentally fine, but a reviewer notices.

Minor

  1. Inconsistent env-mocking style. Test 1 uses patch.dict(os.environ, {"HOME": ...}), test 2 uses monkeypatch.setenv. Pick one — monkeypatch is the pytest-native idiom and is already in test 2.

  2. test_skill_content_preserved is really a setup-integration test. It does exercise content preservation, but it relies on _setup_npm_auth and _setup not blowing up either — worth a docstring noting the broader scope, or split out a narrower variant that calls _setup_skills directly with a pre-built generator if you want a true unit test of the copy step.

  3. Worth covering at least one of the dict skill_config branches (link/install/enable/disable/uninstall) — they're the lion's share of _setup_skills and currently untested. subprocess.run is already mocked, so adding one is cheap.

The integration approach in test 2 is good — once you trim the prints and add the sys.path fix, that's a solid sandbox for further skill-setup tests.


Cleaned-up version

Here's the file with blocking + medium fixes applied — drop in if it's easier than editing piecemeal:

import os
import sys

import pytest
from unittest.mock import patch, MagicMock

sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from generators.models.gemini_cli import GeminiCliGenerator


@patch('generators.models.gemini_cli.logging')
@patch('generators.models.gemini_cli.subprocess.run')
@patch('generators.models.gemini_cli.os.path.exists')
@patch('generators.models.gemini_cli.shutil.copytree')
@patch('generators.models.gemini_cli.shutil.rmtree')
def test_setup_single_skill_string(
    mock_rmtree,
    mock_copytree,
    mock_exists,
    mock_run,
    mock_logging,
):
    """A string entry in setup.skills triggers a copytree from real to fake home."""
    real_home = "/fake/real_home"
    real_skill_path = os.path.join(real_home, ".gemini", "skills", "my-single-skill")

    mock_exists.side_effect = lambda path: path == real_skill_path
    mock_run.return_value = MagicMock(returncode=0, stdout="fake-token")

    config = {"setup": {"skills": ["my-single-skill"]}}

    with patch('generators.models.gemini_cli.os.makedirs'), \
         patch('generators.models.gemini_cli.open', create=True), \
         patch.dict(os.environ, {"HOME": real_home}):
        GeminiCliGenerator(config)

    expected_fake_home = os.path.abspath(os.path.join(".venv", "fake_home"))
    expected_fake_skill_path = os.path.join(
        expected_fake_home, ".gemini", "skills", "my-single-skill"
    )
    mock_copytree.assert_called_once_with(real_skill_path, expected_fake_skill_path)


@patch('generators.models.gemini_cli.subprocess.run')
def test_skill_content_preserved(mock_run, tmp_path, monkeypatch):
    """End-to-end: a skill file's contents survive copy into the fake-home sandbox.

    Exercises the full GeminiCliGenerator.__init__ -> _setup -> _setup_skills
    path with real filesystem operations confined to tmp_path. Only subprocess
    (gcloud auth) is mocked.
    """
    real_home = tmp_path / "real_home"
    real_home.mkdir()

    skill_name = "my-content-skill"
    real_skill_dir = real_home / ".gemini" / "skills" / skill_name
    real_skill_dir.mkdir(parents=True)

    skill_file = real_skill_dir / "secret.txt"
    expected_content = "hello world"
    skill_file.write_text(expected_content)

    mock_run.return_value = MagicMock(returncode=0, stdout="fake-token")

    monkeypatch.chdir(tmp_path)
    monkeypatch.setenv("HOME", str(real_home))

    GeminiCliGenerator({"setup": {"skills": [skill_name]}})

    expected_fake_skill_file = (
        tmp_path / ".venv" / "fake_home" / ".gemini" / "skills" / skill_name / "secret.txt"
    )
    assert expected_fake_skill_file.exists(), (
        f"Copied skill file not found at {expected_fake_skill_file}"
    )
    assert expected_fake_skill_file.read_text() == expected_content

(Also swapped the "password is xyz" sample content for a benign "hello world" so the file doesn't read like a credential.)

import os
import pytest
from unittest.mock import patch, MagicMock
from generators.models.gemini_cli import GeminiCliGenerator
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without sys.path.append(...) above this line, pytest from the repo root will fail with ModuleNotFoundError: No module named 'generators'. See other tests in this directory for the pattern.

Suggested change
from generators.models.gemini_cli import GeminiCliGenerator
from unittest.mock import patch, MagicMock
import sys
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from generators.models.gemini_cli import GeminiCliGenerator

"""Test that a single skill defined as a string is copied to fake home."""
print("\n--- STARTING TEST: test_setup_single_skill_string ---")

real_home = "/home/jamesamn"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a leftover from local iteration — please use a generic value.

Suggested change
real_home = "/home/jamesamn"
real_home = "/fake/real_home"

print("[Mock subprocess.run] Configured to simulate successful 'gcloud auth'")

# 3. Configure CWD and HOME env var to use temp directory
monkeypatch.chdir(tmp_path)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: "password is xyz" reads like a credential when grep'd. Suggest a benign placeholder so future scanners don't flag the test file.

Suggested change
monkeypatch.chdir(tmp_path)
expected_content = "hello world"

@IsmailMehdi
Copy link
Copy Markdown
Collaborator

Also take a look at the style issues.

@minzznguyen minzznguyen changed the title test: add gemini cli skill content preservation test [WIP] test: add gemini cli skill content preservation test May 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants