[WIP] test: add gemini cli skill content preservation test#401
[WIP] test: add gemini cli skill content preservation test#401minzznguyen wants to merge 1 commit into
Conversation
Add a new integration test to verify that skill contents are preserved when copied to the fake home sandbox. TAG=agy CONV=a151643e-4527-4bbe-9908-a1f4f33be7e3
IsmailMehdi
left a comment
There was a problem hiding this comment.
Nice instinct adding test coverage for _setup_skills — there's not much for the generator boundary today, and the integration test in particular (test_skill_content_preserved) is a good idea since tmp_path lets you exercise real file ops safely. A few things should be fixed before this lands.
Blocking
-
Missing
sys.pathsetup — the file likely doesn't import from the repo root. Every other test inevalbench/test/(e.g.trajectory_matcher_test.py:7,tool_naming_test.py:9) prepends the parent dir tosys.pathsofrom generators.models...resolves:sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
Without this,
pytest evalbench/test/gemini_cli_test.pyfrom the repo root will fail withModuleNotFoundError: No module named 'generators'. Please add it, or run the test yourself from the repo root (uv run pytest evalbench/test/gemini_cli_test.py) to confirm it imports. -
print()statements throughout. There are ~40 prints used as narration. pytest captures stdout by default, so they only appear on failure or with-s, and they make the test body 3× longer than it needs to be. Behavior-level asserts already say what's being checked. Please strip them — for the rare case where a future debugger needs them back,pytest -vplus a good assertion message does the same job. -
try/except AssertionError: raise earound mock assertions. pytest's introspection already produces a clear diff forassert_called_once_with. Wrapping in try/except adds noise andraise e(vs bareraise) truncates the traceback. Replace with the bare assertion call.
Medium
-
Log-message coupling is brittle.
mock_logging.info.assert_any_call("Syncing skill: my-single-skill")will break the next time someone tweaks that log line. Themock_copytree.assert_called_once_with(...)assertion above it is the behavioral check that matters — drop the log assertion. -
Hardcoded
/home/jamesamnintest_setup_single_skill_string. Looks like a leftover from local iteration. Use a generic value like/fake/real_home(orstr(tmp_path)) — accidentally fine, but a reviewer notices.
Minor
-
Inconsistent env-mocking style. Test 1 uses
patch.dict(os.environ, {"HOME": ...}), test 2 usesmonkeypatch.setenv. Pick one —monkeypatchis the pytest-native idiom and is already in test 2. -
test_skill_content_preservedis really a setup-integration test. It does exercise content preservation, but it relies on_setup_npm_authand_setupnot blowing up either — worth a docstring noting the broader scope, or split out a narrower variant that calls_setup_skillsdirectly with a pre-built generator if you want a true unit test of the copy step. -
Worth covering at least one of the
dictskill_config branches (link/install/enable/disable/uninstall) — they're the lion's share of_setup_skillsand currently untested.subprocess.runis already mocked, so adding one is cheap.
The integration approach in test 2 is good — once you trim the prints and add the sys.path fix, that's a solid sandbox for further skill-setup tests.
Cleaned-up version
Here's the file with blocking + medium fixes applied — drop in if it's easier than editing piecemeal:
import os
import sys
import pytest
from unittest.mock import patch, MagicMock
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from generators.models.gemini_cli import GeminiCliGenerator
@patch('generators.models.gemini_cli.logging')
@patch('generators.models.gemini_cli.subprocess.run')
@patch('generators.models.gemini_cli.os.path.exists')
@patch('generators.models.gemini_cli.shutil.copytree')
@patch('generators.models.gemini_cli.shutil.rmtree')
def test_setup_single_skill_string(
mock_rmtree,
mock_copytree,
mock_exists,
mock_run,
mock_logging,
):
"""A string entry in setup.skills triggers a copytree from real to fake home."""
real_home = "/fake/real_home"
real_skill_path = os.path.join(real_home, ".gemini", "skills", "my-single-skill")
mock_exists.side_effect = lambda path: path == real_skill_path
mock_run.return_value = MagicMock(returncode=0, stdout="fake-token")
config = {"setup": {"skills": ["my-single-skill"]}}
with patch('generators.models.gemini_cli.os.makedirs'), \
patch('generators.models.gemini_cli.open', create=True), \
patch.dict(os.environ, {"HOME": real_home}):
GeminiCliGenerator(config)
expected_fake_home = os.path.abspath(os.path.join(".venv", "fake_home"))
expected_fake_skill_path = os.path.join(
expected_fake_home, ".gemini", "skills", "my-single-skill"
)
mock_copytree.assert_called_once_with(real_skill_path, expected_fake_skill_path)
@patch('generators.models.gemini_cli.subprocess.run')
def test_skill_content_preserved(mock_run, tmp_path, monkeypatch):
"""End-to-end: a skill file's contents survive copy into the fake-home sandbox.
Exercises the full GeminiCliGenerator.__init__ -> _setup -> _setup_skills
path with real filesystem operations confined to tmp_path. Only subprocess
(gcloud auth) is mocked.
"""
real_home = tmp_path / "real_home"
real_home.mkdir()
skill_name = "my-content-skill"
real_skill_dir = real_home / ".gemini" / "skills" / skill_name
real_skill_dir.mkdir(parents=True)
skill_file = real_skill_dir / "secret.txt"
expected_content = "hello world"
skill_file.write_text(expected_content)
mock_run.return_value = MagicMock(returncode=0, stdout="fake-token")
monkeypatch.chdir(tmp_path)
monkeypatch.setenv("HOME", str(real_home))
GeminiCliGenerator({"setup": {"skills": [skill_name]}})
expected_fake_skill_file = (
tmp_path / ".venv" / "fake_home" / ".gemini" / "skills" / skill_name / "secret.txt"
)
assert expected_fake_skill_file.exists(), (
f"Copied skill file not found at {expected_fake_skill_file}"
)
assert expected_fake_skill_file.read_text() == expected_content(Also swapped the "password is xyz" sample content for a benign "hello world" so the file doesn't read like a credential.)
| import os | ||
| import pytest | ||
| from unittest.mock import patch, MagicMock | ||
| from generators.models.gemini_cli import GeminiCliGenerator |
There was a problem hiding this comment.
Without sys.path.append(...) above this line, pytest from the repo root will fail with ModuleNotFoundError: No module named 'generators'. See other tests in this directory for the pattern.
| from generators.models.gemini_cli import GeminiCliGenerator | |
| from unittest.mock import patch, MagicMock | |
| import sys | |
| sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) | |
| from generators.models.gemini_cli import GeminiCliGenerator |
| """Test that a single skill defined as a string is copied to fake home.""" | ||
| print("\n--- STARTING TEST: test_setup_single_skill_string ---") | ||
|
|
||
| real_home = "/home/jamesamn" |
There was a problem hiding this comment.
Looks like a leftover from local iteration — please use a generic value.
| real_home = "/home/jamesamn" | |
| real_home = "/fake/real_home" |
| print("[Mock subprocess.run] Configured to simulate successful 'gcloud auth'") | ||
|
|
||
| # 3. Configure CWD and HOME env var to use temp directory | ||
| monkeypatch.chdir(tmp_path) |
There was a problem hiding this comment.
Minor: "password is xyz" reads like a credential when grep'd. Suggest a benign placeholder so future scanners don't flag the test file.
| monkeypatch.chdir(tmp_path) | |
| expected_content = "hello world" |
|
Also take a look at the style issues. |
Add a new integration test to verify that skill contents are preserved when copied to the fake home sandbox.
TAG=agy
CONV=a151643e-4527-4bbe-9908-a1f4f33be7e3