Correct content-awareness framing with a direct content-specificity eval by SolshineCode · Pull Request #16 · SolshineCode/nla-gemma-4-e2b

SolshineCode · 2026-06-01T07:20:21Z

What this does

Adds a direct content-specificity retrieval eval of the v0.1 AV and corrects the published content-awareness framing based on what it shows.

The eval

Tests whether the AV text recovers the source document it came from (the prior "content-blind" claim only measured the AR round-trip). Doc-level retrieval (chance 1/13) + 5-way LLM-judge forced choice (chance 0.20), 5000-iter permutation nulls.
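As a rough illustration of the doc-level retrieval with a permutation null described above, here is a minimal sketch. It assumes a similarity matrix whose row i is an AV output and whose column i is the source text it came from; `doc_top1`, `permutation_p`, and the synthetic data are hypothetical names and inputs, not the repo's `perm()` implementation.

```python
import numpy as np


def doc_top1(sim, row_doc, col_doc):
    """Fraction of rows whose best-matching column belongs to the row's own document."""
    best = sim.argmax(axis=1)
    return float(np.mean(col_doc[best] == row_doc))


def permutation_p(sim, doc, n_iter=5000, seed=0):
    """One-sided permutation p-value for doc-level top-1 retrieval.

    Assumption: the null shuffles which document each AV row is credited to,
    which breaks the row-to-source pairing while keeping the similarity matrix fixed.
    """
    rng = np.random.default_rng(seed)
    observed = doc_top1(sim, doc, doc)
    null = np.array([doc_top1(sim, rng.permutation(doc), doc) for _ in range(n_iter)])
    p = (1 + np.sum(null >= observed)) / (1 + n_iter)
    return observed, float(p)


# Synthetic demo: 13 documents, 4 rows each (chance top-1 is 1/13 when docs are balanced).
rng = np.random.default_rng(1)
doc = np.repeat(np.arange(13), 4)
noise_sim = rng.normal(size=(len(doc), len(doc)))
signal_sim = noise_sim + 5.0 * (doc[:, None] == doc[None, :])

print("noise :", permutation_p(noise_sim, doc, n_iter=1000))
print("signal:", permutation_p(signal_sim, doc, n_iter=1000))
```

The add-one correction in the p-value keeps it strictly positive, so a finite number of permutations never reports p = 0.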

All probes are at chance: TF-IDF word/char, semantic (MiniLM), tail-window, non-template subset, Claude Haiku judge (0.24, n=50), Claude Sonnet judge (0.267, n=30). The semantic probe and the any-connection judge prompt together rule out the "real-but-non-topical feature" rescue (personality / event / tone).
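To make "at chance" concrete for the 5-way forced choice, the judge accuracies can be checked with an exact one-sided binomial tail against p = 0.20. This is a sketch only: the correct counts 12/50 and 8/30 are inferred here from the reported 0.24 and 0.267 and are an assumption, and `binom_tail` is a hypothetical helper rather than part of the eval scripts.

```python
from math import comb


def binom_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


CHANCE = 0.20  # 5-way forced choice

# Assumed counts, back-computed from the reported accuracies (0.24 at n=50, 0.267 at n=30).
runs = {"haiku": (12, 50), "sonnet": (8, 30)}

for name, (correct, n) in runs.items():
    p_value = binom_tail(correct, n, CHANCE)
    print(f"{name}: acc={correct / n:.3f} p(X>={correct})={p_value:.3f}")
```

A p-value well above 0.05 in both runs is what "no signal above chance" looks like at these sample sizes; it does not by itself rule out a small effect that n=30 or n=50 cannot detect.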

Doc corrections

Replaces the mild "theme-correct" overclaim with "format/genre-plausible but not per-row content- or theme-discriminative" in README, MODEL_CARD_AV, and a new RELEASE_CALIBRATION addendum. The "detail-confabulated" half stands.
Keeps the genuine nuance: v0.1 output is diverse (45/50 unique strings); the diversity is decoupled from source content.
Notes the one honest scope-limit (a feature constant across all 13 docs would be invisible to this eval).
Fixes the v0.0.1 AV SFT-step count in MODEL_CARD_AV: 15 → 55 total (15 base + 40 continuation), grad_accum 16 → 4, restoring the value dropped in the 2026-05-17 retraction rewrite (commit 8dd0a99).

Data

Bundles all eval scripts, inputs (rl.parquet, AV outputs), and per-trial LLM-judge data under experiments/v8_nla_local/ so the suite is reproducible from this repo. Full writeup: experiments/v8_nla_local/CONTENT_SPECIFICITY_EVAL.md.

🤖 Generated with Claude Code

A direct retrieval eval (does each AV output recover its own source document?) puts the v0.1 AV at chance across lexical, semantic, and two LLM-judge probes. So the output is format/genre-plausible but NOT per-row content- or theme-discriminative. Corrects the mild 'theme-correct' overclaim in README / model card / calibration doc; the 'detail-confabulated' half stands. Nuance kept: v0.1 output is diverse (45/50 unique strings); the diversity is decoupled from source content. Scope-limit noted (a feature constant across all 13 docs would be invisible to this eval). Bundles all eval scripts, inputs (rl.parquet, AV outputs), per-trial LLM-judge data, and CONTENT_SPECIFICITY_EVAL.md under experiments/v8_nla_local/ so the suite is reproducible from this repo. Also fixes v0.0.1 AV SFT-step count in MODEL_CARD_AV (15 -> 55 total = 15 base + 40 continuation; grad_accum 16 -> 4), restoring the value dropped in the 2026-05-17 retraction rewrite (commit 8dd0a99). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

SolshineCode · 2026-06-01T07:20:27Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a comprehensive content-specificity retrieval evaluation suite to determine if the v0.1 Natural Language Autoencoder (NLA) text outputs carry any recoverable content signal from their source documents. It adds several evaluation scripts, documentation, and results demonstrating that the NLA outputs are format-plausible but not content-discriminative. The feedback on the new Python scripts focuses on improving robustness and resource management. Specifically, it is recommended to check subprocess return codes in the LLM judge script to prevent silent failures, use context managers (with statements) for all file read/write operations, and gracefully handle the potential absence of the sentence_transformers package in the refinement script.

gemini-code-assist · 2026-06-01T07:22:09Z

+def call_judge_claude(prompt, model=MODEL):
+    claude = shutil.which("claude") or "claude"
+    proc = subprocess.run(
+        [claude, "-p", "--model", model],
+        input=prompt, capture_output=True, text=True, cwd=HOME, timeout=300,
+        encoding="utf-8", errors="replace",
+    )
+    return proc.stdout.strip()
+
+
+def call_judge_gemini(prompt):
+    # Antigravity (Gemini 3.5 Flash) via the gemini-collab wrapper; prompt passed as argv.
+    proc = subprocess.run(
+        [sys.executable, GEMINI_WRAPPER, "--prompt", prompt, "--timeout", "300"],
+        capture_output=True, text=True, encoding="utf-8", errors="replace", timeout=360,
+    )
+    return proc.stdout.strip()


The subprocess calls to the LLM judges currently fail silently if the command returns a non-zero exit code or fails to produce a valid response. This leads to silent failures where the script continues executing with empty results, eventually writing a report claiming 'no signal above chance' with 0 trials (as seen in the generated content_specificity_judge_gemini.json file). Adding a check for proc.returncode and raising an error with proc.stderr will prevent these silent failures and make debugging much easier.

def call_judge_claude(prompt, model=MODEL):
    claude = shutil.which("claude") or "claude"
    proc = subprocess.run(
        [claude, "-p", "--model", model],
        input=prompt, capture_output=True, text=True, cwd=HOME, timeout=300,
        encoding="utf-8", errors="replace",
    )
    if proc.returncode != 0:
        raise RuntimeError(f"Claude judge failed with exit code {proc.returncode}: {proc.stderr}")
    return proc.stdout.strip()


def call_judge_gemini(prompt):
    # Antigravity (Gemini 3.5 Flash) via the gemini-collab wrapper; prompt passed as argv.
    proc = subprocess.run(
        [sys.executable, GEMINI_WRAPPER, "--prompt", prompt, "--timeout", "300"],
        capture_output=True, text=True, encoding="utf-8", errors="replace", timeout=360,
    )
    if proc.returncode != 0:
        raise RuntimeError(f"Gemini judge failed with exit code {proc.returncode}: {proc.stderr}")
    return proc.stdout.strip()

gemini-code-assist · 2026-06-01T07:22:09Z

+def load():
+    df = pd.read_parquet(RL)
+    av = json.load(open(AV, encoding="utf-8"))["our_outputs"]


The file AV is opened using open() but is never explicitly closed, which can lead to resource leaks. It is recommended to use a with statement (context manager) to ensure the file is properly closed after reading.

Suggested change

-def load():
-    df = pd.read_parquet(RL)
-    av = json.load(open(AV, encoding="utf-8"))["our_outputs"]
+def load():
+    df = pd.read_parquet(RL)
+    with open(AV, encoding="utf-8") as f:
+        av = json.load(f)["our_outputs"]

gemini-code-assist · 2026-06-01T07:22:09Z

+        results["config"]["semantic"] = "sentence-transformers not installed; skipped"
+        print("[semantic] skipped (sentence-transformers not installed)")
+
+    json.dump(results, open(OUT, "w", encoding="utf-8"), indent=1)


The output file OUT is opened for writing without a context manager, leaving the file descriptor open. Using a with statement ensures the file is closed properly.

Suggested change

-    json.dump(results, open(OUT, "w", encoding="utf-8"), indent=1)
+    with open(OUT, "w", encoding="utf-8") as f:
+        json.dump(results, f, indent=1)

gemini-code-assist · 2026-06-01T07:22:09Z

+def build_trials():
+    df = pd.read_parquet(RL)
+    av = json.load(open(AV, encoding="utf-8"))["our_outputs"]


The file AV is opened using open() but is never explicitly closed. Using a with statement ensures the file is properly closed after reading.

Suggested change

-def build_trials():
-    df = pd.read_parquet(RL)
-    av = json.load(open(AV, encoding="utf-8"))["our_outputs"]
+def build_trials():
+    df = pd.read_parquet(RL)
+    with open(AV, encoding="utf-8") as f:
+        av = json.load(f)["our_outputs"]

gemini-code-assist · 2026-06-01T07:22:09Z

+        "per_trial": [{"trial_id": tid, "choice": ch, "true_pos": tp, "correct": ch == tp}
+                      for tid, ch, tp in scored],
+    }
+    json.dump(result, open(OUT, "w", encoding="utf-8"), indent=1)


The output file OUT is opened for writing without a context manager. Using a with statement ensures the file is closed properly.

    with open(OUT, "w", encoding="utf-8") as f:
        json.dump(result, f, indent=1)

gemini-code-assist · 2026-06-01T07:22:09Z

+def main():
+    df = pd.read_parquet(RL)
+    av_all = json.load(open(AV, encoding="utf-8"))["our_outputs"]


The file AV is opened using open() but is never explicitly closed. Using a with statement ensures the file is properly closed after reading.

Suggested change

-def main():
-    df = pd.read_parquet(RL)
-    av_all = json.load(open(AV, encoding="utf-8"))["our_outputs"]
+def main():
+    df = pd.read_parquet(RL)
+    with open(AV, encoding="utf-8") as f:
+        av_all = json.load(f)["our_outputs"]

gemini-code-assist · 2026-06-01T07:22:09Z

+              f"nontmpl top1={r['R2_nontemplate_tail']['doc_top1']:.3f}(p{r['R2_nontemplate_tail']['top1_p']:.2f}) "
+              f"| gap_p full={r['R1_full_doc']['gap_p']:.2f} tail={r['R1_tail_window']['gap_p']:.2f} nontmpl={r['R2_nontemplate_tail']['gap_p']:.2f}")
+
+    json.dump(out, open(OUT, "w", encoding="utf-8"), indent=1)


The output file OUT is opened for writing without a context manager. Using a with statement ensures the file is closed properly.

Suggested change

-    json.dump(out, open(OUT, "w", encoding="utf-8"), indent=1)
+    with open(OUT, "w", encoding="utf-8") as f:
+        json.dump(out, f, indent=1)

gemini-code-assist · 2026-06-01T07:22:09Z

+def make_sim(av, src, kind):
+    if kind == "tfidf":
+        vec = TfidfVectorizer(ngram_range=(1, 2), min_df=1, sublinear_tf=True,
+                              stop_words="english").fit(src + av)
+        return cosine_similarity(vec.transform(av), vec.transform(src))
+    if kind == "semantic":
+        from sentence_transformers import SentenceTransformer
+        model = SentenceTransformer("all-MiniLM-L6-v2")
+        ea = model.encode(av, normalize_embeddings=True, show_progress_bar=False)
+        es = model.encode(src, normalize_embeddings=True, show_progress_bar=False)
+        return ea @ es.T


Unlike eval_content_specificity.py, this script does not handle the absence of sentence_transformers gracefully. If the package is not installed, running this script will crash with a ModuleNotFoundError. Wrapping the import in a try-except block and returning None allows the script to skip the semantic evaluation gracefully, matching the behavior of the main evaluation script.

Suggested change

-def make_sim(av, src, kind):
-    if kind == "tfidf":
-        vec = TfidfVectorizer(ngram_range=(1, 2), min_df=1, sublinear_tf=True,
-                              stop_words="english").fit(src + av)
-        return cosine_similarity(vec.transform(av), vec.transform(src))
-    if kind == "semantic":
-        from sentence_transformers import SentenceTransformer
-        model = SentenceTransformer("all-MiniLM-L6-v2")
-        ea = model.encode(av, normalize_embeddings=True, show_progress_bar=False)
-        es = model.encode(src, normalize_embeddings=True, show_progress_bar=False)
-        return ea @ es.T
+def make_sim(av, src, kind):
+    if kind == "tfidf":
+        vec = TfidfVectorizer(ngram_range=(1, 2), min_df=1, sublinear_tf=True,
+                              stop_words="english").fit(src + av)
+        return cosine_similarity(vec.transform(av), vec.transform(src))
+    if kind == "semantic":
+        try:
+            from sentence_transformers import SentenceTransformer
+        except ImportError:
+            return None
+        model = SentenceTransformer("all-MiniLM-L6-v2")
+        ea = model.encode(av, normalize_embeddings=True, show_progress_bar=False)
+        es = model.encode(src, normalize_embeddings=True, show_progress_bar=False)
+        return ea @ es.T

gemini-code-assist · 2026-06-01T07:22:09Z

+    for kind in ["tfidf", "semantic"]:
+        # R1: full vs tail-window source
+        sim_full = make_sim(av_all, full_src, kind)
+        sim_tail = make_sim(av_all, tail_src, kind)
+        # R2: non-template subset, tail source
+        av_k = [av_all[i] for i in keep]
+        tail_k = [tail_src[i] for i in keep]
+        doc_k = [doc[i] for i in keep]
+        sim_k = make_sim(av_k, tail_k, kind)
+        out["results"][kind] = {
+            "R1_full_doc": perm(sim_full, doc),
+            "R1_tail_window": perm(sim_tail, doc),
+            "R2_nontemplate_tail": perm(sim_k, doc_k),
+        }
+        r = out["results"][kind]
+        print(f"[{kind}] full top1={r['R1_full_doc']['doc_top1']:.3f}(p{r['R1_full_doc']['top1_p']:.2f}) "
+              f"tail top1={r['R1_tail_window']['doc_top1']:.3f}(p{r['R1_tail_window']['top1_p']:.2f}) "
+              f"nontmpl top1={r['R2_nontemplate_tail']['doc_top1']:.3f}(p{r['R2_nontemplate_tail']['top1_p']:.2f}) "
+              f"| gap_p full={r['R1_full_doc']['gap_p']:.2f} tail={r['R1_tail_window']['gap_p']:.2f} nontmpl={r['R2_nontemplate_tail']['gap_p']:.2f}")


To support graceful skipping of the semantic evaluation when sentence_transformers is not installed, update the loop in main to check if sim_full is None and skip the current iteration if so.

    for kind in ["tfidf", "semantic"]:
        # R1: full vs tail-window source
        sim_full = make_sim(av_all, full_src, kind)
        if sim_full is None:
            print(f"[{kind}] skipped (sentence-transformers not installed)")
            continue
        sim_tail = make_sim(av_all, tail_src, kind)
        # R2: non-template subset, tail source
        av_k = [av_all[i] for i in keep]
        tail_k = [tail_src[i] for i in keep]
        doc_k = [doc[i] for i in keep]
        sim_k = make_sim(av_k, tail_k, kind)
        out["results"][kind] = {
            "R1_full_doc": perm(sim_full, doc),
            "R1_tail_window": perm(sim_tail, doc),
            "R2_nontemplate_tail": perm(sim_k, doc_k),
        }
        r = out["results"][kind]
        print(f"[{kind}] full top1={r['R1_full_doc']['doc_top1']:.3f}(p{r['R1_full_doc']['top1_p']:.2f}) "
              f"tail top1={r['R1_tail_window']['doc_top1']:.3f}(p{r['R1_tail_window']['top1_p']:.2f}) "
              f"nontmpl top1={r['R2_nontemplate_tail']['doc_top1']:.3f}(p{r['R2_nontemplate_tail']['top1_p']:.2f}) "
              f"| gap_p full={r['R1_full_doc']['gap_p']:.2f} tail={r['R1_tail_window']['gap_p']:.2f} nontmpl={r['R2_nontemplate_tail']['gap_p']:.2f}")

gemini-code-assist

Code Review

This pull request introduces a comprehensive content-specificity retrieval evaluation suite for the v0.1 Natural Language Autoencoder (NLA), adding several evaluation scripts, documentation, and benchmark results. The findings show that while the AV outputs are diverse, they perform at chance in recovering source-document content, prompting updates to the README, model card, and calibration notes. The reviewer feedback is highly constructive, recommending robustness and portability improvements such as handling subprocess timeouts gracefully in the LLM-judge script, avoiding hardcoded home directory paths, and utilizing context managers for safer file I/O operations.

gemini-code-assist · 2026-06-01T07:23:36Z

+    for b in range(0, len(trials), batch_size):
+        batch = trials[b:b + batch_size]
+        prompt = make_prompt(batch)
+        resp = call_judge(prompt)
+        parsed = parse(resp)
+        choices.update(parsed)
+        with open(RAW, "a", encoding="utf-8") as f:
+            f.write(json.dumps({"batch_start": b, "prompt_chars": len(prompt),
+                                "response": resp, "parsed": parsed}) + "\n")
+        print(f"batch {b}-{b+len(batch)}: parsed {len(parsed)}/{len(batch)}")


The LLM judge API call can occasionally time out or fail due to network issues, rate limits, or subprocess hangs (as noted in the Sonnet run crash). Currently, any exception (such as subprocess.TimeoutExpired) raised during call_judge will crash the entire script, causing all previously completed batches in the loop to be lost since the final results are never written to OUT.

Wrapping the call_judge and parsing logic in a try-except block and breaking/continuing gracefully will allow the script to save the completed trials to OUT even if a subsequent batch fails.

    for b in range(0, len(trials), batch_size):
        batch = trials[b:b + batch_size]
        prompt = make_prompt(batch)
        try:
            resp = call_judge(prompt)
            parsed = parse(resp)
            choices.update(parsed)
            with open(RAW, "a", encoding="utf-8") as f:
                f.write(json.dumps({"batch_start": b, "prompt_chars": len(prompt),
                                    "response": resp, "parsed": parsed}) + "\n")
            print(f"batch {b}-{b+len(batch)}: parsed {len(parsed)}/{len(batch)}")
        except subprocess.TimeoutExpired as e:
            print(f"Timeout on batch {b}-{b+len(batch)}: {e}")
            break
        except Exception as e:
            print(f"Error on batch {b}-{b+len(batch)}: {e}")
            break

gemini-code-assist · 2026-06-01T07:23:36Z

+GEMINI_WRAPPER = os.path.join(HOME, ".claude", "skills", "gemini-collab",
+                              "scripts", "gemini_client.py")


Hardcoding the path to gemini_client.py inside the user's home directory (~/.claude/skills/...) makes the script non-portable and fragile across different developer environments or CI/CD pipelines. Consider allowing this path to be overridden via an environment variable or command-line argument, with the current path as a default fallback.

GEMINI_WRAPPER = os.environ.get(
    "GEMINI_WRAPPER",
    os.path.join(HOME, ".claude", "skills", "gemini-collab", "scripts", "gemini_client.py")
)

gemini-code-assist · 2026-06-01T07:23:36Z

+
+def load():
+    df = pd.read_parquet(RL)
+    av = json.load(open(AV, encoding="utf-8"))["our_outputs"]


Opening files without a with statement (context manager) leaves the file descriptor open until garbage collection runs. It is safer and more idiomatic to use with open(...) to guarantee the file is closed properly.

    with open(AV, encoding="utf-8") as f:
        av = json.load(f)["our_outputs"]

gemini-code-assist · 2026-06-01T07:23:36Z

+        results["config"]["semantic"] = "sentence-transformers not installed; skipped"
+        print("[semantic] skipped (sentence-transformers not installed)")
+
+    json.dump(results, open(OUT, "w", encoding="utf-8"), indent=1)


Using a context manager (with statement) ensures that the output file is properly closed and flushed, even if an exception occurs during writing.

    with open(OUT, "w", encoding="utf-8") as f:
        json.dump(results, f, indent=1)

… activation Follow-up to the content-specificity eval. A ceiling test on the RAW L23 activations recovers the source doc well above chance (retrieval 0.24 vs 0.077, p=0.0006; logistic probe 60% on 13-way doc id), so the content the AV fails to surface IS present in the activation. The per-row gap is the verbalizer's, and is fixable with better AV training rather than an intrinsic 2B-L23 ceiling. This also refutes the polysemanticity-at-2B reading for doc-level content, and validates the retrieval method (strong signal in activation, none in AV output). 8-layer sweep: every layer content-rich; L23 middling (0.350), L17 best (0.805), so a future NLA could also retarget the layer. Updates README + RELEASE_CALIBRATION framing (more accurate and more hopeful) and bundles the new scripts + result JSONs (inject probe, activation ceiling, layer sweep). Adds a single-token forced-choice injection probe (also at chance, non-degenerate). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
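For readers unfamiliar with the ceiling test mentioned in this commit, the following is a minimal sketch of a linear (logistic) probe for 13-way document id, scored by cross-validation against the 1/13 chance rate. The activations here are synthetic stand-ins, and the dimensions, noise level, and fold count are all assumptions for illustration; the repo's actual probe script may differ.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for per-row activations: each document gets its own mean vector.
rng = np.random.default_rng(0)
n_docs, rows_per_doc, dim = 13, 10, 64  # hypothetical sizes, not the repo's
doc_id = np.repeat(np.arange(n_docs), rows_per_doc)
doc_means = rng.normal(size=(n_docs, dim))
acts = doc_means[doc_id] + rng.normal(size=(len(doc_id), dim))

# Linear probe: multinomial logistic regression, evaluated on held-out rows.
probe = LogisticRegression(max_iter=2000)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
acc = cross_val_score(probe, acts, doc_id, cv=cv).mean()

chance = 1 / n_docs
print(f"probe accuracy={acc:.3f} vs chance={chance:.3f}")
```

Held-out accuracy well above 1/13 indicates that document identity is linearly decodable from the vectors, which is the sense in which the information "is present in the activation" even when the AV text does not surface it.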

…m is real) The ceiling result reframes the content gap as a verbalizer-training problem, not a model-scale ceiling (60% linear-probe doc id on the raw L23 activation). Update the bottom-line section so the summary carries both halves: the v0.1 AV does not yet read per-row content, AND the information is there to be read. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

gemini-code-assist Bot reviewed Jun 1, 2026


SolshineCode and others added 2 commits June 1, 2026 01:29

SolshineCode merged commit d58b340 into main Jun 2, 2026

SolshineCode mentioned this pull request Jun 3, 2026

Reframe public tone: verbalizer training gap, not content-blind #17

Closed
