MAINT: Detangle scorer LLM round-trip — ResponseHandler + internal helper (1/3) #2125

Open
romanlutz wants to merge 3 commits into microsoft:main from romanlutz:romanlutz-scorer-response-handler-refactor

@romanlutz

Contributor

This is step 1 of 3 in a scorer-architecture refactor — "Option 5: Scorer-owned composition". Full design notes: design gist.

Follow-up to #1867, which bolted a response_parser hook onto SelfAskTrueFalseScorer for LlamaGuard. That change surfaced that the base Scorer class conflates three concerns that should be separate: the evaluation mechanism, system-prompt construction, and response parsing all live on the base, so every non-LLM scorer inherits machinery it can never use.

Scope of this PR — internal refactor, no public API change

No behavior change and no constructor changes. This purely rehomes internals so the base Scorer stops carrying LLM-only machinery.

  • ResponseHandler abstraction (pyrit/score/response_handler.py) — a ResponseHandler ABC plus a default JsonSchemaResponseHandler that owns turning the target's raw text into the parsed UnvalidatedScore shape (score_value, rationale/description, metadata, category). JsonSchemaResponseHandler reproduces today's parsing exactly — json.loads(remove_markdown_json(...)) plus the same key lookups, category/metadata normalization, and InvalidJsonException/ValueError behavior.
  • Stateless round-trip helper (pyrit/score/llm_scoring.py) — a module-internal _run_llm_scoring_async(*, chat_target, system_prompt, response_handler, value, ...) that performs the LLM round-trip previously living in Scorer._score_value_with_llm_async: sets the system prompt on the target, sends the prompt, applies the existing @pyrit_json_retry behavior, and delegates parsing to the handler.
  • Base Scorer slimmed — the retry decorator, JSON parsing, and system-prompt handling move off the base. Scorer._score_value_with_llm_async is now a thin internal forwarder to the helper, kept only so the not-yet-migrated scorers keep working. It is deleted in step 3.
  • FloatScaleScorer override removed — the old override only added a float() post-check on the parsed value; that check is folded into the helper via a _score_value_is_numeric flag, preserving behavior.
  • SelfAskTrueFalseScorer wired to call the helper directly. Its public constructor and behavior are unchanged.
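
To make the handler split concrete, here is a minimal, self-contained sketch of what a ResponseHandler ABC with a JSON default could look like. It is an illustration, not the code in pyrit/score/response_handler.py: `ParsedScore`, `InvalidJsonError`, `strip_markdown_fence`, and the `parse` method name are hypothetical stand-ins for UnvalidatedScore, InvalidJsonException, remove_markdown_json, and whatever method the real ABC defines. The normalization rules shown are assumptions as well.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass, field

FENCE = "`" * 3  # markdown code-fence marker


class InvalidJsonError(ValueError):
    """Hypothetical stand-in for PyRIT's InvalidJsonException."""


@dataclass
class ParsedScore:
    """Hypothetical stand-in for the UnvalidatedScore shape."""
    score_value: str
    rationale: str
    description: str = ""
    metadata: dict = field(default_factory=dict)
    category: list = field(default_factory=list)


def strip_markdown_fence(text: str) -> str:
    """Simplified stand-in for remove_markdown_json: drop a fenced-json wrapper."""
    stripped = text.strip()
    if stripped.startswith(FENCE):
        # Drop the opening fence line (e.g. the fence followed by "json").
        stripped = stripped.split("\n", 1)[1] if "\n" in stripped else ""
        if stripped.rstrip().endswith(FENCE):
            stripped = stripped.rstrip()[: -len(FENCE)]
    return stripped.strip()


class ResponseHandler(ABC):
    """Turns the target's raw text into a parsed score."""

    @abstractmethod
    def parse(self, raw_text: str) -> ParsedScore:
        ...


class JsonSchemaResponseHandler(ResponseHandler):
    """Default handler: expects a JSON object with score_value and rationale."""

    def parse(self, raw_text: str) -> ParsedScore:
        try:
            data = json.loads(strip_markdown_fence(raw_text))
        except json.JSONDecodeError as exc:
            raise InvalidJsonError(f"Response is not valid JSON: {raw_text!r}") from exc
        if not isinstance(data, dict):
            raise InvalidJsonError("Expected a JSON object at the top level")
        try:
            score_value = str(data["score_value"])
            rationale = data["rationale"]
        except KeyError as exc:
            raise InvalidJsonError(f"Missing required key: {exc}") from exc
        # Assumed normalization: a single category string becomes a one-item list.
        category = data.get("category") or []
        if isinstance(category, str):
            category = [category]
        # Assumed normalization: non-dict metadata is wrapped under a "raw" key.
        metadata = data.get("metadata") or {}
        if not isinstance(metadata, dict):
            metadata = {"raw": metadata}
        return ParsedScore(
            score_value=score_value,
            rationale=rationale,
            description=data.get("description", ""),
            metadata=metadata,
            category=category,
        )
```

Under these assumptions, a scorer that needs a non-JSON format (the LlamaGuard case from #1867) would supply a different ResponseHandler subclass rather than overriding parsing on the base Scorer.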

The 3-step plan (per the gist)

  1. This PR — introduce the internal round-trip helper + ResponseHandler (JSON-schema default), rehome the _score_value_with_llm_async logic off the base Scorer, and wire SelfAskTrueFalseScorer. No public-API change.
  2. PR B — expose the composition on SelfAskTrueFalseScorer.__init__ (chat_target + system_prompt + response_handler) plus a TrueFalseQuestion value type; add the CallableResponseHandler escape hatch and pilot LlamaGuard. from_question_yaml becomes a deprecation shim.
  3. PR C — roll the pattern out to the remaining LLM scorers, one at a time, and delete the transitional base forwarder.
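
The round-trip helper introduced in step 1 can be pictured roughly as follows. This is a hedged sketch, not the contents of pyrit/score/llm_scoring.py: the `ChatTarget` protocol, its `set_system_prompt` and `send_async` methods, the plain retry loop standing in for @pyrit_json_retry, the callable handler, and `InvalidJsonError` are all assumptions made so the example runs on its own. What it does take from the PR is the shape: a stateless function that sets the system prompt, retries the send-and-parse step on invalid JSON, and applies the optional numeric check outside the retry.

```python
import asyncio
import json
from typing import Any, Callable, Optional, Protocol


class InvalidJsonError(Exception):
    """Hypothetical stand-in for PyRIT's InvalidJsonException."""


class ChatTarget(Protocol):
    """Assumed minimal target interface; the real target API differs."""

    def set_system_prompt(self, system_prompt: str) -> None: ...
    async def send_async(self, prompt: str) -> str: ...


async def _send_and_parse_async(
    *, chat_target: ChatTarget, response_handler: Callable[[str], dict],
    value: str, max_attempts: int,
) -> dict:
    """Send the prompt and parse the reply, retrying only on invalid JSON."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: Optional[InvalidJsonError] = None
    for _ in range(max_attempts):
        raw = await chat_target.send_async(value)
        try:
            return response_handler(raw)
        except InvalidJsonError as exc:
            last_error = exc  # retry with a fresh request
    assert last_error is not None
    raise last_error


async def run_llm_scoring_sketch(
    *, chat_target: ChatTarget, system_prompt: str,
    response_handler: Callable[[str], dict], value: str,
    score_value_is_numeric: bool = False, max_attempts: int = 3,
) -> dict:
    """Stateless round-trip: system prompt, send, retry, parse, optional check."""
    chat_target.set_system_prompt(system_prompt)
    parsed = await _send_and_parse_async(
        chat_target=chat_target, response_handler=response_handler,
        value=value, max_attempts=max_attempts,
    )
    if score_value_is_numeric:
        # Runs outside the retry loop: a non-numeric value raises ValueError
        # without triggering another request.
        float(parsed["score_value"])
    return parsed


class FakeTarget:
    """In-memory target for demonstration; replays canned replies in order."""

    def __init__(self, replies: list) -> None:
        self.replies = list(replies)
        self.system_prompt: Optional[str] = None
        self.calls = 0

    def set_system_prompt(self, system_prompt: str) -> None:
        self.system_prompt = system_prompt

    async def send_async(self, prompt: str) -> str:
        self.calls += 1
        return self.replies.pop(0)


def json_handler(raw: str) -> Any:
    """Tiny handler: parse JSON or raise the retryable error."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        raise InvalidJsonError(str(exc)) from exc
```

Calling `asyncio.run(run_llm_scoring_sketch(chat_target=FakeTarget([...]), system_prompt=..., response_handler=json_handler, value=...))` exercises the flow without a real model. Because the function holds no state, a scorer can call it directly, which is how the PR describes SelfAskTrueFalseScorer being wired.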

Composites/wrappers (TrueFalseCompositeScorer, Audio/VideoTrueFalseScorer, ConversationScorer, …) and all non-LLM scorers are unchanged.

Testing

  • uv run pytest tests/unit/score: 1232 passed, 16 skipped.
  • ruff, ty, and pre-commit all clean.

Copilot AI added 3 commits July 1, 2026 23:40
…_async

PR A of the scorer-architecture refactor (Option 5: scorer-owned composition).
Moves the LLM evaluation mechanism and JSON response parsing off the base Scorer
class so the base no longer carries LLM-only machinery (retry, system-prompt
setting, JSON parsing).

- Add pyrit/score/response_handler.py: ResponseHandler ABC + JsonSchemaResponseHandler
  (response parsing) reproducing the existing JSON parsing exactly.
- Add pyrit/score/llm_scoring.py: stateless module-level run_llm_scoring_async
  (evaluation mechanism) wrapping an inner @pyrit_json_retry round-trip; the optional
  numeric-value float check runs outside the retry, preserving the old
  FloatScaleScorer behavior.
- Convert Scorer._score_value_with_llm_async into a deprecated thin shim that forwards
  to run_llm_scoring_async (removed_in 0.17.0). Signature unchanged, so the eight
  not-yet-migrated scorers keep working (migrated in PR C).
- Remove the FloatScaleScorer._score_value_with_llm_async override; replace it with a
  _score_value_is_numeric class flag the shim reads to apply the float check.
- Wire SelfAskTrueFalseScorer to call run_llm_scoring_async directly. Public
  constructor and behavior unchanged.
- Export ResponseHandler, JsonSchemaResponseHandler, run_llm_scoring_async from
  pyrit.score (additive).
- Update test_scorer_remove_markdown_json_called patch target to the new module.

No public API change. All existing score tests pass; lints clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

For the upcoming v1.0.0 clean breaking change, PR A keeps `chat_target`
(renaming it to `target` was an unnecessary break) and ships no deprecation
warnings.

- run_llm_scoring_async / _send_and_parse_async: rename the `target` parameter
  back to `chat_target`; update both call sites (the base Scorer forwarder and
  SelfAskTrueFalseScorer).
- Scorer._score_value_with_llm_async: remove the print_deprecation_message call
  and its import. It is now a plain, warning-free internal forwarder kept only
  as a transitional shim until PR C migrates the remaining scorers and deletes
  it entirely.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

PR A is an internal refactor with no public API change, so the shared
round-trip helper should not look like public API. Rename
run_llm_scoring_async -> _run_llm_scoring_async (matching the already-private
_send_and_parse_async) and drop it from pyrit.score's __init__ import and
__all__. Both internal callers import it directly from pyrit.score.llm_scoring.

ResponseHandler / JsonSchemaResponseHandler stay exported as the new
composition abstraction.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@romanlutz romanlutz changed the title from "Scorer architecture refactor (1/3): ResponseHandler + LLM round-trip helper" to "MAINT: Detangle scorer LLM round-trip — ResponseHandler + internal helper (1/3)" Jul 3, 2026