Skip to content

feat(modal-qwen-vl-quad): add footer-strip call to populate comments_raw#36

Merged
jakebromberg merged 1 commit into
mainfrom
modal-qwen-vl-quad-footer-call
May 11, 2026
Merged

feat(modal-qwen-vl-quad): add footer-strip call to populate comments_raw#36
jakebromberg merged 1 commit into
mainfrom
modal-qwen-vl-quad-footer-call

Conversation

@jakebromberg
Copy link
Copy Markdown
Member

Summary

  • Add a 6th transcribe_qwen_vl.remote() call against the footer strip (image[body_bottom_y:, :]) so modal-qwen-vl-quad populates PageResult.comments_raw instead of always emitting None.
  • Mirror the header call's pattern: inline FOOTER_WIRE_SCHEMA dict, one-line _crop_footer_strip helper, terse FOOTER_EXTRACTION_PROMPT (verbatim, JSON null for blank, do not transcribe row content from above the Comments line), same try/except so a footer parse failure leaves comments_raw=None without failing the page.
  • Tests: extend the three test_modal_qwen_vl_quad_* cases to expect 6 calls and assert the 6th payload is wired correctly; add a footer-failure regression analogous to the header-failure case; add a _crop_footer_strip unit test parallel to _crop_header_strip's; add contract tests for FOOTER_EXTRACTION_PROMPT parallel to the header-prompt tests.

Test plan

  • ruff check . passes
  • ruff format --check . passes
  • mypy core cli.py passes
  • pytest passes (339 passed, 1 deselected)
  • One-page Modal smoke run produces a PageResult with comments_raw set when the band has content, None when blank (user-driven)

Closes #33

The per-quadrant adapter assembled a PageResult from a header call plus four quadrant calls; none of those crops sees the bottom Comments band because `_crop_quadrants` stops at `layout.body_bottom_y` so the printed `Comments:` line does not bleed into the bottom-quadrant transcriptions. With `PageResult.comments_raw` landed in #32 the adapter therefore emitted `comments_raw=None` on every page even when the band had real content. Adds a sixth `transcribe_qwen_vl.remote()` call against `image[body_bottom_y:, :]` inside the existing `with app.run():` block, mirroring the header call's fault-tolerance shape so a footer parse failure leaves `comments_raw=None` without failing the page. Introduces `_crop_footer_strip(image, layout)`, `FOOTER_WIRE_SCHEMA` (analogous to `HEADER_WIRE_SCHEMA` — small, inline, no Pydantic indirection), and `FOOTER_EXTRACTION_PROMPT` (mirrors `HEADER_EXTRACTION_PROMPT`'s scoping discipline: capture verbatim, JSON null for blank, never invent, do not transcribe row content from above the Comments line). The Modal container is warm for calls 2-6, so the cost delta is one warm forward pass per page.

Closes #33
@jakebromberg jakebromberg merged commit c050012 into main May 11, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

modal-qwen-vl-quad: add footer-strip call to populate comments_raw

1 participant