Problem
The per-quadrant adapter (modal-qwen-vl-quad) assembles a PageResult from a header-strip call plus four quadrant calls. None of those crops sees the bottom Comments: band — _crop_quadrants explicitly stops at layout.body_bottom_y so the comments line doesn't bleed into the bottom-quadrant transcriptions (scripts/calibrate_models.py L605-L631). With GeminiPageResult.comments_raw now landed (#31, PR #32), the adapter therefore emits comments_raw=None on every page even when the band has real content. The Gemini production path picks the field up because it sees the whole page; only modal-qwen-vl-quad has the gap.
Desired end state
modal-qwen-vl-quad adds a 6th remote call against a footer-strip crop (image[body_bottom_y:, :]) and populates PageResult.comments_raw from the result. On parse failure or empty band, comments_raw falls back to None — consistent with how the header call handles missing page_date_raw.
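The fallback rule above can be sketched as a small parser. This is a minimal illustration, not the adapter's actual code: the function name parse_comments_raw is hypothetical, and treating a whitespace-only band the same as a blank one is an assumption about what "empty band" should mean.

```python
import json
from typing import Optional


def parse_comments_raw(response_text: str) -> Optional[str]:
    """Return the footer band's text, or None on parse failure or empty band.

    Hypothetical helper: the name and the whitespace-only rule are assumptions.
    """
    try:
        payload = json.loads(response_text)
    except (json.JSONDecodeError, TypeError):
        return None  # malformed model output falls back to None
    if not isinstance(payload, dict):
        return None  # valid JSON, but not the expected object shape
    value = payload.get("comments_raw")
    if not isinstance(value, str) or not value.strip():
        return None  # JSON null, wrong type, or whitespace-only band
    return value  # returned verbatim, with no trimming or normalization
```

Returning the string untouched keeps the result comparable with the full-page path, which reports the band verbatim.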
Files
scripts/calibrate_models.py — make_modal_qwen_vl_quad_adapter (L411-L505), _crop_header_strip/_crop_quadrants (L599-L631). Add a _crop_footer_strip(image, layout) helper that crops (0, layout.body_bottom_y, w, h), a FOOTER_WIRE_SCHEMA (analogous to HEADER_WIRE_SCHEMA at L403, just {"comments_raw": str|null}), and a FOOTER_EXTRACTION_PROMPT in core/prompts.py that mirrors HEADER_EXTRACTION_PROMPT's scoping discipline (capture verbatim, JSON null for blank, don't transcribe row content that bled in from above).
core/prompts.py — add FOOTER_EXTRACTION_PROMPT. Three top-level prompts becomes four; the module docstring at the top needs a one-line update.
core/page_layout.py — read-only reference. PageLayout.body_bottom_y already exists (L106) and is the load-bearing boundary; no schema change needed there.
tests/unit/test_calibrate_models.py — extend the existing adapter dispatch tests (test_modal_qwen_vl_quad_*) so the mocked .remote() is called 6 times (was 5); add a footer-failure test analogous to the header-failure case.
tests/unit/test_prompts.py — contract tests for FOOTER_EXTRACTION_PROMPT (names comments_raw, says verbatim, says JSON null for blank, scopes to footer band, forbids invented content).
Suggested approach
- Failing test first for _crop_footer_strip: a generated PIL.Image.new with a painted marker in the footer band; assert the crop captures it and excludes the body grid.
- Add FOOTER_WIRE_SCHEMA + FOOTER_EXTRACTION_PROMPT. Mirror header-call style: short prompt, page-scoped instructions, "do not transcribe row content from above the Comments line."
- Wire the 6th .remote() call inside the existing with app.run(): block, between the header call and the quadrant loop (or after the loop; it doesn't matter functionally, but placing it next to the header call keeps the "page-level fields" cluster together). Same exception-tolerant pattern as the header call.
- Set comments_raw=... on the constructed PageResult.
The Modal container is warm for calls 2-6, so the cost delta is one warm forward pass per page (~5-10s, fractions of a cent). No new cold-start risk.
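The call ordering and failure isolation described in the steps above might look roughly like this. It is a sketch only: remote is a hypothetical stand-in for the Modal .remote() call, tolerant and run_page_calls are illustrative names, and catching a broad Exception is an assumption about what the existing header call's try/except does.

```python
from typing import Any, Callable, Dict, List, Optional, TypeVar

T = TypeVar("T")


def tolerant(call: Callable[[], T]) -> Optional[T]:
    """Run a page-level call; swallow any failure and report None instead."""
    try:
        return call()
    except Exception:  # broad on purpose: one bad strip must not sink the page
        return None


def run_page_calls(
    remote: Callable[[str, Any], str],  # hypothetical stand-in for .remote()
    header_crop: Any,
    footer_crop: Any,
    quadrant_crops: List[Any],
) -> Dict[str, Any]:
    """Issue 1 header + 1 footer + N quadrant calls, page-level calls first."""
    header_response = tolerant(lambda: remote("header", header_crop))
    footer_response = tolerant(lambda: remote("footer", footer_crop))
    # Quadrant calls are left unwrapped here; how the real adapter handles
    # their failures is not specified in this issue.
    quadrant_responses = [remote("quadrant", crop) for crop in quadrant_crops]
    return {
        "header_response": header_response,
        "footer_response": footer_response,
        "quadrant_responses": quadrant_responses,
    }
```

With four quadrant crops this makes six calls per page, and a footer failure surfaces as footer_response=None while the header and quadrant results stay intact, which is the behavior the footer-failure unit test would assert.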
Constraints
- Don't regress the header / quadrant calls. Wire the new call so a footer parse failure leaves the rest of the page intact — same try/except pattern as the header call.
- Keep prompt fences sharp. The footer crop will overlap the bottom-quadrant baseline by a few pixels in skewed scans (just like the header strip overlaps the top-quadrant baseline). The prompt must tell the model to ignore content above the Comments: line. Otherwise the model will helpfully transcribe the last row of the bottom quadrants into comments_raw.
- Match the page-level adapter's behavior. Gemini's full-page path returns the comments contents verbatim and null for blank; modal-qwen-vl-quad must end up doing the same so calibration A/B comparisons are apples-to-apples.
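To make the prompt-fence constraint concrete, here is one hypothetical draft of the prompt together with a contract check of the kind tests/unit/test_prompts.py could run. The prompt wording is invented for illustration and is not the text of HEADER_EXTRACTION_PROMPT or any existing prompt; the substring needles are likewise assumptions about how the contract tests phrase their checks.

```python
from typing import Dict, List

# Hypothetical draft wording; the real prompt should mirror
# HEADER_EXTRACTION_PROMPT, whose text is not shown in this issue.
FOOTER_EXTRACTION_PROMPT = (
    "You are looking at the footer band of a scanned page.\n"
    "Transcribe verbatim only the text that follows the 'Comments:' label.\n"
    "Ignore any row content visible above the Comments: line.\n"
    "Do not invent, infer, or complete text.\n"
    'Return JSON: {"comments_raw": <string or null>}. '
    "Use JSON null when the band is blank."
)

# Clause name -> lowercase substring the prompt must contain (assumed needles).
FOOTER_CONTRACT: Dict[str, str] = {
    "names comments_raw": "comments_raw",
    "says verbatim": "verbatim",
    "says JSON null for blank": "json null",
    "scopes to footer band": "footer band",
    "fences off rows above": "above the comments: line",
    "forbids invented content": "do not invent",
}


def missing_contract_clauses(prompt: str) -> List[str]:
    """Return the names of contract clauses the prompt fails to state."""
    lowered = prompt.lower()
    return [name for name, needle in FOOTER_CONTRACT.items() if needle not in lowered]
```

An empty list from missing_contract_clauses(FOOTER_EXTRACTION_PROMPT) means every clause is present. Substring checks are deliberately crude: they guard against a clause being dropped in a later edit, not against the model ignoring it.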
Acceptance criteria
- _crop_footer_strip helper exists with a unit test.
- FOOTER_EXTRACTION_PROMPT exists with contract tests parallel to HEADER_EXTRACTION_PROMPT.
- modal-qwen-vl-quad makes 6 remote calls per page; the 6th populates PageResult.comments_raw.
- PageResult with comments_raw set when the band has content, None when blank.
Related
Follow-up to #31 (closed by #32). Sibling structural fix to #19 (comments-line boundary detection), which is what made adding the 6th call cheap: layout.body_bottom_y already separates the footer band from the body grid.