
Embed image for Virchow2 #6

Merged
Jurgee merged 14 commits into main from feature/embed-image
Mar 22, 2026
Conversation

@Jurgee
Collaborator

@Jurgee Jurgee commented Mar 14, 2026

Summary by CodeRabbit

  • New Features

    • Added image embedding to generate vector representations from images (synchronous and asynchronous).
    • Embeddings can be returned in selectable numeric precision (e.g., float16/float32).
  • Chores / API Changes

    • Image classification now requires explicit model and image inputs and may return a single score or a mapping of labels to scores.
  • Unchanged

    • Image segmentation behavior remains the same.

Copilot AI review requested due to automatic review settings March 14, 2026 16:29
@Jurgee Jurgee self-assigned this Mar 14, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the rationai library by introducing image embedding capabilities. It provides both synchronous and asynchronous methods to generate embedding vectors from images, allowing users to leverage specified models for advanced image analysis and feature extraction. This expansion improves the library's utility for machine learning applications involving visual data.

Highlights

  • New Feature: Image Embedding: Introduced embed_image methods in both Models and AsyncModels classes to compute embedding vectors for images using a specified model.
  • Type Hinting Enhancements: Added Any and Literal imports from the typing module to support more precise type annotations for the new embedding functionality.
Changelog
  • rationai/resources/models.py
    • Added embed_image method to the Models class for synchronous image embedding.
    • Added async embed_image method to the AsyncModels class for asynchronous image embedding.
    • Imported Any and Literal from the typing module to support new type annotations.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@coderabbitai
Contributor

coderabbitai Bot commented Mar 14, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds _parse_embedding_response and synchronous/asynchronous embed_image methods to Models/AsyncModels that convert images to uint8, LZ4-compress and POST them to a model endpoint, parse embeddings into a NumPy array cast to a requested dtype, and updates classify_image signatures and typings.
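
The request side of this flow can be sketched roughly as follows. This is an illustration under stated assumptions, not the code merged in the PR: `prepare_embed_payload` is a hypothetical name, and `zlib.compress` stands in for `lz4.frame.compress` so that the sketch runs with the standard library plus NumPy. The `x-output-dtype` header name is taken from the review comments further down this thread.

```python
import zlib

import numpy as np


def prepare_embed_payload(image, output_dtype=np.float32, compress=zlib.compress):
    """Hypothetical helper: build the request body and headers for an embed call.

    `compress` defaults to zlib here as a stand-in; the PR uses lz4.frame.compress.
    """
    # Coerce the pixels to uint8 and serialize them as raw bytes.
    image_array = np.asarray(image, dtype=np.uint8)
    body = compress(image_array.tobytes())
    # Tell the server which dtype the caller wants back, by canonical name.
    headers = {"x-output-dtype": np.dtype(output_dtype).name}
    return body, headers


# A tiny all-black RGB tile as example input.
tile = np.zeros((4, 4, 3), dtype=np.uint8)
body, headers = prepare_embed_payload(tile, np.float16)
```

Passing the compressor as a parameter is only a convenience for this sketch; it keeps the example runnable where `lz4` is not installed.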

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Model image APIs & typing (rationai/resources/models.py) | Added _parse_embedding_response; added embed_image() (sync) and embed_image() (async) that convert PIL/NDArray images to uint8, LZ4-compress, POST to model endpoint, validate and parse response, and return NDArray cast to output_dtype. Updated classify_image signatures to require model and image and adjusted return typing. Added/updated imports and typing annotations (Response, DTypeLike, NDArray, etc.). |

Sequence Diagram(s)

sequenceDiagram
  participant Caller as "Caller"
  participant Models as "Models / AsyncModels"
  participant Converter as "Image -> uint8 bytes"
  participant Compressor as "LZ4 Compressor"
  participant ModelAPI as "Model API (POST)"
  participant Parser as "_parse_embedding_response"

  Caller->>Models: embed_image(model, image, output_dtype, timeout)
  Models->>Converter: convert PIL/NDArray -> uint8 bytes
  Converter->>Compressor: compress bytes (LZ4)
  Compressor->>ModelAPI: POST compressed payload
  ModelAPI-->>Models: response (embedding bytes)
  Models->>Parser: parse response, cast to output_dtype
  Parser-->>Caller: NDArray (embedding)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I nibble pixels into tidy bites,

LZ4 wraps them snug for network flights,
Sync or await — embeddings hop anew,
Floats or int8, I cast them true,
🥕✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Embed image for Virchow2' directly aligns with the main change: adding embed_image methods to the Models and AsyncModels classes for image embedding functionality. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 85.71% which is sufficient. The required threshold is 80.00%. |


Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces embed_image methods to both the synchronous Models and asynchronous AsyncModels classes, enabling image embedding functionality. The implementation is consistent with existing methods in the file. My review includes suggestions to improve input validation for the output_dtype parameter to prevent unexpected behavior with invalid inputs.

Comment thread rationai/resources/models.py Outdated
Comment thread rationai/resources/models.py Outdated

Copilot AI left a comment


Pull request overview

Adds an image-embedding API method to the Models and AsyncModels resources so clients can request embedding vectors (e.g., for Virchow2) using the existing models service transport patterns.

Changes:

  • Added embed_image() to Models (sync) to POST an lz4-compressed image and parse embeddings into a NumPy array.
  • Added embed_image() to AsyncModels (async) with the same behavior.
  • Added typing imports to support the new method’s type signatures (Any, Literal).


Comment thread rationai/resources/models.py Outdated
Comment thread rationai/resources/models.py Outdated
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
rationai/resources/models.py (1)

85-86: Enforce the documented 1-D embedding contract.

Line 85 and Line 166 document a 1-D embedding vector, but the current return path accepts any JSON shape. Consider validating ndim == 1 before returning.

💡 Proposed fix
-        return np.array(response.json(), dtype=np_dtype)
+        embedding = np.asarray(response.json(), dtype=np_dtype)
+        if embedding.ndim != 1:
+            raise ValueError(f"Expected 1-D embedding, got shape {embedding.shape}")
+        return embedding
@@
-        return np.array(response.json(), dtype=np_dtype)
+        embedding = np.asarray(response.json(), dtype=np_dtype)
+        if embedding.ndim != 1:
+            raise ValueError(f"Expected 1-D embedding, got shape {embedding.shape}")
+        return embedding

Also applies to: 92-93, 166-167, 173-173
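
As a standalone illustration of the proposed check, the sketch below applies the same `ndim` validation to an already decoded JSON payload. The function name `parse_embedding` is hypothetical, and it assumes the response body is a JSON list of numbers, as in the revision this comment was written against.

```python
import numpy as np


def parse_embedding(payload, dtype=np.float32):
    """Convert a decoded JSON payload to a 1-D array, rejecting other shapes."""
    embedding = np.asarray(payload, dtype=dtype)
    if embedding.ndim != 1:
        raise ValueError(f"Expected 1-D embedding, got shape {embedding.shape}")
    return embedding


# A flat list of numbers is accepted.
vector = parse_embedding([0.1, 0.2, 0.3])

# A nested list (for example, a batch of embeddings) is rejected.
try:
    parse_embedding([[0.1, 0.2], [0.3, 0.4]])
except ValueError as exc:
    error_message = str(exc)
```

Putting the check in one parsing function means the sync and async paths cannot disagree about which shapes they accept.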

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rationai/resources/models.py` around lines 85 - 86, The code documents
embeddings as a 1-D numpy array (NDArray[np.floating[Any]]) but currently
returns any JSON shape; update the embedding-return path(s) that produce/parse
the embedding (the functions documented around the NDArray return types) to
validate that the numpy array has ndim == 1 before returning, and raise a clear
ValueError (e.g., "embedding must be 1-D, got ndim=X, shape=Y") if not; add this
check in every location that returns the embedding (the functions/methods
documented at the NDArray return lines) and add a small unit test asserting that
non-1D input raises the error.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@rationai/resources/models.py`:
- Around line 73-74: The code currently silently coerces unknown output_dtype
values to np.float32; change this to explicit validation: in any function or
class constructor that accepts the output_dtype parameter (named output_dtype)
validate it against the allowed set {"float16","float32"} and if it is not one
of these values raise a ValueError with a clear message; then map the validated
string to the numpy dtype via a small dict (e.g. {"float16": np.float16,
"float32": np.float32}) instead of using a default fallback, and update all code
paths that currently fallback to np.float32 to use this validated mapping
(search for uses of output_dtype and the implicit np.float32 fallback and
replace with the validator + mapping).
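
A minimal sketch of the validator-plus-mapping approach described in this comment is shown below. The names `_ALLOWED_DTYPES` and `resolve_output_dtype` are hypothetical, and the allowed set reflects only the two values named in the comment; later revisions of the PR appear to accept a more general dtype argument.

```python
import numpy as np

# Explicit whitelist: anything outside it is an error, never a silent default.
_ALLOWED_DTYPES = {"float16": np.float16, "float32": np.float32}


def resolve_output_dtype(output_dtype):
    """Map a validated dtype name to its NumPy type, or raise ValueError."""
    try:
        return _ALLOWED_DTYPES[output_dtype]
    except (KeyError, TypeError):
        allowed = ", ".join(sorted(_ALLOWED_DTYPES))
        raise ValueError(
            f"Unsupported output_dtype {output_dtype!r}; expected one of: {allowed}"
        ) from None


half = resolve_output_dtype("float16")

try:
    resolve_output_dtype("float64")
except ValueError as exc:
    rejected_message = str(exc)
```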


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ccf814b0-7849-4149-b178-7753d77e0e6d

📥 Commits

Reviewing files that changed from the base of the PR and between c701b16 and eedf55d.

📒 Files selected for processing (1)
  • rationai/resources/models.py

Comment thread rationai/resources/models.py Outdated
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@Jurgee Jurgee requested review from a team, JakubPekar and ejdam87 March 14, 2026 16:33
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Comment thread rationai/resources/models.py Outdated
@matejpekar matejpekar requested review from Adames4 and removed request for JakubPekar and ejdam87 March 15, 2026 16:27
@Jurgee Jurgee requested a review from matejpekar March 15, 2026 18:33
Adames4
Adames4 previously approved these changes Mar 17, 2026
Comment thread rationai/resources/models.py Outdated
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
rationai/resources/models.py (1)

98-107: Extract shared embed request/parse logic to reduce sync/async drift.

The serialization/header/parsing flow is duplicated in both methods. A small shared helper would reduce maintenance risk and keep behavior consistent.

Also applies to: 183-192

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rationai/resources/models.py` around lines 98 - 107, The image serialization,
header setup, request posting and response parsing logic duplicated between the
sync and async embedding flows should be extracted into a shared helper; create
a helper (e.g., _prepare_and_post_embedding or _send_embedding_request) that
accepts the same inputs used in both places (self, model, image, output_dtype,
timeout) and performs np.asarray(..., dtype=np.uint8), lz4.frame.compress(...
.tobytes()), sets headers={"x-output-dtype": np.dtype(output_dtype).name}, posts
via the appropriate internal requester, and calls _parse_embedding_response to
return the parsed result; update both the sync code path that calls self._post
and the async path (the corresponding async post caller around lines ~183-192)
to delegate to this shared helper to eliminate duplication and keep behavior
consistent.
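
One way to picture the suggested refactor is sketched below. Everything here is an assumption made for illustration: the class names, the injected `post` callables, the placeholder model name `"virchow2"`, and the use of `zlib.compress` in place of `lz4.frame.compress`. The point is only that the transport-agnostic steps live in plain functions, so the sync and async methods differ by a single `await`.

```python
import asyncio
import zlib

import numpy as np


def build_embed_request(image, output_dtype, compress=zlib.compress):
    """Shared step 1: serialize the image and build headers (no I/O)."""
    image_array = np.asarray(image, dtype=np.uint8)
    headers = {"x-output-dtype": np.dtype(output_dtype).name}
    return compress(image_array.tobytes()), headers


def parse_embed_response(payload, output_dtype):
    """Shared step 2: turn the decoded response into an array (no I/O)."""
    return np.asarray(payload, dtype=output_dtype)


class SyncModelsSketch:
    def __init__(self, post):
        self._post = post  # hypothetical callable: (model, body, headers) -> payload

    def embed_image(self, model, image, output_dtype=np.float32):
        body, headers = build_embed_request(image, output_dtype)
        return parse_embed_response(self._post(model, body, headers), output_dtype)


class AsyncModelsSketch:
    def __init__(self, post):
        self._post = post  # hypothetical coroutine function with the same arguments

    async def embed_image(self, model, image, output_dtype=np.float32):
        body, headers = build_embed_request(image, output_dtype)
        payload = await self._post(model, body, headers)
        return parse_embed_response(payload, output_dtype)


def fake_post(model, body, headers):
    # Stand-in for the HTTP call; returns a fixed embedding.
    return [1.0, 2.0, 3.0]


async def fake_async_post(model, body, headers):
    return fake_post(model, body, headers)


tile = np.zeros((2, 2, 3), dtype=np.uint8)
sync_result = SyncModelsSketch(fake_post).embed_image("virchow2", tile, np.float16)
async_result = asyncio.run(
    AsyncModelsSketch(fake_async_post).embed_image("virchow2", tile, np.float16)
)
```

Keeping the shared functions free of I/O avoids needing separate sync and async variants of the helper itself.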
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@rationai/resources/models.py`:
- Around line 16-17: The annotated return type NDArray[np.floating[Any]] is
incorrect for functions that accept a generic output_dtype; update the return
annotations on the functions that take the parameter output_dtype (the three
signatures shown that currently end with -> NDArray[np.floating[Any]] ) to a
generic NDArray[Any] so the type matches whatever dtype is requested at runtime,
and ensure the necessary typing imports (Any, NDArray) are present or adjusted
accordingly.
- Around line 98-99: Add explicit input validation before the np.asarray(image,
dtype=np.uint8) call: detect if `image` is a PIL Image (isinstance(image,
Image.Image)) and ensure image.mode is an expected mode (e.g., 'RGB' or 'L') or
else raise a clear ValueError advising the caller to convert (e.g.,
image.convert('RGB')). If `image` is a numpy array, check its dtype
(image.dtype) and raise a ValueError if it is not np.uint8 instead of silently
coercing; alternatively, perform an explicit, documented conversion step (with a
clear log or comment) before creating `image_array`. Apply this validation
around the `image_array = np.asarray(image, dtype=np.uint8)` and
`compressed_data = lz4.frame.compress(image_array.tobytes())` sites so callers
get a helpful error instead of silent data truncation.
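
The validation described in the second item could look roughly like the sketch below. The function name `to_uint8_array` and the allowed mode set are assumptions; Pillow is imported optionally so that the NumPy branch still runs where Pillow is not installed.

```python
import numpy as np

try:  # Pillow is optional for this sketch
    from PIL import Image
except ImportError:
    Image = None

_ALLOWED_MODES = ("RGB", "L")  # assumed set, following the examples in the comment


def to_uint8_array(image):
    """Return uint8 pixel data, raising instead of silently truncating values."""
    if Image is not None and isinstance(image, Image.Image):
        if image.mode not in _ALLOWED_MODES:
            raise ValueError(
                f"Unsupported image mode {image.mode!r}; "
                "convert first, e.g. image.convert('RGB')"
            )
        return np.asarray(image, dtype=np.uint8)
    array = np.asarray(image)
    if array.dtype != np.uint8:
        raise ValueError(
            f"Expected uint8 pixel data, got {array.dtype}; convert explicitly"
        )
    return array


accepted = to_uint8_array(np.zeros((2, 2, 3), dtype=np.uint8))

# Float input would lose information under a silent cast, so it is rejected.
try:
    to_uint8_array(np.array([[0.5, 300.0]]))
except ValueError as exc:
    rejected_message = str(exc)
```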


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: db7239bb-ebbb-4c6b-bee7-24c304343631

📥 Commits

Reviewing files that changed from the base of the PR and between e6bd98a and b4b58de.

📒 Files selected for processing (1)
  • rationai/resources/models.py

Comment thread rationai/resources/models.py Outdated
Comment thread rationai/resources/models.py Outdated
Comment thread rationai/resources/models.py Outdated
@Jurgee Jurgee requested a review from matejpekar March 18, 2026 11:26
Comment thread rationai/resources/models.py
@Jurgee Jurgee requested a review from matejpekar March 18, 2026 19:41
Comment thread rationai/resources/models.py Outdated
Comment thread rationai/resources/models.py Outdated
Comment thread rationai/resources/models.py Outdated
Comment thread rationai/resources/models.py Outdated
Comment thread rationai/resources/models.py Outdated
Comment thread rationai/resources/models.py Outdated
@Jurgee Jurgee requested a review from matejpekar March 19, 2026 14:56
Comment thread rationai/resources/models.py Outdated
@matejpekar
Member

@Jurgee always resolve the comments that are no longer relevant before requesting another review!

@Jurgee Jurgee closed this Mar 19, 2026
@Jurgee Jurgee reopened this Mar 19, 2026
@Jurgee
Collaborator Author

Jurgee commented Mar 19, 2026

I accidentally closed this PR. Sorry about that. The PR is open again now.

@Jurgee Jurgee requested a review from matejpekar March 19, 2026 20:22
Comment thread rationai/resources/models.py Outdated
@Jurgee Jurgee requested a review from matejpekar March 19, 2026 21:21
@Jurgee Jurgee merged commit 5e2d64c into main Mar 22, 2026
4 checks passed
@Jurgee Jurgee deleted the feature/embed-image branch March 22, 2026 21:28
@coderabbitai coderabbitai Bot mentioned this pull request Apr 11, 2026

4 participants