
Commit d0d3e5b

msluszniak and mkopcins authored
fix: Add fixes in Instance Segmentation (#1007)
## Description

This PR adds fixes spotted in instance segmentation.

### Introduces a breaking change?

- [ ] Yes
- [x] No

### Type of change

- [x] Bug fix (change which fixes an issue)
- [ ] New feature (change which adds functionality)
- [ ] Documentation update (improves or adds clarity to existing documentation)
- [ ] Other (chores, tests, code style improvements etc.)

### Tested on

- [ ] iOS
- [ ] Android

### Testing instructions

- [ ] Check that changes to the documentation are aligned with the TS implementation.

### Checklist

- [x] I have performed a self-review of my code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have updated the documentation accordingly
- [x] My changes generate no new warnings

---

Co-authored-by: Mateusz Kopcinski <120639731+mkopcins@users.noreply.github.com>
1 parent 8222b6e commit d0d3e5b

8 files changed

Lines changed: 45 additions & 36 deletions


docs/docs/03-hooks/02-computer-vision/useInstanceSegmentation.md

Lines changed: 2 additions & 0 deletions
```diff
@@ -53,6 +53,8 @@ For more information on loading resources, take a look at [loading models](../..
 - `error` - An error object if the model failed to load or encountered a runtime error.
 - `downloadProgress` - A value between 0 and 1 representing the download progress of the model binary.
 - `forward` - A function to run inference on an image.
+- `getAvailableInputSizes` - Returns the available input sizes for the loaded model, or `undefined` if the model accepts only a single input size. Use this to populate UI controls for selecting the input resolution.
+- `runOnFrame` - A synchronous worklet function for real-time VisionCamera frame processing. See [VisionCamera Integration](./visioncamera-integration.md) for usage.

 ## Running the model
```
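The new `getAvailableInputSizes` helper pairs naturally with the 384/512/640 options the YOLO models expose. Below is a minimal, self-contained sketch of how an app might choose from the returned list; `pickInputSize` and its preference logic are illustrative, not part of the library API:

```typescript
// Illustrative helper (not part of react-native-executorch): choose one of
// the sizes returned by getAvailableInputSizes(). Smaller inputs run faster,
// larger inputs are more accurate, matching the docs' guidance for YOLO.
type SizePreference = 'speed' | 'accuracy';

function pickInputSize(
  available: number[] | undefined,
  preference: SizePreference
): number | undefined {
  // Single-size models report undefined: there is nothing to choose.
  if (available === undefined || available.length === 0) {
    return undefined;
  }
  const sorted = [...available].sort((a, b) => a - b);
  return preference === 'speed' ? sorted[0] : sorted[sorted.length - 1];
}

// With the YOLO sizes listed in the docs (384/512/640):
console.log(pickInputSize([384, 512, 640], 'speed')); // 384
console.log(pickInputSize([384, 512, 640], 'accuracy')); // 640
```

The same pattern works for a UI picker: populate it only when the helper returns a defined list.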

docs/docs/03-hooks/02-computer-vision/useObjectDetection.md

Lines changed: 10 additions & 9 deletions
```diff
@@ -62,6 +62,7 @@ You need more details? Check the following resources:
 - `downloadProgress` - A value between 0 and 1 representing the download progress of the model binary.
 - `forward` - A function to run inference on an image.
 - `getAvailableInputSizes` - A function that returns available input sizes for multi-method models (YOLO). Returns `undefined` for single-method models.
+- `runOnFrame` - A synchronous worklet function for real-time VisionCamera frame processing. See [VisionCamera Integration](./visioncamera-integration.md) for usage.

 ## Running the model

@@ -117,15 +118,15 @@ See the full guide: [VisionCamera Integration](./visioncamera-integration.md).

 ## Supported models

-| Model | Number of classes | Class list | Multi-size Support |
-| ----------------------------------------------------------------------------------------------------------------------------- | ----------------- | ------------------------------------------------------------ | ------------------ |
-| [SSDLite320 MobileNetV3 Large](https://huggingface.co/software-mansion/react-native-executorch-ssdlite320-mobilenet-v3-large) | 91 | [COCO](../../06-api-reference/enumerations/CocoLabel.md) | No |
-| [RF-DETR Nano](https://huggingface.co/software-mansion/react-native-executorch-rf-detr-nano) | 80 | [COCO](../../06-api-reference/enumerations/CocoLabel.md) | No |
-| [YOLO26N](https://huggingface.co/software-mansion/react-native-executorch-yolo26) | 80 | [COCO YOLO](../../06-api-reference/enumerations/CocoLabel.md) | Yes (384/512/640) |
-| [YOLO26S](https://huggingface.co/software-mansion/react-native-executorch-yolo26) | 80 | [COCO YOLO](../../06-api-reference/enumerations/CocoLabel.md) | Yes (384/512/640) |
-| [YOLO26M](https://huggingface.co/software-mansion/react-native-executorch-yolo26) | 80 | [COCO YOLO](../../06-api-reference/enumerations/CocoLabel.md) | Yes (384/512/640) |
-| [YOLO26L](https://huggingface.co/software-mansion/react-native-executorch-yolo26) | 80 | [COCO YOLO](../../06-api-reference/enumerations/CocoLabel.md) | Yes (384/512/640) |
-| [YOLO26X](https://huggingface.co/software-mansion/react-native-executorch-yolo26) | 80 | [COCO YOLO](../../06-api-reference/enumerations/CocoLabel.md) | Yes (384/512/640) |
+| Model | Number of classes | Class list | Multi-size Support |
+| ----------------------------------------------------------------------------------------------------------------------------- | ----------------- | ------------------------------------------------------------- | ------------------ |
+| [SSDLite320 MobileNetV3 Large](https://huggingface.co/software-mansion/react-native-executorch-ssdlite320-mobilenet-v3-large) | 91 | [COCO](../../06-api-reference/enumerations/CocoLabel.md) | No |
+| [RF-DETR Nano](https://huggingface.co/software-mansion/react-native-executorch-rf-detr-nano) | 80 | [COCO](../../06-api-reference/enumerations/CocoLabel.md) | No |
+| [YOLO26N](https://huggingface.co/software-mansion/react-native-executorch-yolo26) | 80 | [COCO YOLO](../../06-api-reference/enumerations/CocoLabel.md) | Yes (384/512/640) |
+| [YOLO26S](https://huggingface.co/software-mansion/react-native-executorch-yolo26) | 80 | [COCO YOLO](../../06-api-reference/enumerations/CocoLabel.md) | Yes (384/512/640) |
+| [YOLO26M](https://huggingface.co/software-mansion/react-native-executorch-yolo26) | 80 | [COCO YOLO](../../06-api-reference/enumerations/CocoLabel.md) | Yes (384/512/640) |
+| [YOLO26L](https://huggingface.co/software-mansion/react-native-executorch-yolo26) | 80 | [COCO YOLO](../../06-api-reference/enumerations/CocoLabel.md) | Yes (384/512/640) |
+| [YOLO26X](https://huggingface.co/software-mansion/react-native-executorch-yolo26) | 80 | [COCO YOLO](../../06-api-reference/enumerations/CocoLabel.md) | Yes (384/512/640) |

 :::tip
 YOLO models support multiple input sizes (384px, 512px, 640px). Smaller sizes are faster but less accurate, while larger sizes are more accurate but slower. Choose based on your speed/accuracy requirements.
```

packages/react-native-executorch/src/hooks/computer_vision/useInstanceSegmentation.ts

Lines changed: 2 additions & 2 deletions
````diff
@@ -13,10 +13,10 @@ import { useModuleFactory } from '../useModuleFactory';
  * React hook for managing an Instance Segmentation model instance.
  * @typeParam C - A {@link InstanceSegmentationModelSources} config specifying which model to load.
  * @param props - Configuration object containing `model` config and optional `preventLoad` flag.
- * @returns An object with model state (`error`, `isReady`, `isGenerating`, `downloadProgress`) and a typed `forward` function.
+ * @returns An object with model state (`error`, `isReady`, `isGenerating`, `downloadProgress`), a typed `forward` function, `getAvailableInputSizes` helper, and a `runOnFrame` worklet for VisionCamera integration.
  * @example
  * ```ts
- * const { isReady, isGenerating, forward, error, downloadProgress } =
+ * const { isReady, isGenerating, forward, error, downloadProgress, getAvailableInputSizes, runOnFrame } =
  *   useInstanceSegmentation({
  *     model: {
  *       modelName: 'yolo26n-seg',
````

packages/react-native-executorch/src/modules/computer_vision/InstanceSegmentationModule.ts

Lines changed: 7 additions & 8 deletions
```diff
@@ -276,15 +276,14 @@ export class InstanceSegmentationModule<
    * Override runOnFrame to add label mapping for VisionCamera integration.
    * The parent's runOnFrame returns raw native results with class indices;
    * this override maps them to label strings and provides an options-based API.
-   * @returns A worklet function for VisionCamera frame processing, or null if the model is not loaded.
+   * @returns A worklet function for VisionCamera frame processing.
+   * @throws {RnExecutorchError} If the underlying native worklet is unavailable (should not occur on a loaded module).
    */
-  override get runOnFrame():
-    | ((
-        frame: Frame,
-        isFrontCamera: boolean,
-        options?: InstanceSegmentationOptions<ResolveLabels<T>>
-      ) => SegmentedInstance<ResolveLabels<T>>[])
-    | null {
+  override get runOnFrame(): (
+    frame: Frame,
+    isFrontCamera: boolean,
+    options?: InstanceSegmentationOptions<ResolveLabels<T>>
+  ) => SegmentedInstance<ResolveLabels<T>>[] {
     const baseRunOnFrame = super.runOnFrame;
     if (!baseRunOnFrame) {
       throw new RnExecutorchError(
```

packages/react-native-executorch/src/modules/computer_vision/ObjectDetectionModule.ts

Lines changed: 13 additions & 9 deletions
```diff
@@ -1,4 +1,9 @@
-import { LabelEnum, PixelData, ResourceSource } from '../../types/common';
+import {
+  Frame,
+  LabelEnum,
+  PixelData,
+  ResourceSource,
+} from '../../types/common';
 import {
   Detection,
   ObjectDetectionConfig,

@@ -144,15 +149,14 @@ export class ObjectDetectionModule<

   /**
    * Override runOnFrame to provide an options-based API for VisionCamera integration.
-   * @returns A worklet function for frame processing, or null if the model is not loaded.
+   * @returns A worklet function for frame processing.
+   * @throws {RnExecutorchError} If the underlying native worklet is unavailable (should not occur on a loaded module).
    */
-  override get runOnFrame():
-    | ((
-        frame: any,
-        isFrontCamera: boolean,
-        options?: ObjectDetectionOptions<ResolveLabels<T>>
-      ) => Detection<ResolveLabels<T>>[])
-    | null {
+  override get runOnFrame(): (
+    frame: Frame,
+    isFrontCamera: boolean,
+    options?: ObjectDetectionOptions<ResolveLabels<T>>
+  ) => Detection<ResolveLabels<T>>[] {
     const baseRunOnFrame = super.runOnFrame;
     if (!baseRunOnFrame) {
       throw new RnExecutorchError(
```

packages/react-native-executorch/src/modules/computer_vision/VisionModule.ts

Lines changed: 4 additions & 3 deletions
````diff
@@ -32,7 +32,7 @@ export abstract class VisionModule<TOutput> extends BaseModule {
   /**
    * Synchronous worklet function for real-time VisionCamera frame processing.
    *
-   * Only available after the model is loaded. Returns null if not loaded.
+   * Only available after the model is loaded.
    *
    * **Use this for VisionCamera frame processing in worklets.**
    * For async processing, use `forward()` instead.

@@ -55,9 +55,10 @@ export abstract class VisionModule<TOutput> extends BaseModule {
    * }
    * });
    * ```
-   * @returns A worklet function for frame processing, or null if the model is not loaded.
+   * @returns A worklet function for frame processing.
+   * @throws {RnExecutorchError} If the model is not loaded.
    */
-  get runOnFrame(): ((frame: Frame, ...args: any[]) => TOutput) | null {
+  get runOnFrame(): (frame: Frame, ...args: any[]) => TOutput {
     if (!this.nativeModule) {
       throw new RnExecutorchError(
         RnExecutorchErrorCode.ModuleNotLoaded,
````
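The signature change above removes the `| null` from the getter's return type: callers now receive a plain function or an exception, never a nullable value to check. A minimal, dependency-free sketch of that contract follows; the class and error message are stand-ins, not the library source:

```typescript
// Minimal sketch (not the library source) of the new runOnFrame contract:
// the getter throws instead of returning null when the module is not loaded.
class FakeVisionModule {
  private nativeModule: ((frame: string) => string) | null = null;

  load(): void {
    this.nativeModule = (frame) => `processed:${frame}`;
  }

  // Return type no longer includes `| null`.
  get runOnFrame(): (frame: string) => string {
    const native = this.nativeModule;
    if (!native) {
      // The real module throws RnExecutorchError with ModuleNotLoaded.
      throw new Error('ModuleNotLoaded');
    }
    return native;
  }
}

const m = new FakeVisionModule();
try {
  void m.runOnFrame; // throws before load()
} catch (e) {
  console.log((e as Error).message); // ModuleNotLoaded
}
m.load();
console.log(m.runOnFrame('f0')); // processed:f0
```

This trades a null check at every call site for a single try/catch (or an `isReady` guard) before the frame processor is installed.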

packages/react-native-executorch/src/types/instanceSegmentation.ts

Lines changed: 3 additions & 1 deletion
```diff
@@ -196,7 +196,9 @@ export interface InstanceSegmentationType<L extends LabelEnum> {
    * **Use this for VisionCamera frame processing in worklets.**
    * For async processing, use `forward()` instead.
    *
-   * Available after model is loaded (`isReady: true`).
+   * `null` until the model is ready (`isReady: true`). The property itself is
+   * `null` when the model has not loaded yet — the function always returns an
+   * array (never `null`) once called.
    * @param frame - VisionCamera Frame object
    * @param isFrontCamera - Whether the front camera is active (for mirroring correction).
    * @param options - Optional configuration for the segmentation process.
```
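The `isFrontCamera` parameter exists because front-camera frames are mirrored horizontally. A self-contained sketch of the kind of x-coordinate correction this implies; these helpers are illustrative and are not the library's internal implementation:

```typescript
// Illustrative only (not react-native-executorch code): front-camera frames
// arrive mirrored, so x-coordinates must be flipped across the frame width
// before results are drawn over the camera preview.
function mirrorX(x: number, frameWidth: number): number {
  return frameWidth - x;
}

// Mirroring swaps which edge is left and which is right, so the pair is
// reordered to keep x1 <= x2 after the flip.
function mirrorBboxX(
  x1: number,
  x2: number,
  frameWidth: number
): [number, number] {
  return [frameWidth - x2, frameWidth - x1];
}

console.log(mirrorBboxX(10, 110, 640)); // [ 530, 630 ]
```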

packages/react-native-executorch/src/types/objectDetection.ts

Lines changed: 4 additions & 4 deletions
```diff
@@ -6,10 +6,10 @@ export { CocoLabel };
 /**
  * Represents a bounding box for a detected object in an image.
  * @category Types
- * @property {number} x1 - The x-coordinate of the bottom-left corner of the bounding box.
- * @property {number} y1 - The y-coordinate of the bottom-left corner of the bounding box.
- * @property {number} x2 - The x-coordinate of the top-right corner of the bounding box.
- * @property {number} y2 - The y-coordinate of the top-right corner of the bounding box.
+ * @property {number} x1 - The x-coordinate of the top-left corner of the bounding box.
+ * @property {number} y1 - The y-coordinate of the top-left corner of the bounding box.
+ * @property {number} x2 - The x-coordinate of the bottom-right corner of the bounding box.
+ * @property {number} y2 - The y-coordinate of the bottom-right corner of the bounding box.
  */
 export interface Bbox {
   x1: number;
```
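The corrected `Bbox` documentation follows the standard image-coordinate convention, where y grows downward, so `(x1, y1)` is the top-left corner and `(x2, y2)` the bottom-right. A self-contained sketch showing that width and height then fall out as simple positive differences (`bboxSize` is an illustrative helper, not a library export):

```typescript
// Bbox as documented after the fix: (x1, y1) is the top-left corner and
// (x2, y2) the bottom-right, in image coordinates (y grows downward).
interface Bbox {
  x1: number;
  y1: number;
  x2: number;
  y2: number;
}

// With that convention, both differences are non-negative.
function bboxSize(b: Bbox): { width: number; height: number } {
  return { width: b.x2 - b.x1, height: b.y2 - b.y1 };
}

const box: Bbox = { x1: 10, y1: 20, x2: 110, y2: 80 };
console.log(bboxSize(box)); // { width: 100, height: 60 }
```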
