
Commit d0d3e5b

msluszniak and mkopcins authored
fix: Add fixes in Instance Segmentation (#1007)
## Description

This PR adds fixes spotted in instance segmentation.

### Introduces a breaking change?

- [ ] Yes
- [x] No

### Type of change

- [x] Bug fix (change which fixes an issue)
- [ ] New feature (change which adds functionality)
- [ ] Documentation update (improves or adds clarity to existing documentation)
- [ ] Other (chores, tests, code style improvements etc.)

### Tested on

- [ ] iOS
- [ ] Android

### Testing instructions

- [ ] Check that changes to the documentation are aligned with the TS implementation.

### Checklist

- [x] I have performed a self-review of my code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have updated the documentation accordingly
- [x] My changes generate no new warnings

---

Co-authored-by: Mateusz Kopcinski <120639731+mkopcins@users.noreply.github.com>
1 parent 8222b6e commit d0d3e5b

8 files changed

Lines changed: 45 additions & 36 deletions


docs/docs/03-hooks/02-computer-vision/useInstanceSegmentation.md

Lines changed: 2 additions & 0 deletions
```diff
@@ -53,6 +53,8 @@ For more information on loading resources, take a look at [loading models](../..
 - `error` - An error object if the model failed to load or encountered a runtime error.
 - `downloadProgress` - A value between 0 and 1 representing the download progress of the model binary.
 - `forward` - A function to run inference on an image.
+- `getAvailableInputSizes` - Returns the available input sizes for the loaded model, or `undefined` if the model accepts only a single input size. Use this to populate UI controls for selecting the input resolution.
+- `runOnFrame` - A synchronous worklet function for real-time VisionCamera frame processing. See [VisionCamera Integration](./visioncamera-integration.md) for usage.

 ## Running the model
```
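The new `getAvailableInputSizes` helper pairs naturally with the 384/512/640 options the YOLO models expose. Below is a minimal, self-contained sketch of how an app might choose from the returned list; `pickInputSize` and its preference logic are illustrative, not part of the library API:

```typescript
// Illustrative helper (not part of react-native-executorch): choose one of
// the sizes returned by getAvailableInputSizes(). Smaller inputs run faster,
// larger inputs are more accurate, matching the docs' guidance for YOLO.
type SizePreference = 'speed' | 'accuracy';

function pickInputSize(
  available: number[] | undefined,
  preference: SizePreference
): number | undefined {
  // Single-size models report undefined: there is nothing to choose.
  if (available === undefined || available.length === 0) {
    return undefined;
  }
  const sorted = [...available].sort((a, b) => a - b);
  return preference === 'speed' ? sorted[0] : sorted[sorted.length - 1];
}

// With the YOLO sizes listed in the docs (384/512/640):
console.log(pickInputSize([384, 512, 640], 'speed')); // 384
console.log(pickInputSize([384, 512, 640], 'accuracy')); // 640
```

The same pattern works for a UI picker: populate it only when the helper returns a defined list.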

docs/docs/03-hooks/02-computer-vision/useObjectDetection.md

Lines changed: 10 additions & 9 deletions
```diff
@@ -62,6 +62,7 @@ You need more details? Check the following resources:
 - `downloadProgress` - A value between 0 and 1 representing the download progress of the model binary.
 - `forward` - A function to run inference on an image.
 - `getAvailableInputSizes` - A function that returns available input sizes for multi-method models (YOLO). Returns `undefined` for single-method models.
+- `runOnFrame` - A synchronous worklet function for real-time VisionCamera frame processing. See [VisionCamera Integration](./visioncamera-integration.md) for usage.

 ## Running the model

@@ -117,15 +118,15 @@ See the full guide: [VisionCamera Integration](./visioncamera-integration.md).

 ## Supported models

-| Model | Number of classes | Class list | Multi-size Support |
-| ----------------------------------------------------------------------------------------------------------------------------- | ----------------- | ------------------------------------------------------------ | ------------------ |
-| [SSDLite320 MobileNetV3 Large](https://huggingface.co/software-mansion/react-native-executorch-ssdlite320-mobilenet-v3-large) | 91 | [COCO](../../06-api-reference/enumerations/CocoLabel.md) | No |
-| [RF-DETR Nano](https://huggingface.co/software-mansion/react-native-executorch-rf-detr-nano) | 80 | [COCO](../../06-api-reference/enumerations/CocoLabel.md) | No |
-| [YOLO26N](https://huggingface.co/software-mansion/react-native-executorch-yolo26) | 80 | [COCO YOLO](../../06-api-reference/enumerations/CocoLabel.md) | Yes (384/512/640) |
-| [YOLO26S](https://huggingface.co/software-mansion/react-native-executorch-yolo26) | 80 | [COCO YOLO](../../06-api-reference/enumerations/CocoLabel.md) | Yes (384/512/640) |
-| [YOLO26M](https://huggingface.co/software-mansion/react-native-executorch-yolo26) | 80 | [COCO YOLO](../../06-api-reference/enumerations/CocoLabel.md) | Yes (384/512/640) |
-| [YOLO26L](https://huggingface.co/software-mansion/react-native-executorch-yolo26) | 80 | [COCO YOLO](../../06-api-reference/enumerations/CocoLabel.md) | Yes (384/512/640) |
-| [YOLO26X](https://huggingface.co/software-mansion/react-native-executorch-yolo26) | 80 | [COCO YOLO](../../06-api-reference/enumerations/CocoLabel.md) | Yes (384/512/640) |
+| Model | Number of classes | Class list | Multi-size Support |
+| ----------------------------------------------------------------------------------------------------------------------------- | ----------------- | ------------------------------------------------------------- | ------------------ |
+| [SSDLite320 MobileNetV3 Large](https://huggingface.co/software-mansion/react-native-executorch-ssdlite320-mobilenet-v3-large) | 91 | [COCO](../../06-api-reference/enumerations/CocoLabel.md) | No |
+| [RF-DETR Nano](https://huggingface.co/software-mansion/react-native-executorch-rf-detr-nano) | 80 | [COCO](../../06-api-reference/enumerations/CocoLabel.md) | No |
+| [YOLO26N](https://huggingface.co/software-mansion/react-native-executorch-yolo26) | 80 | [COCO YOLO](../../06-api-reference/enumerations/CocoLabel.md) | Yes (384/512/640) |
+| [YOLO26S](https://huggingface.co/software-mansion/react-native-executorch-yolo26) | 80 | [COCO YOLO](../../06-api-reference/enumerations/CocoLabel.md) | Yes (384/512/640) |
+| [YOLO26M](https://huggingface.co/software-mansion/react-native-executorch-yolo26) | 80 | [COCO YOLO](../../06-api-reference/enumerations/CocoLabel.md) | Yes (384/512/640) |
+| [YOLO26L](https://huggingface.co/software-mansion/react-native-executorch-yolo26) | 80 | [COCO YOLO](../../06-api-reference/enumerations/CocoLabel.md) | Yes (384/512/640) |
+| [YOLO26X](https://huggingface.co/software-mansion/react-native-executorch-yolo26) | 80 | [COCO YOLO](../../06-api-reference/enumerations/CocoLabel.md) | Yes (384/512/640) |

 :::tip
 YOLO models support multiple input sizes (384px, 512px, 640px). Smaller sizes are faster but less accurate, while larger sizes are more accurate but slower. Choose based on your speed/accuracy requirements.
```

packages/react-native-executorch/src/hooks/computer_vision/useInstanceSegmentation.ts

Lines changed: 2 additions & 2 deletions
````diff
@@ -13,10 +13,10 @@ import { useModuleFactory } from '../useModuleFactory';
  * React hook for managing an Instance Segmentation model instance.
  * @typeParam C - A {@link InstanceSegmentationModelSources} config specifying which model to load.
  * @param props - Configuration object containing `model` config and optional `preventLoad` flag.
- * @returns An object with model state (`error`, `isReady`, `isGenerating`, `downloadProgress`) and a typed `forward` function.
+ * @returns An object with model state (`error`, `isReady`, `isGenerating`, `downloadProgress`), a typed `forward` function, `getAvailableInputSizes` helper, and a `runOnFrame` worklet for VisionCamera integration.
  * @example
  * ```ts
- * const { isReady, isGenerating, forward, error, downloadProgress } =
+ * const { isReady, isGenerating, forward, error, downloadProgress, getAvailableInputSizes, runOnFrame } =
  *   useInstanceSegmentation({
  *     model: {
  *       modelName: 'yolo26n-seg',
````

packages/react-native-executorch/src/modules/computer_vision/InstanceSegmentationModule.ts

Lines changed: 7 additions & 8 deletions
```diff
@@ -276,15 +276,14 @@ export class InstanceSegmentationModule<
    * Override runOnFrame to add label mapping for VisionCamera integration.
    * The parent's runOnFrame returns raw native results with class indices;
    * this override maps them to label strings and provides an options-based API.
-   * @returns A worklet function for VisionCamera frame processing, or null if the model is not loaded.
+   * @returns A worklet function for VisionCamera frame processing.
+   * @throws {RnExecutorchError} If the underlying native worklet is unavailable (should not occur on a loaded module).
    */
-  override get runOnFrame():
-    | ((
-        frame: Frame,
-        isFrontCamera: boolean,
-        options?: InstanceSegmentationOptions<ResolveLabels<T>>
-      ) => SegmentedInstance<ResolveLabels<T>>[])
-    | null {
+  override get runOnFrame(): (
+    frame: Frame,
+    isFrontCamera: boolean,
+    options?: InstanceSegmentationOptions<ResolveLabels<T>>
+  ) => SegmentedInstance<ResolveLabels<T>>[] {
     const baseRunOnFrame = super.runOnFrame;
     if (!baseRunOnFrame) {
       throw new RnExecutorchError(
```

packages/react-native-executorch/src/modules/computer_vision/ObjectDetectionModule.ts

Lines changed: 13 additions & 9 deletions
```diff
@@ -1,4 +1,9 @@
-import { LabelEnum, PixelData, ResourceSource } from '../../types/common';
+import {
+  Frame,
+  LabelEnum,
+  PixelData,
+  ResourceSource,
+} from '../../types/common';
 import {
   Detection,
   ObjectDetectionConfig,

@@ -144,15 +149,14 @@ export class ObjectDetectionModule<

   /**
    * Override runOnFrame to provide an options-based API for VisionCamera integration.
-   * @returns A worklet function for frame processing, or null if the model is not loaded.
+   * @returns A worklet function for frame processing.
+   * @throws {RnExecutorchError} If the underlying native worklet is unavailable (should not occur on a loaded module).
    */
-  override get runOnFrame():
-    | ((
-        frame: any,
-        isFrontCamera: boolean,
-        options?: ObjectDetectionOptions<ResolveLabels<T>>
-      ) => Detection<ResolveLabels<T>>[])
-    | null {
+  override get runOnFrame(): (
+    frame: Frame,
+    isFrontCamera: boolean,
+    options?: ObjectDetectionOptions<ResolveLabels<T>>
+  ) => Detection<ResolveLabels<T>>[] {
     const baseRunOnFrame = super.runOnFrame;
     if (!baseRunOnFrame) {
       throw new RnExecutorchError(
```

packages/react-native-executorch/src/modules/computer_vision/VisionModule.ts

Lines changed: 4 additions & 3 deletions
````diff
@@ -32,7 +32,7 @@ export abstract class VisionModule<TOutput> extends BaseModule {
   /**
    * Synchronous worklet function for real-time VisionCamera frame processing.
    *
-   * Only available after the model is loaded. Returns null if not loaded.
+   * Only available after the model is loaded.
    *
    * **Use this for VisionCamera frame processing in worklets.**
    * For async processing, use `forward()` instead.

@@ -55,9 +55,10 @@ export abstract class VisionModule<TOutput> extends BaseModule {
    * }
    * });
    * ```
-   * @returns A worklet function for frame processing, or null if the model is not loaded.
+   * @returns A worklet function for frame processing.
+   * @throws {RnExecutorchError} If the model is not loaded.
    */
-  get runOnFrame(): ((frame: Frame, ...args: any[]) => TOutput) | null {
+  get runOnFrame(): (frame: Frame, ...args: any[]) => TOutput {
     if (!this.nativeModule) {
       throw new RnExecutorchError(
         RnExecutorchErrorCode.ModuleNotLoaded,
````
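The signature change above removes the `| null` from the getter's return type: callers now receive a plain function or an exception, never a nullable value to check. A minimal, dependency-free sketch of that contract follows; the class and error message are stand-ins, not the library source:

```typescript
// Minimal sketch (not the library source) of the new runOnFrame contract:
// the getter throws instead of returning null when the module is not loaded.
class FakeVisionModule {
  private nativeModule: ((frame: string) => string) | null = null;

  load(): void {
    this.nativeModule = (frame) => `processed:${frame}`;
  }

  // Return type no longer includes `| null`.
  get runOnFrame(): (frame: string) => string {
    const native = this.nativeModule;
    if (!native) {
      // The real module throws RnExecutorchError with ModuleNotLoaded.
      throw new Error('ModuleNotLoaded');
    }
    return native;
  }
}

const m = new FakeVisionModule();
try {
  void m.runOnFrame; // throws before load()
} catch (e) {
  console.log((e as Error).message); // ModuleNotLoaded
}
m.load();
console.log(m.runOnFrame('f0')); // processed:f0
```

This trades a null check at every call site for a single try/catch (or an `isReady` guard) before the frame processor is installed.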

packages/react-native-executorch/src/types/instanceSegmentation.ts

Lines changed: 3 additions & 1 deletion
```diff
@@ -196,7 +196,9 @@ export interface InstanceSegmentationType<L extends LabelEnum> {
    * **Use this for VisionCamera frame processing in worklets.**
    * For async processing, use `forward()` instead.
    *
-   * Available after model is loaded (`isReady: true`).
+   * `null` until the model is ready (`isReady: true`). The property itself is
+   * `null` when the model has not loaded yet — the function always returns an
+   * array (never `null`) once called.
    * @param frame - VisionCamera Frame object
    * @param isFrontCamera - Whether the front camera is active (for mirroring correction).
    * @param options - Optional configuration for the segmentation process.
```
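The `isFrontCamera` parameter exists because front-camera frames are mirrored horizontally. A self-contained sketch of the kind of x-coordinate correction this implies; these helpers are illustrative and are not the library's internal implementation:

```typescript
// Illustrative only (not react-native-executorch code): front-camera frames
// arrive mirrored, so x-coordinates must be flipped across the frame width
// before results are drawn over the camera preview.
function mirrorX(x: number, frameWidth: number): number {
  return frameWidth - x;
}

// Mirroring swaps which edge is left and which is right, so the pair is
// reordered to keep x1 <= x2 after the flip.
function mirrorBboxX(
  x1: number,
  x2: number,
  frameWidth: number
): [number, number] {
  return [frameWidth - x2, frameWidth - x1];
}

console.log(mirrorBboxX(10, 110, 640)); // [ 530, 630 ]
```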

packages/react-native-executorch/src/types/objectDetection.ts

Lines changed: 4 additions & 4 deletions
```diff
@@ -6,10 +6,10 @@ export { CocoLabel };
 /**
  * Represents a bounding box for a detected object in an image.
  * @category Types
- * @property {number} x1 - The x-coordinate of the bottom-left corner of the bounding box.
- * @property {number} y1 - The y-coordinate of the bottom-left corner of the bounding box.
- * @property {number} x2 - The x-coordinate of the top-right corner of the bounding box.
- * @property {number} y2 - The y-coordinate of the top-right corner of the bounding box.
+ * @property {number} x1 - The x-coordinate of the top-left corner of the bounding box.
+ * @property {number} y1 - The y-coordinate of the top-left corner of the bounding box.
+ * @property {number} x2 - The x-coordinate of the bottom-right corner of the bounding box.
+ * @property {number} y2 - The y-coordinate of the bottom-right corner of the bounding box.
  */
 export interface Bbox {
   x1: number;
```
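The corrected `Bbox` documentation follows the standard image-coordinate convention, where y grows downward, so `(x1, y1)` is the top-left corner and `(x2, y2)` the bottom-right. A self-contained sketch showing that width and height then fall out as simple positive differences (`bboxSize` is an illustrative helper, not a library export):

```typescript
// Bbox as documented after the fix: (x1, y1) is the top-left corner and
// (x2, y2) the bottom-right, in image coordinates (y grows downward).
interface Bbox {
  x1: number;
  y1: number;
  x2: number;
  y2: number;
}

// With that convention, both differences are non-negative.
function bboxSize(b: Bbox): { width: number; height: number } {
  return { width: b.x2 - b.x1, height: b.y2 - b.y1 };
}

const box: Bbox = { x1: 10, y1: 20, x2: 110, y2: 80 };
console.log(bboxSize(box)); // { width: 100, height: 60 }
```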
