rtmlib-ts

Real-time Multi-Person Pose Estimation & Object Detection Library

TypeScript port of rtmlib with YOLO12 and MediaPipe support for browser-based AI inference.

🚀 Features

🎯 Object Detection - 80 COCO classes with YOLO12n or MediaPipe EfficientDet
🧘 Pose Estimation (2D) - 17 keypoints (COCO) with RTMW or 33 keypoints with MediaPipe BlazePose
🎯 Pose Estimation (3D) - Full 3D pose with Z-coordinates in meters using RTMW3D-X
🐾 Animal Detection - 30 animal species with ViTPose++ pose estimation
🎮 MediaPipe Integration - TFLite backend for faster inference
⚡ Fastest Combo - MediaPipe + RTMW3D for 2-3x faster 3D pose estimation than YOLO12 + RTMW3D
📹 Video Support - Real-time camera & video file processing
🌐 Browser-based - Pure WebAssembly/WebGL/WebGPU, no backend required
⚡ Fast - About 200 ms per inference with YOLO12n at 416×416 input
🎨 Beautiful UI - Modern gradient design in playground

📦 Installation

npm install rtmlib-ts

🎮 Quick Start

1. Try the Playground

cd rtmlib-playground-main
npm install
npm run dev

# Open http://localhost:3000

2. Object Detection (YOLO)

import { ObjectDetector, drawResultsOnCanvas } from 'rtmlib-ts';

const detector = new ObjectDetector({
  model: 'https://huggingface.co/demon2233/rtmlib-ts/resolve/main/yolo/yolov12n.onnx',
  classes: ['person', 'car', 'dog'],
  confidence: 0.5,
  inputSize: [416, 416],
  backend: 'webgl',
});
await detector.init();

// `canvas` is the HTMLCanvasElement holding the frame to analyze; `ctx` is its 2D context
const results = await detector.detectFromCanvas(canvas);
drawResultsOnCanvas(ctx, results, 'object');

3. Object Detection (MediaPipe - FASTER!)

import { ObjectDetector } from 'rtmlib-ts';

const detector = new ObjectDetector({
  detectorType: 'mediapipe',
  mediaPipeModelPath: 'https://storage.googleapis.com/mediapipe-models/object_detector/efficientdet_lite0/int8/latest/efficientdet_lite0.tflite',
  mediaPipeScoreThreshold: 0.5,
  classes: ['person', 'car'],
});
await detector.init();

const results = await detector.detectFromCanvas(canvas);

4. Pose Estimation (2D)

import { PoseDetector, drawResultsOnCanvas } from 'rtmlib-ts';

const detector = new PoseDetector({
  detModel: 'https://huggingface.co/demon2233/rtmlib-ts/resolve/main/yolo/yolov12n.onnx',
  poseModel: 'https://huggingface.co/demon2233/rtmlib-ts/resolve/main/rtmpose/end2end.onnx',
  detInputSize: [416, 416],
  poseInputSize: [384, 288],
  detConfidence: 0.5,
  poseConfidence: 0.3,
  backend: 'webgl',
});
await detector.init();

const poses = await detector.detectFromCanvas(canvas);
drawResultsOnCanvas(ctx, poses, 'pose');

5. Pose Estimation (3D) - FASTEST with MediaPipe!

import { MediaPipeObject3DPoseDetector } from 'rtmlib-ts';

// MediaPipe + RTMW3D = 2-3x faster than YOLO+3D!
const detector = new MediaPipeObject3DPoseDetector({
  mpScoreThreshold: 0.5,
  poseConfidence: 0.3,
  backend: 'webgpu',
  personsOnly: true,
});
await detector.init();

const result = await detector.detectFromCanvas(canvas);
console.log(result.keypoints[0][0]); // [x, y, z] in meters
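
Because each 3D keypoint is reported as [x, y, z] in meters, the distance between two joints can be computed with plain Euclidean distance. The following is a minimal sketch: `distance3D` is a hypothetical helper, not part of rtmlib-ts, and the coordinates are made up for illustration. Which keypoint index maps to which joint depends on the model's keypoint layout, which is not covered here.

```typescript
// Hypothetical helper, not part of rtmlib-ts.
type Keypoint3D = [number, number, number];

// Euclidean distance between two 3D keypoints, in the same unit as the input (meters).
function distance3D(a: Keypoint3D, b: Keypoint3D): number {
  const dx = a[0] - b[0];
  const dy = a[1] - b[1];
  const dz = a[2] - b[2];
  return Math.hypot(dx, dy, dz);
}

// Made-up coordinates standing in for two entries of result.keypoints[0].
const jointA: Keypoint3D = [0.0, 1.4, 2.0];
const jointB: Keypoint3D = [0.3, 1.0, 2.0];
console.log(distance3D(jointA, jointB).toFixed(2), 'm');
```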

6. Animal Detection

import { AnimalDetector } from 'rtmlib-ts';

const detector = new AnimalDetector({
  poseModelType: 'vitpose-b',
  classes: ['dog', 'cat', 'horse'],
  detConfidence: 0.5,
  poseConfidence: 0.3,
  backend: 'webgl',
});
await detector.init();

const animals = await detector.detectFromCanvas(canvas);

📊 Performance

| Model | Input | Time | Use Case |
| --- | --- | --- | --- |
| YOLO12n | 416×416 | ~200ms | Real-time video |
| YOLO12n | 640×640 | ~500ms | High accuracy |
| MediaPipe EfficientDet | 320×320 | ~100ms | Fast detection |
| RTMW Pose | 384×288 | ~100ms | Per person |
| MediaPipe + RTMW3D | 320×320 + 384×288 | ~150ms | Fastest 3D pose! |
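
Since the RTMW pose time is listed per person, a rough per-frame budget for a two-stage pipeline is the detection time plus the pose time multiplied by the number of people. The sketch below assumes the stages run sequentially with no batching or overlap, which may not match how the library schedules work. The helper names are hypothetical, and real timings vary by device.

```typescript
// Rough estimate only: assumes one detector pass per frame and one pose pass
// per detected person, run one after another.
function estimateFrameMs(detMs: number, poseMs: number, persons: number): number {
  return detMs + poseMs * persons;
}

// Convert a per-frame time in milliseconds to frames per second.
function estimateFps(frameMs: number): number {
  return 1000 / frameMs;
}

// YOLO12n at 416×416 (~200 ms) plus RTMW pose (~100 ms each) for two people.
const frameMs = estimateFrameMs(200, 100, 2);
console.log(`${frameMs} ms per frame, about ${estimateFps(frameMs)} FPS`);
```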

Optimization Tips:

Use 416×416 for video/real-time
Use 640×640 for static images
MediaPipe + RTMW3D for fastest 3D pose estimation
First run is slower (WASM compilation)
Filter classes to reduce processing
Use backend: 'webgpu' for GPU acceleration
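
The tips above can be folded into a small options builder. This is a sketch under assumptions: `buildDetectorOptions` and `DetectorOptionsSketch` are hypothetical names, and only the `inputSize`, `classes`, and `backend` fields shown in the Quick Start examples are used. The result is meant to be spread into a detector config alongside `model`.

```typescript
// Hypothetical helper, not part of rtmlib-ts.
type UseCase = 'video' | 'image';
type Backend = 'webgl' | 'webgpu';

interface DetectorOptionsSketch {
  inputSize: [number, number];
  classes: string[];
  backend: Backend;
}

function buildDetectorOptions(
  useCase: UseCase,
  classes: string[],
  backend: Backend = 'webgpu',
): DetectorOptionsSketch {
  // 416×416 for video/real-time, 640×640 for static images.
  const side = useCase === 'video' ? 416 : 640;
  // Passing only the classes you need reduces processing.
  return { inputSize: [side, side], classes, backend };
}

const videoOptions = buildDetectorOptions('video', ['person']);
console.log(videoOptions);
// Possible use: new ObjectDetector({ model: '<model URL>', ...videoOptions })
```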

🎯 Supported Classes (COCO 80)

Common: person, car, dog, cat, bicycle, bus, truck
Objects: bottle, chair, couch, potted plant
Animals: bird, horse, sheep, cow, elephant
Full list: See COCO_CLASSES export or use class selector in playground
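
When building a class filter from user input, it can help to check the requested names against the known class list before creating a detector. The sketch below takes the known list as a parameter. It assumes the `COCO_CLASSES` export is an array of class-name strings, which is not confirmed here, so the example uses a short hand-written list instead. `splitKnownClasses` is a hypothetical helper.

```typescript
// Hypothetical helper, not part of rtmlib-ts.
function splitKnownClasses(
  requested: string[],
  known: readonly string[],
): { valid: string[]; unrecognized: string[] } {
  const knownSet = new Set(known);
  const valid: string[] = [];
  const unrecognized: string[] = [];
  for (const name of requested) {
    (knownSet.has(name) ? valid : unrecognized).push(name);
  }
  return { valid, unrecognized };
}

// Short sample list for illustration; the full list has 80 entries.
const sampleClasses = ['person', 'car', 'dog', 'cat'];
const checked = splitKnownClasses(['person', 'automobile'], sampleClasses);
console.log(checked.valid, checked.unrecognized);
```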

🐾 Animal Detection (30 Species)

Supported: dog, cat, horse, zebra, elephant, tiger, lion, panda, cow, sheep, bird, and more!

🎨 Drawing Utilities

import {
  drawDetectionsOnCanvas,
  drawPoseOnCanvas,
  drawResultsOnCanvas
} from 'rtmlib-ts';

drawResultsOnCanvas(ctx, results, 'object');  // or 'pose', 'pose3d'

📁 Project Structure

rtmlib-ts/
├── src/
│   ├── core/                    # Base utilities
│   │   ├── base.ts              # BaseTool class
│   │   ├── modelCache.ts        # Model caching
│   │   └── preprocessing.ts     # Image preprocessing
│   ├── models/                  # Model implementations
│   │   ├── yolo12.ts            # YOLO12 detector
│   │   ├── rtmpose.ts           # RTMPose model
│   │   └── rtmpose3d.ts         # 3D Pose model
│   ├── solution/                # High-level APIs
│   │   ├── objectDetector.ts    # ObjectDetector (80 COCO)
│   │   ├── poseDetector.ts      # PoseDetector (YOLO + RTMW)
│   │   ├── pose3dDetector.ts    # Pose3DDetector
│   │   ├── animalDetector.ts    # AnimalDetector (ViTPose)
│   │   ├── mediaPipeObjectDetector.ts  # MediaPipe Object Detection
│   │   ├── mediaPipePoseDetector.ts    # MediaPipe Pose Landmarker
│   │   └── mediaPipeObject3DPoseDetector.ts  # MediaPipe + RTMW3D
│   ├── types/                   # TypeScript types
│   └── visualization/           # Canvas drawing
├── docs/                        # API documentation
├── rtmlib-playground-main/      # Next.js demo app
└── README.md

🧩 Detector Types

Object Detection

YOLO - YOLO12n ONNX model (accurate)
MediaPipe - EfficientDet TFLite model (fast)

Pose Estimation

YOLO + RTMW - YOLO12 + RTMW pose (accurate 2D)
MediaPipe - BlazePose with 33 keypoints (fast 2D)
YOLO + RTMW3D - YOLO12 + RTMW3D-X (accurate 3D)
MediaPipe + RTMW3D - EfficientDet + RTMW3D-X (⚡ fastest 3D!)

Animal Detection

ViTPose-S/B/L - Small/Base/Large models for 30 animal species

🐛 Known Issues

YOLOv26n: Requires model re-export (format mismatch)
First run: Slow due to WASM compilation
Mobile: Performance varies by device
WebGPU: Requires browser support (Chrome 113+)
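
Given the WebGPU requirement, one option is to choose the backend at runtime and fall back to WebGL when WebGPU is unavailable. The sketch below only checks whether a `gpu` property is present on the navigator-like object passed in. That is a coarse test: even when `navigator.gpu` exists, `requestAdapter()` can still resolve to null on unsupported hardware. `pickBackend` is a hypothetical helper, not part of rtmlib-ts.

```typescript
// Hypothetical helper, not part of rtmlib-ts.
type BackendChoice = 'webgpu' | 'webgl';

// Returns 'webgpu' when the navigator-like object exposes a non-null `gpu`
// property, otherwise falls back to 'webgl'.
function pickBackend(nav: object | undefined): BackendChoice {
  if (nav !== undefined && 'gpu' in nav && (nav as { gpu?: unknown }).gpu != null) {
    return 'webgpu';
  }
  return 'webgl';
}

// In a browser this would be called as pickBackend(navigator).
console.log(pickBackend({ gpu: {} }), pickBackend({}));
```

The returned value can be passed as the `backend` option shown in the Quick Start examples.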

📝 License

Apache 2.0

🙏 Credits

Based on rtmlib by Tao Jiang
YOLO12 by Ultralytics
RTMW by OpenMMLab
MediaPipe by Google
