Depth estimation models are typically used to approximate the relative distance of every pixel in an image from the camera, also known as depth.
X-AnyLabeling offers a range of depth models for use, including Depth Anything V1 and Depth Anything V2.
- Depth Anything V1 is a highly practical solution for robust monocular depth estimation by training on a combination of 1.5M labeled images and 62M+ unlabeled images.
- Depth Anything V2 significantly outperforms its predecessor, V1, in terms of fine-grained detail and robustness. In comparison to SD-based models, V2 boasts faster inference speed, a reduced number of parameters, and enhanced depth accuracy.
(Demo video: girl.mov)
- Import your image (Ctrl+I) or video (Ctrl+O) file into X-AnyLabeling.
- Select and load a Depth-Anything model, or choose from other available depth estimation models.
- Initiate the process by clicking Run (i). Once you've verified that everything is set up correctly, use the keyboard shortcut Ctrl+M to process all images in one go.
Once processing completes, the output is automatically stored in an x-anylabeling-depth subdirectory within the same folder as your original image.
Tip
Two output modes are supported: grayscale and color. You can switch between these modes by modifying the render_mode parameter in the respective configuration file.
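For instance, assuming the grayscale mode is selected with the value grayscale (the example configuration below only shows color), the relevant line in the model's configuration file would look like:

```yaml
render_mode: grayscale  # grayscale heatmap output; use "color" for the color mode
```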
By default, depth estimation models output relative depth (normalized 0-1 values), which only indicates which areas are closer or farther. To convert these values to real-world distances, you can enable depth calibration by adding the following parameters to your configuration file:
min_depth: 0.5        # Minimum distance in meters
max_depth: 20.0       # Maximum distance in meters
save_raw_depth: true  # Save calibrated depth as a .npy file

Example Configuration:
type: depth_anything_v2
name: depth_anything_v2_vit_b
display_name: Depth-Anything-V2-Base
model_path: depth_anything_v2_vitb.onnx
render_mode: color
min_depth: 1.0
max_depth: 50.0
save_raw_depth: true

When enabled, the output will include:
- Visualization image: color or grayscale heatmap (same as before)
- *_depth.npy file: calibrated depth map with real-world distances in meters
You can load and query the calibrated depth data using:
import numpy as np
depth_map = np.load("image_depth.npy")
distance = depth_map[y, x]  # Get distance at pixel (x, y) in meters

Note
The min_depth and max_depth values should be set according to your actual scene. For indoor scenes, typical values might be 0.5-10m, while outdoor scenes might use 5-100m. Leave these parameters unset to use default visualization-only mode.
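To illustrate what calibration does with these bounds, here is a minimal sketch of mapping normalized relative depth into the [min_depth, max_depth] range. A simple linear mapping is assumed for illustration; the exact calibration formula X-AnyLabeling applies internally may differ, and the function name calibrate_depth is hypothetical.

```python
import numpy as np

def calibrate_depth(rel_depth, min_depth, max_depth):
    """Map normalized relative depth (0-1) to metric distance in meters.

    Linear mapping assumed for illustration; X-AnyLabeling's internal
    calibration may use a different scheme.
    """
    rel = np.clip(rel_depth, 0.0, 1.0)
    return min_depth + rel * (max_depth - min_depth)

# A toy normalized relative depth map (as a model might output)
rel = np.array([[0.0, 0.5],
                [0.25, 1.0]])

# Using the bounds from the example configuration above
metric = calibrate_depth(rel, min_depth=1.0, max_depth=50.0)
print(metric)  # values span 1.0 m (nearest) to 50.0 m (farthest)
```

With min_depth=1.0 and max_depth=50.0, a relative value of 0.0 maps to 1.0 m and 1.0 maps to 50.0 m, which is why the bounds should match the actual scene scale.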


