YOLOE (You Only Look Once, Extended) brings state-of-the-art "anything" detection capabilities to X-AnyLabeling. Based on the research paper YOLOE, this model enables real-time detection and segmentation of any object you can describe or indicate visually, eliminating traditional category constraints.
Create a dedicated conda environment for YOLOE:
conda create -n yoloe python=3.10 -y
conda activate yoloe
Clone and install the YOLOE repository:
git clone https://github.com/THU-MIG/yoloe.git
cd yoloe
pip install -r requirements.txt
Note: For complete YOLOE documentation, visit the official repository.
Install X-AnyLabeling in the same environment:
cd ..
git clone https://github.com/CVHub520/X-AnyLabeling.git
cd X-AnyLabeling
pip install -r requirements.txt
Note: See our installation guide (English | Chinese) for detailed instructions.
Launch the application:
python anylabeling/app.py
YOLOE supports three distinct detection modes, each optimized for different use cases:
Text Prompting: Specify target objects using natural language descriptions.
Usage:
- Enter object names in the text field (e.g., person,car,bicycle)
- Separate multiple classes with periods or commas: person.car.bicycle or dog,cat,tree
- Click Send to initiate detection
This mode leverages YOLOE's text understanding capabilities to identify objects based on semantic descriptions.
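The separator rule above (periods or commas) can be illustrated with a short sketch. The function name `parse_text_prompt` is hypothetical and is not taken from X-AnyLabeling; whitespace trimming and duplicate removal are assumptions added for tidiness, not documented behavior.

```python
import re


def parse_text_prompt(prompt: str) -> list[str]:
    """Split a text prompt on periods or commas into class names.

    Hypothetical helper: mirrors the documented separators only.
    """
    # Split on either separator, trim whitespace, and drop empty pieces
    # (for example, the one produced by a trailing comma).
    parts = re.split(r"[.,]", prompt)
    names = [p.strip() for p in parts if p.strip()]
    # Assumption: duplicates are removed, keeping the order typed.
    return list(dict.fromkeys(names))


print(parse_text_prompt("person.car.bicycle"))  # ['person', 'car', 'bicycle']
print(parse_text_prompt("dog, cat,tree,"))      # ['dog', 'cat', 'tree']
```

Because the period is itself a separator, a class name containing a period could not be expressed under this rule.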
Visual Prompting: Guide detection by marking examples directly on the image.
Usage:
- Click +Rect to activate drawing mode
- Draw bounding boxes around target objects or regions of interest
- Add multiple prompts for different object instances
- Click Send to process the visual prompts, or Clear to remove them all
Visual prompts help YOLOE understand object characteristics through spatial context and appearance.
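As an illustration of what a drawn rectangle might reduce to, the sketch below turns two drag corners into a clamped `(x1, y1, x2, y2)` box. This is an assumption about a typical box representation, not X-AnyLabeling's internal format, and `rect_to_xyxy` is a hypothetical name.

```python
def rect_to_xyxy(p1, p2, img_w, img_h):
    """Convert two drag corners into an (x1, y1, x2, y2) box.

    Hypothetical helper: corners may be given in any order and may
    fall partly outside the image.
    """
    (xa, ya), (xb, yb) = p1, p2
    # Order the coordinates so (x1, y1) is top-left and (x2, y2) bottom-right.
    x1, x2 = sorted((xa, xb))
    y1, y2 = sorted((ya, yb))
    # Clamp the box to the image bounds.
    x1, x2 = max(0, x1), min(img_w, x2)
    y1, y2 = max(0, y1), min(img_h, y2)
    if x2 <= x1 or y2 <= y1:
        raise ValueError("box has no area inside the image")
    return (x1, y1, x2, y2)


print(rect_to_xyxy((300, 200), (100, 50), 640, 480))  # (100, 50, 300, 200)
```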
Prompt-Free: Detect objects from a predefined class vocabulary without explicit prompting.
Activation: Click Send with no text input and no visual prompts.
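The activation rule can be summarized as a small decision function. This is a sketch only: `select_mode` and the returned mode strings are hypothetical, and the precedence of text over visual prompts when both are present is an assumption, since the documentation does not describe that case.

```python
def select_mode(text: str, visual_prompts: list) -> str:
    """Pick a detection mode from what the user supplied before clicking Send."""
    if text.strip():
        # Assumption: text wins if both text and visual prompts exist.
        return "text"
    if visual_prompts:
        return "visual"
    # Documented case: no text input and no visual prompts.
    return "prompt_free"


print(select_mode("person,car", []))         # text
print(select_mode("", [(10, 10, 50, 50)]))   # visual
print(select_mode("   ", []))                # prompt_free
```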
Class Configuration: Customize the detection vocabulary in your YOLOE configuration file (e.g., yoloe_v8l.yaml):
Option 1: List Format
type: yoloe
...
classes:
  - person
  - vehicle
  - animal
  - furniture
Option 2: Dictionary Format
type: yoloe
...
classes:
  0: person
  1: vehicle
  2: animal
  3: furniture
Option 3: External File
type: yoloe
...
classes: /path/to/classes.txt
Where classes.txt contains one class per line:
person
vehicle
animal
furniture
Default Behavior: When no classes field is specified, X-AnyLabeling falls back to a comprehensive default vocabulary.
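To show how the three `classes` formats relate, here is a minimal sketch that normalizes an already-parsed `classes` value into a plain list of names. `normalize_classes` is a hypothetical helper, not X-AnyLabeling's loader; returning `None` to signal "use the default vocabulary" is an assumption made for illustration.

```python
from pathlib import Path


def normalize_classes(classes):
    """Return a list of class names from any of the three config formats."""
    if classes is None:
        # No `classes` field: signal that the default vocabulary applies.
        return None
    if isinstance(classes, dict):
        # Dictionary format: order the names by their integer index.
        return [str(classes[k]) for k in sorted(classes, key=int)]
    if isinstance(classes, (list, tuple)):
        # List format: keep the order as written.
        return [str(name) for name in classes]
    if isinstance(classes, (str, Path)):
        # External file: one class per line, blank lines skipped.
        lines = Path(classes).read_text(encoding="utf-8").splitlines()
        return [ln.strip() for ln in lines if ln.strip()]
    raise TypeError(f"unsupported classes value: {type(classes).__name__}")


print(normalize_classes(["person", "vehicle"]))        # ['person', 'vehicle']
print(normalize_classes({1: "vehicle", 0: "person"}))  # ['person', 'vehicle']
```

All three options describe the same thing, an ordered list of names, so the choice between them is mostly about convenience: the external file suits long vocabularies shared across several configuration files.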
Batch Processing: Apply the same detection parameters to multiple images in one run.
Activation: Press Ctrl+M (Windows/Linux) or Cmd+M (macOS)
Supported Modes:
- Text Prompting: Enter text prompts before starting batch processing to apply them across all images
- Prompt-Free: Leave the text field empty to use the configured class vocabulary for all images
Note: Visual prompting is not available in batch mode due to its interactive nature.
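Conceptually, batch mode applies one prompt setting to every image. The sketch below illustrates that loop under stated assumptions: `run_batch`, the `detect` callback, and the list of image suffixes are all hypothetical and do not reflect X-AnyLabeling's actual implementation.

```python
from pathlib import Path

# Assumption: a small set of common image extensions, for illustration only.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def run_batch(image_dir, detect, text_prompt=""):
    """Apply the same prompt setting to every image in a folder.

    `detect(image_path, text_prompt)` is a hypothetical callback standing in
    for the model call. An empty `text_prompt` corresponds to prompt-free
    mode. Visual prompts are deliberately absent, since they are drawn per
    image and are not available in batch mode.
    """
    results = {}
    for path in sorted(Path(image_dir).iterdir()):
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files
        results[path.name] = detect(path, text_prompt)
    return results
```

A call such as `run_batch("images", my_detector, "person,car")` would reuse the same text prompt for each file, while `run_batch("images", my_detector)` would leave the prompt empty for prompt-free detection; `my_detector` is a placeholder for whatever function performs inference.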
