README.md: 2 additions, 0 deletions
## ✨ Latest Updates

📆 [**2025-06-05**] : The code for **Oriented GLIP**, **Oriented GroundingDINO**, and **Oriented ViLD** is now available!

📆 [**2025-02-08**] : The code for **Oriented CastDet** is now available! 🎉 CastDet now supports Open-vocabulary Oriented Aerial Object Detection. Stay tuned: **Oriented GLIP**, **Oriented GroundingDINO**, and **Oriented ViLD** are coming soon! 🚀

📆 [**2024-11-04**] : Our paper ["Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation"](https://arxiv.org/abs/2411.02057) is now available on arXiv!
# [Oriented ViLD] Open-Vocabulary Detection via Vision and Language Knowledge Distillation
-[Open-Vocabulary Detection via Vision and Language Knowledge Distillation](https://arxiv.org/abs/2104.13921)
## Introduction
Open-vocabulary object detection detects objects described by arbitrary text inputs. The fundamental challenge is the availability of training data. Existing object detection datasets only contain hundreds of categories, and it is costly to scale further. To overcome this challenge, we propose ViLD. Our method distills the knowledge from a pretrained open-vocabulary image classification model (teacher) into a two-stage detector (student). Specifically, we use the teacher model to encode category texts and image regions of object proposals. Then we train a student detector, whose region embeddings of detected boxes are aligned with the text and image embeddings inferred by the teacher.
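
The alignment described above can be illustrated with a small, dependency-free sketch. It is not the code in this repository: the function names, the temperature value, and the choice of a softmax cross-entropy for the text branch and an L1 distance for the image branch are all assumptions made for illustration, and the embeddings are assumed to be non-zero vectors.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def text_alignment_loss(region_emb, text_embs, label, temperature=0.1):
    """Cross-entropy over temperature-scaled cosine similarities.

    region_emb: student region embedding for one proposal.
    text_embs:  teacher text embeddings, one per category.
    label:      index of the ground-truth category.
    temperature is a hypothetical value, not taken from the source.
    """
    logits = [cosine(region_emb, t) / temperature for t in text_embs]
    # Numerically stable log-sum-exp.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[label]


def image_alignment_loss(region_emb, teacher_image_emb):
    """Mean L1 distance between normalized student and teacher embeddings."""
    s = l2_normalize(region_emb)
    t = l2_normalize(teacher_image_emb)
    return sum(abs(x - y) for x, y in zip(s, t)) / len(s)


# Toy example: two categories in a 2-D embedding space.
text_embs = [[1.0, 0.0], [0.0, 1.0]]
region = [0.9, 0.1]
teacher_image = [1.0, 0.0]

loss_text = text_alignment_loss(region, text_embs, label=0)
loss_image = image_alignment_loss(region, teacher_image)
```

In this toy setup the text loss is small when the region embedding points toward the text embedding of its labeled category, and the image loss is zero when the student embedding matches the teacher's image embedding in direction.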

Thanks to the wonderful open-source projects [MMRotate](https://github.com/open-mmlab/mmrotate) and [ViLD](https://github.com/tensorflow/tpu/tree/master/models/official/detection/projects/vild)!
## Citation
```
// Oriented ViLD (this repo)
@misc{li2024exploitingunlabeleddatamultiple,
  title={Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation},
  author={Yan Li and Weiwei Guo and Xue Yang and Ning Liao and Shaofeng Zhang and Yi Yu and Wenxian Yu and Junchi Yan},
  year={2024},
  eprint={2411.02057},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2411.02057},
}

// ViLD (Horizontal detection)
@article{gu2021open,
  title={Open-vocabulary object detection via vision and language knowledge distillation},
  author={Gu, Xiuye and Lin, Tsung-Yi and Kuo, Weicheng and Cui, Yin},