🚀🚀 Welcome to the repo of APPO! If our project helps you, please give us a star ⭐ on GitHub to support us. 🙏🙏


The core idea behind APPO is to optimize the tokens from different responses that primarily focus on the same crucial video frames (called intra-group perception tokens), which yields fine-grained, token-level reward signals.
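To make the notion of intra-group perception tokens concrete, here is a toy sketch. It is not the APPO implementation: the function names, the thresholds `min_responses` and `min_mass`, and the assumption that every token comes with an attention distribution over video frames are all hypothetical, chosen only to illustrate "tokens from different responses that focus on the same frames".

```python
from collections import Counter


def top_frame(attn):
    """Index of the frame that receives the most attention from one token."""
    return max(range(len(attn)), key=lambda i: attn[i])


def perception_token_masks(group_attn, min_responses=2, min_mass=0.5):
    """Toy selection of intra-group perception tokens (illustrative only).

    group_attn[r][t] is an assumed attention distribution over frames
    for token t of response r within one sampled group.
    """
    # Count how many distinct responses contain a token focused on each frame.
    frame_votes = Counter()
    for response in group_attn:
        focused = {top_frame(a) for a in response if max(a) >= min_mass}
        frame_votes.update(focused)

    # A frame counts as "shared" when enough responses focus on it.
    shared = {f for f, n in frame_votes.items() if n >= min_responses}

    # Mark tokens that focus strongly on a shared frame.
    return [
        [max(a) >= min_mass and top_frame(a) in shared for a in response]
        for response in group_attn
    ]


group = [
    [[0.1, 0.8, 0.1], [0.4, 0.3, 0.3]],  # response 0: two tokens, three frames
    [[0.2, 0.7, 0.1], [0.1, 0.1, 0.8]],  # response 1
]
masks = perception_token_masks(group)
print(masks)  # [[True, False], [True, False]]
```

In this example only frame 1 is attended by tokens in both responses, so the first token of each response is marked. A mask like this is where a token-level signal could be applied; how APPO actually weights those tokens is described in the paper and the training code.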

Figure: Perception-Reasoning curves on the SEED-Bench-R1 and Perception-Test benchmarks, quantifying the impact of perception vs. reasoning ability on overall performance.

📰 News

  • [2026.03.18] Released the training and evaluation code for APPO.
  • [2026.02.20] APPO has been accepted to CVPR 2026.

🛠️ Requirements and Installation

Basic Dependencies:

  • Python == 3.10
  • trl == 0.23.1
  • transformers == 4.52.3
  • deepspeed == 0.16.4
  • accelerate == 1.8.1

Install required packages:

git clone git@github.com:GeWu-Lab/APPO.git
cd APPO
pip install -r requirements.txt
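
After installing, it can help to confirm that the environment matches the pinned versions before launching a long training run. The script below is a minimal sketch and not part of the APPO repo; `check_environment` is a hypothetical helper, and it assumes the installed distribution names match the package names listed above.

```python
import sys
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the "Basic Dependencies" list in this README.
PINNED = {
    "trl": "0.23.1",
    "transformers": "4.52.3",
    "deepspeed": "0.16.4",
    "accelerate": "1.8.1",
}


def check_environment(pins=PINNED):
    """Return {name: (expected, found)} for each missing or mismatched package."""
    problems = {}
    for name, expected in pins.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None  # package is not installed in this environment
        if found != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    if sys.version_info[:2] != (3, 10):
        print(f"Python 3.10 expected, found {sys.version_info.major}.{sys.version_info.minor}")
    for name, (expected, found) in check_environment().items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

The comparison is an exact string match, so a build with a local version suffix would be reported as a mismatch even if it is compatible.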

🗝️ RL Training

Training on SEED-Bench-R1 or Video-R1 data:

bash scripts/qwen2_5_vl_rl.sh

Training on NeXT-GQA data:

bash scripts/qwen2_5_vl_rl_tvg.sh

📑 Citation

If you find APPO useful for your research and applications, please cite using this BibTeX:

@article{du2026appo,
  title={APPO: Attention-guided Perception Policy Optimization for Video Reasoning},
  author={Du, Henghui and Zhou, Chang and Chen, Xi and Hu, Di},
  journal={arXiv preprint arXiv:2602.23823},
  year={2026}
}

🔒 License

This project is released under the Apache 2.0 license; see the LICENSE file. Please get in touch with us if you find any potential violations.