
(CVPR'26) APPO: Attention-guided Perception Policy Optimization for Video Reasoning

🚀🚀 Welcome to the repo of APPO! If our project helps you, please give us a star ⭐ on GitHub to support us. 🙏🙏

The core idea behind APPO is to optimize the tokens from different responses that primarily focus on the same crucial video frames (called intra-group perception tokens), which yields fine-grained, token-level reward signals.
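As a rough illustration of the idea above, the following sketch shows one possible way to pick out tokens from different responses that mostly attend to the same crucial frame. It is not the repository's implementation: the function names, the `top_k` and `min_share` parameters, the per-token frame-attention input format, and the rule used to rank frames are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical input format: group_attn[r][t][f] is the attention mass that
# token t of response r places on video frame f (an assumption for this sketch).

def dominant_frame(attn_row):
    """Index of the frame that receives the most attention from one token."""
    return max(range(len(attn_row)), key=lambda f: attn_row[f])

def crucial_frames(group_attn, top_k):
    """Rank frames by total attention over all tokens of all responses (assumed rule)."""
    num_frames = len(group_attn[0][0])
    mass = [0.0] * num_frames
    for response in group_attn:
        for row in response:
            for f, a in enumerate(row):
                mass[f] += a
    ranked = sorted(range(num_frames), key=lambda f: mass[f], reverse=True)
    return set(ranked[:top_k])

def intra_group_perception_tokens(group_attn, top_k=1, min_share=0.5):
    """Map each crucial frame to the (response, token) pairs that mostly attend to it."""
    crucial = crucial_frames(group_attn, top_k)
    groups = defaultdict(list)
    for r, response in enumerate(group_attn):
        for t, row in enumerate(response):
            f = dominant_frame(row)
            total = sum(row)
            # Keep a token only if its dominant frame is crucial and holds
            # at least min_share of the token's attention over frames.
            if f in crucial and total > 0 and row[f] / total >= min_share:
                groups[f].append((r, t))
    # Keep frames shared by tokens from at least two different responses.
    return {f: toks for f, toks in groups.items()
            if len({r for r, _ in toks}) >= 2}

if __name__ == "__main__":
    group_attn = [
        [[0.1, 0.8, 0.1], [0.4, 0.3, 0.3]],  # response 0, two tokens, three frames
        [[0.2, 0.7, 0.1], [0.1, 0.1, 0.8]],  # response 1
    ]
    print(intra_group_perception_tokens(group_attn, top_k=1))
```

In this toy group, the first token of each response puts most of its attention on frame 1, so those two tokens would be the ones singled out for a token-level signal under these assumptions.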

Figure: Perception-Reasoning curves on the SEED-Bench-R1 and Perception-Test benchmarks, quantifying the impact of perception vs. reasoning ability on overall performance.

📰 News

[2026.03.18] Released the training and evaluation code of APPO.
[2026.02.20] APPO has been accepted to CVPR 2026.

🛠️ Requirements and Installation

Basic Dependencies:

Python == 3.10
trl == 0.23.1
transformers == 4.52.3
deepspeed == 0.16.4
accelerate == 1.8.1

Install required packages:

git clone git@github.com:GeWu-Lab/APPO.git
cd APPO
pip install -r requirements.txt
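After installation, a quick check that the environment matches the pinned versions in the Basic Dependencies list can save debugging time. The script below is a minimal sketch using only the Python standard library; it is not part of the repository, and the helper names `installed_version` and `find_mismatches` are hypothetical.

```python
import sys
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the Basic Dependencies list in this README.
PINS = {
    "trl": "0.23.1",
    "transformers": "4.52.3",
    "deepspeed": "0.16.4",
    "accelerate": "1.8.1",
}

def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

def find_mismatches(pins, lookup=installed_version):
    """Return {package: (expected, found)} for every pin that is not satisfied."""
    mismatches = {}
    for package, expected in pins.items():
        found = lookup(package)
        if found != expected:
            mismatches[package] = (expected, found)
    return mismatches

if __name__ == "__main__":
    if sys.version_info[:2] != (3, 10):
        print(f"Python 3.10 expected, found "
              f"{sys.version_info.major}.{sys.version_info.minor}")
    for package, (expected, found) in find_mismatches(PINS).items():
        print(f"{package}: expected {expected}, found {found or 'not installed'}")
```

If the script prints nothing, the Python version and the four pinned packages match the list above.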

🗝️ RL Training

Training on SEED-Bench-R1 or Video-R1 data:

bash scripts/qwen2_5_vl_rl.sh

Training on NeXT-GQA data:

bash scripts/qwen2_5_vl_rl_tvg.sh

📑 Citation

If you find APPO useful for your research and applications, please cite using this BibTeX:

@article{du2026appo,
  title={APPO: Attention-guided Perception Policy Optimization for Video Reasoning},
  author={Du, Henghui and Zhou, Chang and Chen, Xi and Hu, Di},
  journal={arXiv preprint arXiv:2602.23823},
  year={2026}
}

🔒 License

This project is released under the Apache 2.0 license as found in the LICENSE file. Please get in touch with us if you find any potential violations.
