# training-time-alignment

Here is 1 public repository matching this topic...

Iamyulx/behavior-controlled-rlhf

A training-time alignment framework that integrates safety constraints directly into the RLHF loop — achieving full safety convergence in 7 epochs

nlp reinforcement-learning pytorch behavior-control rlhf reward-model llm-alignment training-time-alignment

Updated Apr 15, 2026
Python
