Multi-Layer AI Quality Assurance for Content Generation. Multiple LLMs evaluate, score, and approve every output before delivery.
Updated Jun 18, 2026 - Python
A framework for testing LLM-based chatbots in regulated industries (telco, banking, insurance). Covers hallucination detection, prompt injection resistance, response quality scoring, and regression testing.
Code and data for the experiments reported in the research article "Towards Automated FAIR Compliance Diagnosis: Evaluating LLMs on Explanation and Diagnosis Questions", accepted at QKG@ESWC 2026.