AI-assisted creation of unfoldingWord Book Packages (BP) -- ULT, UST, translation notes, translation questions, chapter intros, and word-level alignments -- driven entirely by Claude Code skills.
Companion repo: bp-assistant -- Zulip bot that orchestrates these skills via the Claude Agent SDK.
The system runs as a Zulip bot inside a Docker container on an OCI ARM64 server. Users trigger pipelines via Zulip messages; Claude Code executes the appropriate skills. It can also run locally under Windows/WSL for development.
Design philosophy: code where verifiable, prompts where judgment is needed. Deterministic scripts handle mechanical, verifiable tasks (USFM parsing, Hebrew quote extraction, TSV splitting/merging, ID generation, git operations, Door43 push). LLM prompts handle semantic decisions requiring linguistic judgment (translation, issue identification, note writing). This split evolved from experience -- AI was unreliable and slow at deterministic tasks (confabulating git results, botching file operations), while scripts couldn't handle the contextual judgment calls. The test-poc skill exists to A/B test where the boundary falls for new tasks.
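As an illustration of the deterministic side of this split, ID generation can be a small pure function that is easy to test. This is a minimal sketch, not the repo's actual script: it assumes note IDs are four lowercase alphanumeric characters beginning with a letter, and the names generateNoteId and isValidNoteId, as well as the injectable random source, are hypothetical.

```javascript
// Hypothetical helpers: the repo's real ID script may differ.
const LETTERS = 'abcdefghijklmnopqrstuvwxyz';
const ALNUM = LETTERS + '0123456789';

// Returns a 4-character ID that is not already in `existing` and records it there.
// `rand` is injectable so tests can make the output deterministic.
function generateNoteId(existing, rand = Math.random) {
  const pick = (chars) => chars[Math.floor(rand() * chars.length)];
  for (let attempt = 0; attempt < 1000; attempt++) {
    const id = pick(LETTERS) + pick(ALNUM) + pick(ALNUM) + pick(ALNUM);
    if (!existing.has(id)) {
      existing.add(id);
      return id;
    }
  }
  throw new Error('Could not find a free ID after 1000 attempts');
}

// Mechanical format check of the kind a script can verify without an LLM.
function isValidNoteId(id) {
  return /^[a-z][a-z0-9]{3}$/.test(id);
}
```

Because the function takes the set of existing IDs as input, collisions are ruled out by construction rather than by asking a model to remember what it has already used.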
# Clone the repo
git clone https://github.com/unfoldingWord/bp-assistant-skills.git
cd bp-assistant-skills
# Install Node dependencies (used by utility scripts)
npm install
# Fetch reference data (Hebrew Bible, Translation Words, etc.)
# These are gitignored. When running via the bot, use the "setup data" DM command.
# For local use, run the curation script directly:
node .claude/skills/utilities/scripts/curate-published-data.mjs
# Other data directories (en_tw, ta-flat, etc.) should be
# cloned from their respective Door43 repos into data/

Skills are designed to run under Claude Code. Start a session in the repo root:

claude

Then ask Claude to run any skill, e.g., "generate ULT for PSA 23" or "write notes for HAB 3".
When deployed alongside the bp-assistant bot, skills are invoked automatically via Zulip commands. The bot mounts this repo at /workspace inside Docker and uses the Claude Agent SDK to execute skills.
The full book package pipeline (makeBP) orchestrates these stages with maximum parallelization:
| Stage | Skill | Description |
|---|---|---|
| 1. Literal Translation | ULT-gen | Hebrew USFM to literal English (ULT) |
| 2. Issue Identification | issue-identification | Find translation issues across 94 issue types |
| 3. Simplified Translation | UST-gen | T4T to meaning-based English (UST) |
| 4. Chapter Introduction | chapter-intro | Translator-oriented chapter intros |
| 5. Translation Notes | tn-writer | Generate notes from identified issues |
| 6. Translation Questions | tq-writer | Update Q&A content for current ULT/UST |
| 7. ULT Alignment | ULT-alignment | Word-level Hebrew-to-ULT alignment |
| 8. UST Alignment | UST-alignment | Phrase-level Hebrew-to-UST alignment |
| 9. Repo Insertion | repo-insert | Insert into Door43 repos, commit, create PRs |
The initial-pipeline skill runs stages 1-3 as a coordinated 6-wave pipeline with an adversarial issue identification loop and ULT feedback.
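One way to think about parallelizing the stages is to group them into waves, where every stage in a wave depends only on stages from earlier waves. The sketch below is illustrative only: the dependency edges in STAGE_DEPS are placeholders chosen for the example (the actual ordering used by makeBP and initial-pipeline is not documented here), and planWaves is a hypothetical name.

```javascript
// Hypothetical dependency map: these edges are assumptions, not the repo's real ordering.
const STAGE_DEPS = {
  'ULT-gen': [],
  'issue-identification': ['ULT-gen'],
  'UST-gen': ['ULT-gen', 'issue-identification'],
  'chapter-intro': ['ULT-gen', 'UST-gen'],
  'tn-writer': ['issue-identification', 'UST-gen'],
  'tq-writer': ['ULT-gen', 'UST-gen'],
  'ULT-alignment': ['ULT-gen'],
  'UST-alignment': ['UST-gen'],
  'repo-insert': ['chapter-intro', 'tn-writer', 'tq-writer', 'ULT-alignment', 'UST-alignment'],
};

// Group stages into waves. Every stage in a wave has all of its dependencies
// satisfied by earlier waves, so the stages within one wave can run in parallel.
function planWaves(deps) {
  const done = new Set();
  const waves = [];
  let pending = Object.keys(deps);
  while (pending.length > 0) {
    const ready = pending.filter((stage) => deps[stage].every((d) => done.has(d)));
    if (ready.length === 0) {
      // Nothing can start: there is a cycle or a dependency that is not a known stage.
      throw new Error('Unresolvable dependencies among: ' + pending.join(', '));
    }
    waves.push(ready);
    ready.forEach((stage) => done.add(stage));
    pending = pending.filter((stage) => !done.has(stage));
  }
  return waves;
}
```

Each inner array returned by planWaves(STAGE_DEPS) could then be dispatched concurrently, for example with Promise.all, before moving on to the next wave.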
All skills live in .claude/skills/. Each has a SKILL.md defining the prompt and a scripts/ directory for deterministic tooling.
- ULT-gen -- Hebrew USFM to highly literal English preserving source form and structure
- UST-gen -- T4T to natural meaning-based English with implicit information made explicit
- chapter-intro -- Translator-oriented chapter introductions
- issue-identification -- 94 translation issue types (figures of speech, abstract nouns, grammatical patterns, cultural concepts), each with its own skill file
- deep-issue-id -- Multi-agent adversarial issue identification: 2 domain analysts (structure + rhetoric) + challenger. Supports --verses for chunked analysis
- post-edit-review -- Diff-based adaptation of issues after human edits to ULT/UST
- tn-writer -- Deterministic prep script + LLM note generation following the style guide
- tn-quality-check -- Full semantic review of generated notes. When run from the pipeline, mechanical checks (ID format, Hebrew quotes, AT syntax, bold accuracy) are pre-run in Node.js before the skill is invoked; Claude reads the findings and does one fix pass, with no re-check loop.
- parallel-batch -- Split long chapters into verse-range chunks, run tn-writer in parallel, merge results (respects PSA 119 stanza boundaries)
- tq-writer -- Update translation questions to align with current ULT/UST
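The chunk-and-merge idea behind parallel-batch can be sketched as two small functions. This is an assumption-laden illustration rather than the skill's real code: chunkVerses and mergeChunks are hypothetical names, the { range, rows } result shape is invented for the example, and stanza handling is modeled simply as rounding the chunk size down to a multiple of the stanza length (8 verses per stanza in PSA 119).

```javascript
// Split verses 1..totalVerses into consecutive [start, end] ranges.
// When stanzaSize > 1, the chunk size is rounded down to a multiple of it
// (but never below one stanza) so that no chunk crosses a stanza boundary.
function chunkVerses(totalVerses, chunkSize, stanzaSize = 1) {
  if (totalVerses < 1 || chunkSize < 1 || stanzaSize < 1) {
    throw new RangeError('totalVerses, chunkSize and stanzaSize must be >= 1');
  }
  const size = Math.max(stanzaSize, Math.floor(chunkSize / stanzaSize) * stanzaSize);
  const ranges = [];
  for (let start = 1; start <= totalVerses; start += size) {
    ranges.push([start, Math.min(start + size - 1, totalVerses)]);
  }
  return ranges;
}

// Merge per-chunk outputs back into verse order, regardless of the order
// in which the parallel workers finished.
function mergeChunks(results) {
  return results
    .slice()
    .sort((a, b) => a.range[0] - b.range[0])
    .flatMap((r) => r.rows);
}
```

For example, chunkVerses(176, 20, 8) splits PSA 119 into 16-verse chunks, each covering exactly two stanzas.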
- ULT-alignment -- Word-level Hebrew-to-ULT; AI produces index-based JSON, script converts to aligned USFM
- UST-alignment -- Phrase-level Hebrew-to-UST with radical restructuring and implied information handling
- align-all-parallel -- Run both alignments in parallel as subagents
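To make the "index-based JSON, then script" split concrete, here is a rough sketch of such a conversion. The JSON shape (hebrew, english, alignments with heb/eng index arrays) and the function names are assumptions for illustration; the real schema used by ULT-alignment may differ. The output uses USFM 3 alignment milestones (\zaln-s ... \zaln-e) wrapping \w word markers, and for brevity the sketch emits only aligned groups, not unaligned words or punctuation.

```javascript
// For each position, report which occurrence of the token it is and how many there are.
function occurrenceInfo(tokens) {
  const totals = new Map();
  tokens.forEach((t) => totals.set(t, (totals.get(t) || 0) + 1));
  const seen = new Map();
  return tokens.map((t) => {
    seen.set(t, (seen.get(t) || 0) + 1);
    return { occurrence: seen.get(t), occurrences: totals.get(t) };
  });
}

// Convert a hypothetical index-based alignment object into aligned USFM.
function toAlignedUsfm({ hebrew, english, alignments }) {
  const check = (i, list, kind) => {
    // Indices can be validated mechanically, which is the point of the index-based format.
    if (!Number.isInteger(i) || i < 0 || i >= list.length) {
      throw new RangeError(`${kind} index out of range: ${i}`);
    }
  };
  const hebOcc = occurrenceInfo(hebrew.map((h) => h.content));
  const engOcc = occurrenceInfo(english);
  return alignments.map(({ heb, eng }) => {
    heb.forEach((i) => check(i, hebrew, 'Hebrew'));
    eng.forEach((j) => check(j, english, 'English'));
    // One opening milestone per Hebrew word in the group (nested for multi-word groups).
    const open = heb.map((i) => {
      const h = hebrew[i];
      const o = hebOcc[i];
      return `\\zaln-s |x-strong="${h.strong}" x-lemma="${h.lemma}" x-morph="${h.morph}" ` +
        `x-occurrence="${o.occurrence}" x-occurrences="${o.occurrences}" x-content="${h.content}"\\*`;
    }).join('');
    const words = eng.map((j) => {
      const o = engOcc[j];
      return `\\w ${english[j]}|x-occurrence="${o.occurrence}" x-occurrences="${o.occurrences}"\\w*`;
    }).join(' ');
    // One closing milestone per opening milestone.
    const close = heb.map(() => '\\zaln-e\\*').join('');
    return open + words + close;
  }).join(' ');
}
```

Having the model emit only indices keeps its output small and checkable, while occurrence counting and milestone syntax stay in deterministic code.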
- makeBP -- End-to-end book package for a chapter with maximum parallelization
- initial-pipeline -- 6-wave coordinated pipeline (ULT, issues, UST) with adversarial feedback loop
- pipeline-overview -- Guides to the appropriate skill for each stage
- editor-compare -- Compare editor-edited ULT/UST against AI output; identifies preferences and feeds them back into glossary and skill instructions. Protects canonical files from modification with a write-guard and weekly refresh check.
- gemini-review -- Independent Gemini-based second-opinion reviewer across all pipeline stages
- test-poc -- A/B comparison of prompt-over-code vs previous script workflows
- tn-quick -- Quick scratchpad note writing outside the full pipeline
- repo-insert -- Insert ULT, UST, or TN content into Door43 repo clones, commit, create PRs via Gitea API
- repo-verify -- Verify that a push landed on Door43 by comparing local and remote content
- create-issue-description -- Create or update issue identification skill files
- utilities -- Shared scripts: USFM parsing (usfm-js), Proskomma queries, alignment validation, Hebrew Bible fetching, Strong's index building, curly quotes, and more
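Curly-quote conversion is a typical example of the small mechanical jobs the utilities scripts cover. The following is a simplified sketch, not the repo's script: curlyQuotes is a hypothetical name, and the rule it applies (a quote is opening at the start of the text or after whitespace, an opening bracket, or another opening quote; otherwise closing) is an assumption that mishandles leading apostrophes such as 'tis.

```javascript
// Convert straight quotes to curly quotes using a simple positional rule.
// Opening: at the start of the text, or after whitespace, an opening bracket,
// or an opening curly quote. Everything else is treated as closing.
function curlyQuotes(text) {
  return text
    .replace(/(^|[\s(\[{\u201C\u2018])"/g, '$1\u201C') // opening double quote
    .replace(/"/g, '\u201D')                           // remaining doubles are closing
    .replace(/(^|[\s(\[{\u201C\u2018])'/g, '$1\u2018') // opening single quote
    .replace(/'/g, '\u2019');                          // closing single quote or apostrophe
}
```

Double quotes are converted first so that a single quote nested directly inside an opening double quote is recognized as opening.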
.claude/
  skills/ # All skill definitions and scripts
  agents/ # Subagent definitions (issue-identification)
  hooks/ # Git hooks (check-untracked.sh)
data/ # Reference data (not committed)
  hebrew_bible/ # Hebrew USFM with Strong's numbers and morphology
  en_tw/ # English Translation Words
  published-tns/ # Published Translation Notes
  published_ult/ # Published ULT (Hebrew aligned)
  published_ust/ # Published UST (Hebrew aligned)
  t4t/ # Translation for Translators
  ta-flat/ # Translation Academy (flat format)
  glossary/ # Hebrew vocabulary, biblical phrases, measurements, sacrifices
  editor-feedback/ # Editor corrections and preference tracking
  cache/ # Generated indexes (Strong's, etc.)
  quick-ref/ # Accumulated ULT-gen decisions (CSV)
output/ # Generated files, organized by book subfolder
  AI-ULT/ # Generated ULT
  AI-UST/ # Generated UST
  issues/ # Identified translation issues (TSV)
  notes/ # Generated translation notes (TSV)
  tq/ # Translation questions
  quality/ # Quality check reports
  editor-compare/ # Editor comparison reports
In production, this repo is mounted at /workspace inside the bp-assistant Docker container. The bot uses the Claude Agent SDK to invoke skills and push results to Door43. See the companion repo for bot architecture, routing, and Docker setup.
Skills draw from these sources (in priority order):
- Issues Resolved (data/issues_resolved.txt) -- Content team decisions (final authority)
- TN Templates (data/templates.csv) -- Official note templates
- Canonical Glossary CSVs (read-only; never modified by AI):
  - data/glossary/hebrew_ot_glossary.csv -- Standard ULT/UST renderings
  - data/glossary/psalms_reference.csv -- Psalms-specific terms
  - data/glossary/sacrifice_terminology.csv -- Sacrifice and offering vocabulary
  - data/glossary/biblical_measurements.csv -- Weights, volumes, distances
  - data/glossary/biblical_phrases.csv -- Grammatical and prophetic phrases
- Prior Rendering Decisions (data/quick-ref/ult_decisions.csv, ust_decisions.csv) -- Accumulated per-run decisions
- Published Translation Notes -- Human-identified examples
- Translation Academy (data/ta-flat/) -- Definitions and explanations
- Editor Feedback (data/editor-feedback/) -- Accumulated preferences from editor-compare runs
The 5 glossary CSVs and issues_resolved.txt are protected from modification. editor-compare enforces this with a write-guard; all generation skills (ULT-gen, UST-gen, tn-writer) treat them as read-only references.
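A write-guard of this kind can be as simple as a path check that runs before any file write. The sketch below is a guess at the general shape, not editor-compare's actual implementation: the file paths come from the source list in this README, while normalizeRepoPath and assertWritable are hypothetical names.

```javascript
// Protected, repo-relative paths (the 5 glossary CSVs plus issues_resolved.txt).
const PROTECTED_FILES = new Set([
  'data/issues_resolved.txt',
  'data/glossary/hebrew_ot_glossary.csv',
  'data/glossary/psalms_reference.csv',
  'data/glossary/sacrifice_terminology.csv',
  'data/glossary/biblical_measurements.csv',
  'data/glossary/biblical_phrases.csv',
]);

// Normalize a repo-relative path: unify separators, drop "." segments,
// and resolve ".." so that disguised paths still match the protected set.
function normalizeRepoPath(p) {
  const parts = [];
  for (const seg of p.replace(/\\/g, '/').split('/')) {
    if (seg === '' || seg === '.') continue;
    if (seg === '..') parts.pop();
    else parts.push(seg);
  }
  return parts.join('/');
}

// Throws if the target is protected; otherwise returns the normalized path.
function assertWritable(path) {
  const normalized = normalizeRepoPath(path);
  if (PROTECTED_FILES.has(normalized)) {
    throw new Error(`Refusing to write protected file: ${normalized}`);
  }
  return normalized;
}
```

Calling assertWritable on the target path before each write means an accidental edit to a canonical file fails loudly instead of silently changing reference data.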
- Node.js 22+ (usfm-js ^3.4.3 for USFM parsing, Proskomma for word lookups)
- Claude Code CLI (Agent SDK for skill execution)