Commit 6769f51

feat(cli): add agent-backed xcstrings annotation

1 parent e397ad7 commit 6769f51

10 files changed: 2272 additions & 130 deletions

README.md

Lines changed: 25 additions & 6 deletions

````diff
@@ -27,6 +27,7 @@ With one toolchain, you can:
 - edit files in place across formats
 - normalize files to reduce noisy diffs
 - generate draft translations with AI-backed providers
+- generate translator-facing xcstrings comments from source usage

 ## What It Feels Like

@@ -47,15 +48,23 @@ langcodec translate \
   --target-lang fr,de,ja \
   --provider openai \
   --model gpt-4.1-mini
+
+# Generate xcstrings comments with source-aware AI annotation
+langcodec annotate \
+  --input Localizable.xcstrings \
+  --source-root Sources \
+  --source-root Modules \
+  --provider openai \
+  --model gpt-4.1-mini
 ```

 ## Highlights

 - Unified data model for singular and plural translations
 - Read/write support for Apple, Android, CSV, and TSV formats
-- CLI commands for convert, diff, merge, sync, edit, normalize, view, stats, debug, and translate
+- CLI commands for convert, diff, merge, sync, edit, normalize, view, stats, debug, translate, and annotate
 - `.xcstrings` and Android plural support
-- Config-driven translate workflows with `langcodec.toml`
+- Config-driven translate and annotate workflows with `langcodec.toml`
 - Rust library API for building your own tooling on top

 ## Installation
@@ -113,28 +122,38 @@ langcodec merge -i a.xcstrings -i b.xcstrings -o merged.xcstrings --strategy las
 langcodec sync --source source.xcstrings --target target.xcstrings --match-lang en
 ```

-### Translate with config
+### AI workflows with config

 Create a `langcodec.toml` in your project:

 ```toml
-[translate]
-source = "locales/Localizable.xcstrings"
+[ai]
 provider = "openai"
 model = "gpt-4.1-mini"
+
+[translate]
+source = "locales/Localizable.xcstrings"
 source_lang = "en"
 target_lang = "fr,de"
 status = ["new", "stale"]
 concurrency = 4
+
+[annotate]
+input = "locales/Localizable.xcstrings"
+source_roots = ["Sources", "Modules"]
+concurrency = 4
 ```

 Then run:

 ```sh
 langcodec translate
+langcodec annotate
 ```

-For larger projects, `translate.sources = [...]` can fan out parallel runs from config.
+`translate` still accepts legacy `translate.provider` and `translate.model` if you have older config files. For larger projects, `translate.sources = [...]` can fan out parallel runs from config.
+
+`annotate` also supports `annotate.inputs = [...]` for config-driven in-place runs across multiple xcstrings files.

 More CLI details live in [langcodec-cli/README.md](langcodec-cli/README.md).
````
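The `translate.sources` fan-out mentioned in this diff could be configured along these lines. This is a sketch, not text from the commit: the catalog paths are hypothetical, and it assumes `sources` takes the place of the single `source` key shown above while the remaining keys keep their meaning:

```toml
[ai]
provider = "openai"
model = "gpt-4.1-mini"

[translate]
# Hypothetical catalog paths; each entry fans out into its own parallel run.
sources = [
  "locales/App.xcstrings",
  "locales/Widgets.xcstrings",
]
source_lang = "en"
target_lang = "fr,de"
status = ["new", "stale"]
concurrency = 4
```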

langcodec-cli/README.md

Lines changed: 36 additions & 3 deletions

````diff
@@ -22,6 +22,7 @@ Supported inputs and outputs:
 - edit translations in place
 - merge or sync catalogs safely
 - draft translations with AI providers
+- generate translator-facing xcstrings comments from source usage

 Instead of treating localization as a pile of ad hoc file conversions, `langcodec` gives you one CLI that works across common formats and workflows.

@@ -39,6 +40,7 @@ The CLI should teach the detailed usage directly:
 langcodec --help
 langcodec convert --help
 langcodec translate --help
+langcodec annotate --help
 langcodec view --help
 ```

@@ -100,26 +102,56 @@ langcodec translate \
 - preflight validation before model requests
 - translation result summaries at the end

+### Generate xcstrings comments with AI
+
+```sh
+langcodec annotate \
+  --input Localizable.xcstrings \
+  --source-root Sources \
+  --source-root Modules \
+  --provider openai \
+  --model gpt-4.1-mini
+```
+
+`annotate` supports:
+
+- filling missing xcstrings comments
+- refreshing existing auto-generated comments
+- preserving manual comments
+- config defaults from `langcodec.toml`
+- source shortlisting before agent lookup
+- `--dry-run` and `--check` for CI-friendly runs
+
 ## Example Config

 ```toml
-[translate]
-source = "locales/Localizable.xcstrings"
+[ai]
 provider = "openai"
 model = "gpt-4.1-mini"
+
+[translate]
+source = "locales/Localizable.xcstrings"
 source_lang = "en"
 target_lang = "fr,de"
 status = ["new", "stale"]
 concurrency = 4
+
+[annotate]
+input = "locales/Localizable.xcstrings"
+source_roots = ["Sources", "Modules"]
+concurrency = 4
 ```

 Then run:

 ```sh
 langcodec translate
+langcodec annotate
 ```

-For larger repos, `translate.sources = [...]` can fan out parallel runs from config.
+Legacy configs using `translate.provider` and `translate.model` still work. For larger repos, `translate.sources = [...]` can fan out parallel runs from config.
+
+For annotate fan-out runs, use `annotate.inputs = [...]` and omit `annotate.output` so each catalog is updated in place.

 ## Main Commands

@@ -132,6 +164,7 @@ For larger repos, `translate.sources = [...]` can fan out parallel runs from con
 - `sync`: update existing target entries from a source file
 - `merge`: combine multiple inputs into one output
 - `translate`: draft translations with AI-backed providers
+- `annotate`: generate translator-facing xcstrings comments with AI-backed source lookup
 - `debug`: inspect parsed output as JSON

 ## When It Fits Best
````
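The annotate fan-out described in this diff could look like the following fragment. Again a sketch rather than text from the commit: the second catalog path is hypothetical, and the key names mirror the `[annotate]` table shown above, with `inputs` replacing `input`:

```toml
[ai]
provider = "openai"
model = "gpt-4.1-mini"

[annotate]
# No `output` key, so each catalog is rewritten in place.
inputs = [
  "locales/Localizable.xcstrings",
  "locales/Settings.xcstrings",
]
source_roots = ["Sources", "Modules"]
concurrency = 4
```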

langcodec-cli/src/ai.rs

Lines changed: 133 additions & 0 deletions (new file)

```rust
use std::sync::Arc;

use mentra::{BuiltinProvider, provider::{self, Provider}};

#[derive(Debug, Clone, PartialEq, Eq)]
pub(crate) enum ProviderKind {
    OpenAI,
    Anthropic,
    Gemini,
}

impl ProviderKind {
    pub(crate) fn parse(value: &str) -> Result<Self, String> {
        match value.trim().to_ascii_lowercase().as_str() {
            "openai" => Ok(Self::OpenAI),
            "anthropic" => Ok(Self::Anthropic),
            "gemini" => Ok(Self::Gemini),
            other => Err(format!(
                "Unsupported provider '{}'. Expected one of: openai, anthropic, gemini",
                other
            )),
        }
    }

    pub(crate) fn display_name(&self) -> &'static str {
        match self {
            Self::OpenAI => "openai",
            Self::Anthropic => "anthropic",
            Self::Gemini => "gemini",
        }
    }

    pub(crate) fn api_key_env(&self) -> &'static str {
        match self {
            Self::OpenAI => "OPENAI_API_KEY",
            Self::Anthropic => "ANTHROPIC_API_KEY",
            Self::Gemini => "GEMINI_API_KEY",
        }
    }

    pub(crate) fn builtin_provider(&self) -> BuiltinProvider {
        match self {
            Self::OpenAI => BuiltinProvider::OpenAI,
            Self::Anthropic => BuiltinProvider::Anthropic,
            Self::Gemini => BuiltinProvider::Gemini,
        }
    }
}

#[derive(Clone)]
pub(crate) struct ProviderSetup {
    pub(crate) provider_kind: ProviderKind,
    pub(crate) provider: Arc<dyn Provider>,
}

pub(crate) fn resolve_provider(
    cli: Option<&str>,
    shared_cfg: Option<&str>,
    legacy_cfg: Option<&str>,
) -> Result<ProviderKind, String> {
    if let Some(value) = cli {
        return ProviderKind::parse(value);
    }
    if let Some(value) = shared_cfg {
        return ProviderKind::parse(value);
    }
    if let Some(value) = legacy_cfg {
        return ProviderKind::parse(value);
    }

    let mut available = Vec::new();
    for kind in [
        ProviderKind::OpenAI,
        ProviderKind::Anthropic,
        ProviderKind::Gemini,
    ] {
        if std::env::var(kind.api_key_env()).is_ok() {
            available.push(kind);
        }
    }

    match available.len() {
        1 => Ok(available.remove(0)),
        0 => Err(
            "--provider is required (or set ai.provider in langcodec.toml, or use legacy translate.provider, or configure exactly one provider API key)"
                .to_string(),
        ),
        _ => Err(
            "Multiple provider API keys are configured; specify --provider or set ai.provider in langcodec.toml"
                .to_string(),
        ),
    }
}

pub(crate) fn resolve_model(
    cli: Option<&str>,
    shared_cfg: Option<&str>,
    legacy_cfg: Option<&str>,
) -> Result<String, String> {
    cli.map(ToOwned::to_owned)
        .or_else(|| shared_cfg.map(ToOwned::to_owned))
        .or_else(|| legacy_cfg.map(ToOwned::to_owned))
        .or_else(|| std::env::var("MENTRA_MODEL").ok())
        .ok_or_else(|| {
            "--model is required (or set ai.model in langcodec.toml, or use legacy translate.model, or set MENTRA_MODEL)"
                .to_string()
        })
}

pub(crate) fn read_api_key(kind: &ProviderKind) -> Result<String, String> {
    std::env::var(kind.api_key_env()).map_err(|_| {
        format!(
            "Missing {} environment variable for {} provider",
            kind.api_key_env(),
            kind.display_name()
        )
    })
}

pub(crate) fn build_provider(kind: &ProviderKind) -> Result<ProviderSetup, String> {
    let api_key = read_api_key(kind)?;

    let provider: Arc<dyn Provider> = match kind {
        ProviderKind::OpenAI => Arc::new(provider::openai::OpenAIProvider::new(api_key)),
        ProviderKind::Anthropic => Arc::new(provider::anthropic::AnthropicProvider::new(api_key)),
        ProviderKind::Gemini => Arc::new(provider::gemini::GeminiProvider::new(api_key)),
    };

    Ok(ProviderSetup {
        provider_kind: kind.clone(),
        provider,
    })
}
```
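The precedence used by `resolve_provider` and `resolve_model` above (CLI flag, then the shared `[ai]` config, then legacy per-command keys) is a chain of `Option` fallbacks. A minimal standalone sketch of the same pattern, using a hypothetical helper that is independent of the `mentra` types:

```rust
/// First-present-wins lookup, mirroring the precedence in `resolve_model`:
/// CLI flag, then the shared `[ai]` table, then legacy per-command keys.
fn first_of(cli: Option<&str>, shared: Option<&str>, legacy: Option<&str>) -> Option<String> {
    cli.map(ToOwned::to_owned)
        .or_else(|| shared.map(ToOwned::to_owned))
        .or_else(|| legacy.map(ToOwned::to_owned))
}

fn main() {
    // A CLI flag beats both config layers.
    assert_eq!(
        first_of(Some("gpt-4.1-mini"), Some("ai-table"), Some("legacy")).as_deref(),
        Some("gpt-4.1-mini")
    );
    // Without a CLI flag, the shared [ai] table beats legacy keys.
    assert_eq!(
        first_of(None, Some("ai-table"), Some("legacy")).as_deref(),
        Some("ai-table")
    );
    // Nothing set anywhere: the real resolvers turn this None into an error.
    assert_eq!(first_of(None, None, None), None);
}
```

Because `or_else` is lazy, each fallback closure only runs when every earlier source was absent, which is why the real `resolve_model` can safely end the chain with an environment-variable lookup.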
