Commit 6ca4f9d

fix: improve caption handling and refresh docs
1 parent 612249e commit 6ca4f9d

4 files changed

Lines changed: 200 additions & 20 deletions

README.md

Lines changed: 6 additions & 2 deletions
@@ -20,7 +20,8 @@ Web2Comics is a Chrome extension that:
 5. Turn on `Developer mode` (top-right).
 6. Click `Load unpacked`.
 7. Select the extracted `Web2Comics` folder (the folder containing `manifest.json`).
-8. (Optional) Pin the extension from Chrome’s extensions menu.
+8. Web2Comics will open the Options page on first install so you can configure providers.
+9. (Optional) Pin the extension from Chrome’s extensions menu (Chrome does not allow extensions to pin themselves automatically).

 ### Option B: Clone the repo (developer workflow)

@@ -115,7 +116,7 @@ For step-by-step key/token instructions, see:
 2. Click the Web2Comics extension icon.
 3. Click `Create Comic`.
 4. Choose provider/style (advanced settings optional).
-5. Click `Generate`.
+5. If no providers are configured yet, use `Configure Model Providers` in the popup (or the Options page opened on install), then return and click `Generate`.
 6. Watch live progress in popup/sidepanel.
 7. Review the comic in the side panel.
 8. Click `Download` to export a single comic sheet PNG.
@@ -171,6 +172,9 @@ HUGGINGFACE_INFERENCE_API_TOKEN=...
 - Provider not visible in popup:
   - Configure credentials in `Options -> Providers`
   - Click `Validate`
+- OpenAI key validates but model test fails:
+  - The key may belong to a different OpenAI project or have different model access/billing
+  - Re-paste the key and use `Test Text Model` / `Test Image Model`
 - Gemini free tier says quota/limit `0`:
   - Check AI Studio project eligibility/region and active limits
 - Generation fails on one provider:

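The README change in step 8 states that the Options page opens on first install. This commit does not show the background code for that, so the following is only a minimal sketch of how such behavior is commonly wired with `chrome.runtime.onInstalled` and `chrome.runtime.openOptionsPage`. The helper name `registerFirstRunOptions` and the injected `chromeApi` parameter are assumptions made for illustration; they are not taken from the Web2Comics source.

```javascript
// Hypothetical helper (not from the Web2Comics source): opens the Options page
// only on a fresh install, not on extension or browser updates.
function registerFirstRunOptions(chromeApi) {
  chromeApi.runtime.onInstalled.addListener((details) => {
    // details.reason is 'install' for a first install; other values include 'update'.
    if (details && details.reason === 'install') {
      chromeApi.runtime.openOptionsPage();
    }
  });
}

// In a real service worker this would be called as: registerFirstRunOptions(chrome);
```

Passing the API object in as a parameter keeps the sketch testable outside the browser; an actual extension would likely reference the global `chrome` directly.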
docs/user-manual.html

Lines changed: 16 additions & 13 deletions
@@ -180,7 +180,7 @@ <h1>Web2Comics User Manual</h1>

 <section>
 <h2 id="popup-overview">Popup / Generator Wizard Overview</h2>
-<p>The popup is the main entry point from the extension toolbar icon. It contains onboarding, the action launcher, the comic generator wizard, progress view, and a lightweight history modal.</p>
+<p>The popup is the main entry point from the extension toolbar icon. It contains the action launcher, the comic generator wizard, progress view, and a lightweight history modal.</p>
 <div class="grid-2">
 <div class="card">
 <h3>Launcher</h3>
@@ -217,8 +217,8 @@ <h3>Step 2: Configure Comic</h3>
 <tr><td>Advanced Options</td><td>Shows detail level, style presets, and custom style creation/selection.</td></tr>
 <tr><td>Detail</td><td>Controls storyboard richness (low/medium/high).</td></tr>
 <tr><td>Style</td><td>Preset art direction or a custom saved style.</td></tr>
-<tr><td>Custom Style (One-off)</td><td>Enter a temporary style name and description for the current generation.</td></tr>
-<tr><td>Create New Style…</td><td>Saves a reusable custom style to extension storage for future sessions.</td></tr>
+<tr><td>Custom (One-off)</td><td>Uses current custom-style defaults without creating a saved style entry.</td></tr>
+<tr><td>Create New Style…</td><td>Opens a modal to enter style name + description, saves it, and adds it to the style list for future sessions.</td></tr>
 </table>

 <h3>Readiness and Generate</h3>
@@ -231,7 +231,7 @@ <h3>Readiness and Generate</h3>

 <section>
 <h2 id="popup-progress">Popup Progress View</h2>
-<p>After generation starts, Web2Comics can automatically open the side panel while the popup shows progress.</p>
+<p>After generation starts, Web2Comics can attempt to open the side panel while the popup shows progress (Chrome may block auto-open unless it is triggered by a direct user gesture).</p>
 <ul>
 <li><strong>Status text</strong>: current phase (storyboard or image rendering).</li>
 <li><strong>Progress bar</strong>: panel-level completion progress.</li>
@@ -246,6 +246,7 @@ <h2 id="popup-history">Popup History</h2>
 <p>The popup history modal is a quick-access list of previously generated comics.</p>
 <ul>
 <li>Shows recent items with thumbnails and source metadata.</li>
+<li>Each item has an individual <strong>Delete</strong> action (with confirmation).</li>
 <li>Useful when you want to reopen a comic quickly without opening the side panel browser.</li>
 <li>The full history browsing experience is available in the side panel <em>History Browser</em>.</li>
 </ul>
@@ -265,15 +266,16 @@ <h3>Generation Defaults</h3>
 <ul>
 <li><strong>Default Panel Count</strong>: starting value used in popup.</li>
 <li><strong>Detail Level</strong>: default storyboard detail level.</li>
-<li><strong>Default Style</strong>: default art style preset for new generations.</li>
-<li><strong>Custom Style Description</strong>: default custom style description when <code>Custom</code> is selected.</li>
+<li><strong>Default Style</strong>: default art style preset or saved custom style for new generations.</li>
+<li><strong>Custom Style Name / Description</strong>: shown when creating a new reusable custom style entry from Options.</li>
 <li><strong>Caption Length</strong>: preferred caption length for generated panels.</li>
 </ul>
+<p>If you choose <code>Custom...</code> in <strong>Default Style</strong>, the inline custom-style editor appears with a <strong>Create Style</strong> button. After creating a style, it is selected in the list and the inline editor is hidden again.</p>
 <p><strong>Free-tier-first default setup:</strong> Web2Comics starts with <strong>Google Gemini</strong> selected for both text and image generation (when eligible/configured), plus a low-cost default generation profile (fewer panels and lower detail) to maximize the chance that a first run succeeds on free-tier limits.</p>

 <h3>Behavior</h3>
 <ul>
-<li><strong>Automatically open comic viewer after generation</strong>: opens side panel on completion (and optionally progress start behavior may also open it).</li>
+<li><strong>Automatically open comic viewer after generation</strong>: attempts to open the side panel on completion (Chrome may block auto-open without a user gesture).</li>
 <li><strong>Enable character consistency mode</strong>: provider hint to keep characters visually consistent across panels.</li>
 <li><strong>Debug flag</strong>: shows detailed provider/panel errors and enables richer local debugging behavior.</li>
 </ul>
@@ -306,7 +308,7 @@ <h3>Provider Cards</h3>

 <h3>Validation</h3>
 <ul>
-<li>Use each provider card’s <strong>Validate</strong> button after entering credentials.</li>
+<li>Use each provider card’s <strong>Validate</strong> button after entering credentials (buttons show a spinner while waiting).</li>
 <li>Web2Comics persists provider validation state in local storage and uses it to gate popup provider visibility/readiness.</li>
 <li>If a selected provider fails due to quota/budget/billing limits during generation, Web2Comics can automatically fall back to other configured providers (free-tier-first) for text and/or image generation.</li>
 <li>Google Gemini is the default first-run provider because it can cover both text and image generation with one key, but free-tier availability depends on Google account/project eligibility and region.</li>
@@ -324,10 +326,11 @@ <h3>OpenAI Model / Speed Controls</h3>

 <section>
 <h2 id="options-prompts">Options: Prompts</h2>
-<p>Prompt templates let you customize storyboard and image prompts for supported providers.</p>
+<p>Prompt templates let you customize storyboard and image prompts per provider scope.</p>
 <h3>Current support</h3>
 <ul>
-<li>Provider scopes: <strong>OpenAI</strong> and <strong>Google Gemini</strong>.</li>
+<li>Provider scopes in the UI: <strong>OpenAI</strong>, <strong>Google Gemini</strong>, <strong>Cloudflare Workers AI</strong>, <strong>OpenRouter</strong>, and <strong>Hugging Face</strong>.</li>
+<li>Runtime template consumption (current build): <strong>OpenAI</strong> and <strong>Google Gemini</strong>.</li>
 <li>Template types: <strong>Storyboard</strong> and <strong>Image</strong>.</li>
 </ul>
 <h3>Validation behavior</h3>
<h3>Validation behavior</h3>
@@ -391,10 +394,10 @@ <h3>Display modes</h3>
391394
<li><strong>Panel View</strong>: grid/list view of all panels.</li>
392395
</ul>
393396
<h3>Layout Presets</h3>
394-
<p>Phase-1 layout presets visually restyle existing strip/grid/carousel engines (for example <em>Classic Strip</em>, <em>Cinema Carousel</em>, <em>Contact Sheet</em>).</p>
397+
<p>Web2Comics includes a broad layout preset library (for example <em>Single panel</em>, <em>4-panel grid</em>, <em>Classic comic page</em>, <em>Manga page</em>, <em>Webtoon scroll</em>, <em>Masonry</em>, <em>Guided path</em>, and <em>Carousel</em>). In the current implementation, presets are fully functional layout/view variants built on the side panel render engines and preset-specific styling rules.</p>
395398
<h3>Download Export</h3>
396399
<ul>
397-
<li>Exports a single composite PNG contact sheet image.</li>
400+
<li>Exports a single composite PNG image using the currently selected layout preset.</li>
398401
<li>Includes source title, source URL, source short-name (for example <code>cnn</code>), panel thumbnails, and captions.</li>
399402
</ul>
400403
<h3>Policy handling indicators</h3>
@@ -411,7 +414,7 @@ <h2 id="sidepanel-generation-view">Side Panel: Generation View</h2>
411414
<li>Shows a comic-like placeholder shell during generation.</li>
412415
<li>Uses live panel statuses: <code>Pending</code>, <code>Sent</code>, <code>Receiving</code>, <code>Rendering</code>, <code>Completed</code>, <code>Error</code>.</li>
413416
<li>Matches the currently selected view mode/layout preset for continuity.</li>
414-
<li><strong>Cancel</strong> stops the active job.</li>
417+
<li><strong>Cancel</strong> stops the active job and shows a <code>Canceling...</code> state while cancellation is being processed.</li>
415418
</ul>
416419
</section>
417420

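The manual changes above note that Chrome may block opening the side panel unless the call comes from a direct user gesture. As a rough illustration, `chrome.sidePanel.open()` returns a promise that rejects in that situation, so a caller can treat the open as best-effort. The helper name `tryOpenSidePanel` and the injected `chromeApi` parameter are assumptions for this sketch and do not come from the Web2Comics source.

```javascript
// Hypothetical helper: attempts to open the side panel and reports whether it worked.
// chrome.sidePanel.open() rejects when it is not called in response to a user action,
// so the caller can fall back to showing a button the user clicks instead.
async function tryOpenSidePanel(chromeApi, windowId) {
  try {
    await chromeApi.sidePanel.open({ windowId });
    return true;
  } catch (_) {
    // Blocked or unavailable: signal failure rather than throwing.
    return false;
  }
}
```

Returning a boolean instead of rethrowing lets the popup keep showing progress either way, which matches the "attempts to open" wording in the updated manual.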
sidepanel/sidepanel.js

Lines changed: 94 additions & 5 deletions
@@ -116,7 +116,65 @@ class ComicViewer {
     return labels[String(providerId || '')] || String(providerId || 'provider');
   }

-  getPanelCaptionText(panel, index) {
+  isCaptionPlaceholderPanel(panel) {
+    if (!panel) return true;
+    if (typeof panel !== 'object') return false;
+    const keys = Object.keys(panel);
+    if (!keys.length) return true;
+    // Placeholder panel objects can contain runtime status only while the storyboard fills in.
+    if (keys.length === 1 && keys[0] === 'runtime_status') return true;
+    return false;
+  }
+
+  looksLikeImagePromptText(text) {
+    const s = String(text || '').trim();
+    if (!s) return false;
+    const lower = s.toLowerCase();
+    if (s.length > 220) return true;
+    const promptPhrases = [
+      'comic panel illustration',
+      'illustration of',
+      'digital art',
+      'cinematic lighting',
+      'highly detailed',
+      'camera angle',
+      'art style',
+      'dramatic lighting',
+      'ultra detailed'
+    ];
+    if (promptPhrases.some((p) => lower.includes(p))) return true;
+    const commaCount = (s.match(/,/g) || []).length;
+    if (commaCount >= 6) return true;
+    return false;
+  }
+
+  getStoryLikeFallbackCaption(panel) {
+    if (!panel || typeof panel !== 'object') return '';
+    const candidates = [
+      panel.beat_summary,
+      panel.summary,
+      panel.beat,
+      panel.narration,
+      panel.description,
+      panel.title,
+      panel.text,
+      panel.text_content,
+      panel.caption_text,
+      panel.dialogue
+    ];
+    for (const candidate of candidates) {
+      const normalized = this.normalizeCaptionValue(candidate);
+      if (normalized && !this.looksLikeImagePromptText(normalized)) return normalized;
+    }
+    return '';
+  }
+
+  getPanelCaptionText(panel, index, options = {}) {
+    const suppressMissingLog = !!options.suppressMissingLog;
+    const fallbackLabel = options.fallbackLabel || ('Panel ' + ((Number(index) || 0) + 1));
+    if (this.isCaptionPlaceholderPanel(panel)) {
+      return fallbackLabel;
+    }
     var caption =
       (panel && (
         panel.caption ||
@@ -131,9 +189,21 @@ class ComicViewer {
         panel.dialogue
       )) || '';
     caption = this.normalizeCaptionValue(caption);
-    if (caption) return caption;
-    this.logMissingCaption(panel, index);
-    return 'Panel ' + ((Number(index) || 0) + 1);
+    if (caption) {
+      const imagePrompt = this.normalizeCaptionValue(
+        panel && (panel.image_prompt || panel.prompt || panel.imagePrompt || panel.visual_prompt || panel.scene_prompt)
+      );
+      if (this.looksLikeImagePromptText(caption) || (imagePrompt && caption === imagePrompt)) {
+        const storyLike = this.getStoryLikeFallbackCaption(panel);
+        if (storyLike && storyLike !== caption) {
+          this.logPromptLikeCaptionSubstitution(panel, index, caption, storyLike);
+          return storyLike;
+        }
+      }
+      return caption;
+    }
+    if (!suppressMissingLog) this.logMissingCaption(panel, index);
+    return fallbackLabel;
   }

   normalizeCaptionValue(value) {
@@ -185,6 +255,22 @@ class ComicViewer {
     } catch (_) {}
   }

+  logPromptLikeCaptionSubstitution(panel, index, originalCaption, replacementCaption) {
+    try {
+      const sourceUrl = String(this.currentComic?.source?.url || '');
+      const panelId = String(panel?.panel_id || index || 0);
+      const key = `${sourceUrl}|${panelId}|prompt-like`;
+      if (this.missingCaptionNoticeKeys.has(key)) return;
+      this.missingCaptionNoticeKeys.add(key);
+      void this.appendDebugLog('caption.prompt_like_substituted', {
+        panelIndex: Number(index) || 0,
+        panelId: panel?.panel_id || null,
+        originalPreview: String(originalCaption || '').slice(0, 200),
+        replacementPreview: String(replacementCaption || '').slice(0, 200)
+      });
+    } catch (_) {}
+  }
+
   async loadPrefs() {
     const { sidepanelPrefs } = await chrome.storage.local.get('sidepanelPrefs');
     const rawPreset = sidepanelPrefs?.layoutPreset;
@@ -788,7 +874,10 @@ class ComicViewer {
         : '';
       const isCurrent = displayStatus === 'sent' || displayStatus === 'receiving' || displayStatus === 'rendering';

-      const safeCaption = this.escapeHtml(this.getPanelCaptionText(panel, index));
+      const safeCaption = this.escapeHtml(this.getPanelCaptionText(panel, index, {
+        suppressMissingLog: true,
+        fallbackLabel: 'Panel ' + (index + 1)
+      }));
       return `
         <div class="gen-panel ${isCurrent ? 'is-current' : ''}">
           <div class="gen-panel-thumb">

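The caption changes in `sidepanel/sidepanel.js` combine two ideas: a heuristic that flags captions that read like image prompts, and a fallback that prefers a story-like field when the caption is flagged. The standalone sketch below condenses both so the flow can be run outside the class. The function names `looksLikeImagePrompt` and `pickCaption` are hypothetical, the phrase and field lists are abridged, and the debug logging and `normalizeCaptionValue` step from the commit are left out.

```javascript
// Abridged phrase list; the commit's looksLikeImagePromptText checks more phrases.
const PROMPT_PHRASES = ['comic panel illustration', 'digital art', 'cinematic lighting', 'highly detailed'];

function looksLikeImagePrompt(text) {
  const s = String(text || '').trim();
  if (!s) return false;
  if (s.length > 220) return true; // very long captions are treated as prompts
  const lower = s.toLowerCase();
  if (PROMPT_PHRASES.some((p) => lower.includes(p))) return true;
  return (s.match(/,/g) || []).length >= 6; // comma-heavy tag lists
}

// Simplified stand-in for getPanelCaptionText: prefer a story-like field when the
// caption reads like an image prompt; otherwise keep the caption or use a label.
function pickCaption(panel, index) {
  const label = 'Panel ' + (index + 1);
  if (!panel) return label;
  const caption = String(panel.caption || '').trim();
  if (!caption) return label;
  if (looksLikeImagePrompt(caption) || caption === panel.image_prompt) {
    const story = [panel.beat_summary, panel.summary, panel.narration]
      .map((v) => String(v || '').trim())
      .find((v) => v && !looksLikeImagePrompt(v));
    if (story) return story;
  }
  return caption;
}

console.log(pickCaption({ caption: 'A newsroom, cinematic lighting, digital art', beat_summary: 'Newsroom reacts.' }, 0));
// → Newsroom reacts.
console.log(pickCaption({ caption: 'The vote passes.' }, 1));
// → The vote passes.
console.log(pickCaption(undefined, 5));
// → Panel 6
```

If no story-like field is available, the prompt-like caption is still returned unchanged, mirroring the commit's behavior of substituting only when a better candidate exists.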
tests/integration/sidepanel-page.test.js

Lines changed: 84 additions & 0 deletions
@@ -878,6 +878,46 @@ describe('Sidepanel Page UX', () => {
     expect(String(carouselCaption.textContent)).not.toContain('[object Object]');
   });

+  it('prefers story-like caption text when caption looks like an image prompt', async () => {
+    const item = makeHistoryItem(1);
+    item.storyboard.panels = [
+      {
+        panel_id: 'panel_prompty',
+        caption: 'Comic panel illustration of: A dramatic newsroom scene, cinematic lighting, digital art, highly detailed, camera angle from above, ultra detailed editorial style',
+        beat_summary: 'Newsroom reacts to a major breaking update.',
+        image_prompt: 'Comic panel illustration of: A dramatic newsroom scene, cinematic lighting, digital art, highly detailed, camera angle from above, ultra detailed editorial style',
+        artifacts: {
+          image_blob_ref:
+            'data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mP8/x8AAwMB/axlF8UAAAAASUVORK5CYII='
+        }
+      }
+    ];
+    const setCalls = [];
+    chrome.storage.local.set.mockImplementation(async (payload) => {
+      setCalls.push(payload);
+    });
+    chrome.storage.local.get.mockImplementation(async (key) => {
+      if (key === 'currentJob') return { currentJob: { status: 'completed', storyboard: item.storyboard } };
+      if (key === 'history') return { history: [item] };
+      if (key === 'sidepanelPrefs') return { sidepanelPrefs: {} };
+      if (key === 'debugLogs') return { debugLogs: [] };
+      return {};
+    });
+
+    await import('../../sidepanel/sidepanel.js');
+    document.dispatchEvent(new Event('DOMContentLoaded'));
+    await flush();
+    await flush();
+
+    const panelCaption = document.querySelector('.comic-strip .panel-caption');
+    expect(String(panelCaption.textContent)).toContain('Newsroom reacts to a major breaking update.');
+    expect(String(panelCaption.textContent)).not.toContain('cinematic lighting');
+
+    const debugSet = setCalls.find((p) => Array.isArray(p.debugLogs));
+    expect(debugSet).toBeTruthy();
+    expect(debugSet.debugLogs.some((e) => e.event === 'caption.prompt_like_substituted')).toBe(true);
+  });
+
   it('logs caption.missing when a panel has no usable caption fields', async () => {
     const item = makeHistoryItem(1);
     item.storyboard.panels = [
@@ -913,6 +953,50 @@ describe('Sidepanel Page UX', () => {
     expect(last.data.panelId).toBe('panel_x');
   });

+  it('does not log caption.missing for expected generation placeholders', async () => {
+    const setCalls = [];
+    chrome.storage.local.set.mockImplementation(async (payload) => {
+      setCalls.push(payload);
+    });
+    chrome.storage.local.get.mockImplementation(async (key) => {
+      if (key === 'currentJob') {
+        return {
+          currentJob: {
+            status: 'generating_images',
+            settings: { panel_count: 6 },
+            storyboard: {
+              source: { url: 'https://example.com/article', title: 'Example' },
+              panels: Array.from({ length: 5 }, (_, i) => ({
+                caption: 'Panel caption ' + (i + 1),
+                runtime_status: i < 2 ? 'completed' : 'rendering',
+                artifacts: i < 2 ? {
+                  image_blob_ref:
+                    'data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mP8/x8AAwMB/axlF8UAAAAASUVORK5CYII='
+                } : {}
+              }))
+            }
+          }
+        };
+      }
+      if (key === 'history') return { history: [] };
+      if (key === 'sidepanelPrefs') return { sidepanelPrefs: {} };
+      if (key === 'debugLogs') return { debugLogs: [] };
+      return {};
+    });
+
+    await import('../../sidepanel/sidepanel.js');
+    document.dispatchEvent(new Event('DOMContentLoaded'));
+    await flush();
+    await flush();
+
+    const captions = Array.from(document.querySelectorAll('#gen-panels .gen-panel-caption')).map((el) => String(el.textContent || ''));
+    expect(captions).toContain('Panel 6');
+
+    const debugSets = setCalls.filter((p) => Array.isArray(p.debugLogs));
+    const captionMissingLogs = debugSets.flatMap((p) => p.debugLogs).filter((e) => e && e.event === 'caption.missing');
+    expect(captionMissingLogs.length).toBe(0);
+  });
+
   it('escapes history/comic text and sanitizes unsafe source links', async () => {
     const malicious = makeHistoryItem(1);
     malicious.id = 'x"><svg onload=alert(1)>';

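The second test above sets `panel_count: 6` but supplies only five storyboard panels, so the sixth slot has no panel object yet and should render as `Panel 6` without a `caption.missing` log entry. The sketch below models just that distinction between an expected placeholder and a real panel lacking a caption. The names `isPlaceholderPanel` and `captionsForSlots` are hypothetical, and the commit additionally passes `suppressMissingLog: true` from the generation view, which this sketch does not model.

```javascript
// Mirrors the shape of isCaptionPlaceholderPanel from the commit: an absent panel,
// an empty object, or an object holding only runtime_status counts as a placeholder.
function isPlaceholderPanel(panel) {
  if (!panel) return true;
  if (typeof panel !== 'object') return false;
  const keys = Object.keys(panel);
  return keys.length === 0 || (keys.length === 1 && keys[0] === 'runtime_status');
}

// Hypothetical helper: builds a caption for every expected slot and records the
// indexes that would be reported as genuinely missing a caption.
function captionsForSlots(panels, panelCount) {
  const missing = [];
  const captions = Array.from({ length: panelCount }, (_, i) => {
    const panel = panels[i];
    if (isPlaceholderPanel(panel)) return 'Panel ' + (i + 1); // expected gap, not logged
    if (panel.caption) return String(panel.caption);
    missing.push(i); // real panel content without a caption
    return 'Panel ' + (i + 1);
  });
  return { captions, missing };
}
```

Called with five captioned panels and a count of six, `captionsForSlots` yields `Panel 6` for the last slot and an empty `missing` list, which is the outcome the test asserts against the real viewer.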