fix: normalize language aliases in ShikiProvider and highlight hook#522
Conversation
- Add LANGUAGE_ALIASES map for common aliases (js, ts, sh, yml, py, md, gql, etc.) - Add normalizeLanguage() function exported from provider - Normalize languages in ShikiProvider before preloading grammars - Normalize language in highlight() hook before checking support - Update highlight() type signature to accept string (any language identifier) - Intentionally omit mdx alias since MDX has distinct grammar from markdown Fixes language alias resolution so 'js', 'ts', etc. work without warnings.
|
/bonk please give this a thorough review! |
commit: |
|
Thanks for this well-structured PR! The language alias normalization is a solid UX improvement. The implementation is clean, well-documented, and the test coverage is comprehensive. However, I found one important bug that needs to be fixed before merging: Bug: Context stores raw languages instead of normalized onesIn Example: <ShikiProvider languages={["js", "ts"]}>
<CodeHighlighted lang="js" code="const x = 1;" />
</ShikiProvider>
Fix: Store the normalized Minor Suggestions
What I liked
Overall this is great work—just need to fix that context language storage issue! |
|
Suggested fix for the context language storage issue: The issue is that // In provider.tsx, add to state:
const [state, setState] = useState<{
highlighter: ShikiContextValue["highlighter"];
isLoading: boolean;
error: Error | null;
languages: SupportedLanguage[]; // Add this
}>({
highlighter: null,
isLoading: true,
error: null,
languages: [], // Add this
});
// In initializeShiki success:
if (!cancelled) {
setState({
highlighter,
isLoading: false,
error: null,
languages: validLanguages, // Store normalized languages
});
}
// In context value:
const contextValue = useMemo<ShikiContextValue>(
() => ({
highlighter: state.highlighter,
isLoading: state.isLoading,
error: state.error,
languages: state.languages, // Use normalized languages from state
labels: mergedLabels,
}),
[state.highlighter, state.isLoading, state.error, state.languages, mergedLabels],
);This ensures the context always contains canonical language names that match what the highlighter was actually loaded with. |
|
I've completed my review of PR #522. Here's a summary: Review SummaryOverall: This is a well-structured PR that adds valuable functionality. The implementation is clean, well-documented, and has comprehensive test coverage. Key Issue Found 🔴Bug: Context stores raw languages instead of normalized ones In <ShikiProvider languages={["js", "ts"]}>
<CodeHighlighted lang="js" code="const x = 1;" />
</ShikiProvider>The provider correctly normalizes for loading grammars, but the context retains Fix: Store the normalized Positive Feedback
Once the context language storage issue is fixed, this PR should be good to merge! |
Docs PreviewCommit: |
- Fix bug where context stored raw language aliases instead of normalized canonical names, causing highlight() to fail when ShikiProvider received aliases like ['js', 'ts'] - Move LANGUAGE_ALIASES to types.ts as const for single source of truth - Add LanguageAlias and LanguageInput types derived from LANGUAGE_ALIASES - Widen ShikiProviderProps.languages to accept LanguageInput[] (aliases + canonical) - Widen CodeHighlightedProps.lang and highlight() to accept LanguageInput - Add integration test covering alias-to-context round-tripping - Export LANGUAGE_ALIASES, LanguageAlias, LanguageInput from @cloudflare/kumo/code
|
/bonk re-review. I've addressed your concerns! |
|
@mattrothenberg Bonk workflow was cancelled. View workflow run · To retry, trigger Bonk again. |















Fixes CFSA-633
Summary
Fixes language alias resolution so common aliases like
js,ts,sh,yml,py,md,gqlwork without warnings in ShikiProvider and the highlight hook.Problem
When using
ShikiProviderwith languages like['js', 'ts']or whenCodeHighlightedreceiveslang="js"from markdown code fences, users see warnings:This happens because Shiki's grammar names are canonical (e.g.,
javascript,typescript) but markdown code fences commonly use short aliases (js,ts,sh, etc.).Solution
LANGUAGE_ALIASESmap with common aliases → canonical namesnormalizeLanguage()function that:nullif input is unknownShikiProviderbefore preloading grammarshighlight()hook before checking supportnormalizeLanguagefor consumers who need itSupported Aliases
Intentionally Omitted
mdx→markdown: MDX has a distinct grammar (Markdown + JSX) that would lose JSX highlighting if mapped to plain markdown.Testing
After this change:
<ShikiProvider languages={['js', 'ts']}>works the same aslanguages={['javascript', 'typescript']}<CodeHighlighted lang="js" >highlights correctly without warningsRelated