jlorenzetti/text-wrap-minor-words

text-wrap-minor-words

Experimental, CSS-first polyfill that augments text-wrap: pretty by biasing against line breaks immediately after minor words (articles, prepositions, short conjunctions) in languages where this is a widely accepted typesetting convention. It also applies a couple of safe, language-agnostic joins (e.g., Fig. 2, 20 °C).

Status: experimental. See explainer.md. Live demo: https://jlorenzetti.github.io/text-wrap-minor-words/

Motivation (lean)

text-wrap: pretty improves paragraph breaking but does not let authors express locale-aware preferences about breaking immediately after minor words. Many European languages treat this as a common editorial convention even in body text. This library offers a CSS-first polyfill so authors can experiment today and help inform standardization.

Install

npm i text-wrap-minor-words

Usage (ESM)

import { init } from 'text-wrap-minor-words';

// Process elements that compute to text-wrap: pretty
const ctrl = init({ observe: true });

// Optionally process a specific subtree later:
// ctrl.process(element);

HTML markup should declare the language (lang) on blocks:

<main class="typo">
  <p lang="it">Vado a casa con la bici.</p>
  <p lang="fr">Je vais à Paris.</p>
  <p lang="pl">Jestem w domu i czekam.</p>
  <p lang="en">See Fig. 2 for details.</p>
  <p lang="en">It was 20 °C at 9:30 am.</p>
  <h2 lang="en">A display heading if you want to opt-in later</h2>
  <!-- The library acts only where text-wrap: pretty is in effect -->
  <!-- NBSP is inserted where appropriate; content remains otherwise intact. -->
  <!-- Apostrophes/elision are intentionally out of scope for now. -->
</main>

Usage (Browser global)

<script src="https://cdn.jsdelivr.net/npm/text-wrap-minor-words@0.3.1/dist/index.global.js"></script>
<script>
  const ctrl = TextWrapMinorWords.init({ observe: true });
  // ctrl.process(document.querySelector('.typo'));
  // ctrl.disconnect();
</script>

What it does

  • Extends text-wrap: pretty behavior by inserting a no-break space (NBSP, U+00A0) after minor words in languages where this is customary (Romance, Slavic, Greek by default).
  • Applies safe joins regardless of language:
    • label + number: Fig. 2, p. 12, § 5
    • number + unit: 20 °C, 9:30 am
    • honorific + Name: Mr. Smith, Dr. Müller
    • initials sequence: J. K. Rowling
    • numeric ranges: adds WORD JOINER (U+2060) on both sides of the dash
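
To make the safe joins above concrete, here is a minimal sketch of three of them (label + number, number + unit, numeric ranges) as plain string replacements. It is not the library's implementation: the function name applySafeJoins, the short LABELS list, and the regular expressions are assumptions made for illustration, and the real lexicon is documented in docs/LEXICON.md.

```javascript
// Hypothetical sketch; not the library's actual rules or API.
const NBSP = '\u00A0'; // NO-BREAK SPACE
const WJ = '\u2060';   // WORD JOINER

// Illustrative label list only (assumption).
const LABELS = ['Fig.', 'p.', '§'];

// Escape characters that are special inside a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function applySafeJoins(text) {
  const labelAlt = LABELS.map(escapeRegExp).join('|');
  return text
    // label + number: "Fig. 2" becomes "Fig.<NBSP>2"
    .replace(new RegExp(`(^|\\s)(${labelAlt}) (?=\\d)`, 'g'), `$1$2${NBSP}`)
    // number + unit (degrees only in this sketch): "20 °C" becomes "20<NBSP>°C"
    .replace(/(\d) (?=°[CF]\b)/g, `$1${NBSP}`)
    // numeric range with an en dash: "10–12" becomes "10<WJ>–<WJ>12"
    .replace(/(\d)–(?=\d)/g, `$1${WJ}–${WJ}`);
}
```

Because each pattern requires an ordinary space (or a digit directly before the dash), running the function on already-processed text leaves it unchanged, which mirrors the idempotence the library describes.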

Lite usage (single‑locale, recommended for production)

Load only the core engine and register the locales you actually use. This keeps bundles small and avoids shipping unnecessary language data.

// Load the lite entry (no locales included)
import { init, registerLanguage } from 'text-wrap-minor-words/lite';

// Register only the locales you need (example: Italian)
import it from 'text-wrap-minor-words/locales/it.json';
registerLanguage('it', it);

// Optionally preload the same tags here (helps the engine avoid a first lookup)
init({ languages: ['it'] });

Browser (lite, ES module):

<script type="module">
  import { init, registerLanguage } from 'https://cdn.jsdelivr.net/npm/text-wrap-minor-words@0.3.1/dist/lite.mjs';
  import it from 'https://cdn.jsdelivr.net/npm/text-wrap-minor-words@0.3.1/locales/it.json' with { type: 'json' };

  registerLanguage('it', it);
  init({ languages: ['it'] });
</script>

Notes:

  • The default (non‑lite) entry includes built‑in locale data for quick trials. Prefer the lite entry in production apps.
  • You can register multiple locales by calling registerLanguage(tag, data) more than once.

Advanced (CSS opt‑in per container):

You can enable the minor‑words preference declaratively on specific elements via CSS. This is useful for neutral languages (safe joins only by default) where you want minor‑words glue only in display contexts such as headings.

h1[lang="en"], h2[lang="en"] {
  --text-wrap-preferences: minor-words;
  --text-wrap-minor-threshold: 1; /* glue after 1‑letter tokens */
  --text-wrap-minor-stoplist: "of to in on at for by a I"; /* optional additions */
}

These custom properties are read when the preference is opted in on the element (or an ancestor) and the current language has no built‑in minor‑words configuration.

Language defaults

  • Active by default: be, bg, ca, cs, el, es, fr, gl, hr, it, mk, pl, pt, ro, ru, sk, sl, sr, uk.
  • Neutral by default: da, de, en, lt, lv, nb, nl, nn, sv (only safe joins; no minor-words glue in body text).
  • The effective language is taken from lang (with fallback to the document root).
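
The defaults above can be pictured as a lookup on the primary language subtag. The sketch below is an assumption-laden illustration, not the library's code: classifyLanguage and primarySubtag are hypothetical names, matching on the primary subtag is assumed from the languages option, and the 'unlisted' result for languages in neither list is a placeholder because the README does not say how they are treated. In a page, the element language would typically come from the nearest ancestor with a lang attribute, falling back to the document root.

```javascript
// Hypothetical sketch; the two sets copy the defaults listed above.
const ACTIVE = new Set([
  'be', 'bg', 'ca', 'cs', 'el', 'es', 'fr', 'gl', 'hr', 'it',
  'mk', 'pl', 'pt', 'ro', 'ru', 'sk', 'sl', 'sr', 'uk',
]);
const NEUTRAL = new Set(['da', 'de', 'en', 'lt', 'lv', 'nb', 'nl', 'nn', 'sv']);

// "pt-BR" -> "pt"; empty or missing tags yield an empty string.
function primarySubtag(tag) {
  return String(tag || '').trim().toLowerCase().split(/[-_]/)[0];
}

// elementLang: lang in effect on the element; rootLang: lang of the document root.
function classifyLanguage(elementLang, rootLang) {
  const primary = primarySubtag(elementLang) || primarySubtag(rootLang);
  if (ACTIVE.has(primary)) return 'active';   // minor-words glue + safe joins
  if (NEUTRAL.has(primary)) return 'neutral'; // safe joins only
  return 'unlisted';                          // behavior not specified here (assumption)
}
```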

API

type InitOptions = {
  selector?: string;      // default: 'html' (scans under elements that compute to text-wrap: pretty)
  languages?: string[];   // pre-load specific BCP-47 primary subtags (e.g., ['it','en'])
  observe?: boolean;      // MutationObserver to process added/edited content
  context?: 'all'|'display'; // if 'display', only process headings/DT
};

Returns a controller { process(root?), disconnect() }.

Configuration

  • The library reads lang to select language defaults.
  • Neutral languages (e.g., en, de, nl) do not enable minor-words glue by default; only safe joins apply.
  • For display-only processing, pass { context: 'display' }.
  • You can pre-load language data via languages: ['it','fr'] to avoid first-use compile cost.

CSS preference gate and overrides:

  • You can opt in/out declaratively per container with --text-wrap-preferences: minor-words | none. On browsers without text-wrap: pretty, authors can set the preference under @supports not (text-wrap: pretty).
  • When the preference is active and the current language has no built‑in minorWords, the engine reads optional overrides:
    • --text-wrap-minor-threshold: <number> (glue after tokens up to N chars; typical value: 1)
    • --text-wrap-minor-stoplist: "space-separated tokens" (per‑container additions)
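
As a rough illustration of how a threshold and a stoplist could combine, the sketch below glues a word to its successor when the word has at most threshold characters or appears in the stoplist. The names parseStoplist and glueMinorWords, the case-insensitive matching, and the regular expression are assumptions for this example and do not describe the engine's actual code.

```javascript
// Hypothetical sketch; not the library's implementation.
const NBSP = '\u00A0'; // NO-BREAK SPACE

// Turn a custom-property value such as "\"of to in\"" into a lowercase Set.
function parseStoplist(value) {
  return new Set(
    value
      .trim()
      .replace(/^["']|["']$/g, '') // drop optional surrounding quotes
      .split(/\s+/)
      .filter(Boolean)
      .map((token) => token.toLowerCase())
  );
}

function glueMinorWords(text, { threshold = 0, stoplist = new Set() } = {}) {
  // A whole word, one ordinary space, then the start of another word.
  const pattern = /(?<![\p{L}\p{N}])(\p{L}+) (?=\p{L})/gu;
  return text.replace(pattern, (match, word) => {
    const isMinor =
      [...word].length <= threshold || stoplist.has(word.toLowerCase());
    // Replace the space with NBSP only after minor words.
    return isMinor ? word + NBSP : match;
  });
}
```

With threshold 1 and the stoplist from the heading example, "A Tale of Two Cities" would keep "A" with "Tale" and "of" with "Two", while the other spaces stay breakable.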

Example (headings in English):

h1[lang="en"], h2[lang="en"] {
  --text-wrap-preferences: minor-words;
  --text-wrap-minor-threshold: 1;
  --text-wrap-minor-stoplist: "of to in on at for by a I";
}

Performance & constraints

  • One TreeWalker pass over text nodes under elements that compute to text-wrap: pretty.
  • No layout measurements; O(n) string replacements; NBSP insertions are idempotent.
  • Skips pre, code, kbd, samp, script, style, textarea, math, svg, [contenteditable] and basic URL/email-like text.
  • Does not cross inline elements by default.
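
The skip list can be sketched without a DOM by walking a simplified tree of plain objects. This is only an analogy for the single TreeWalker pass described above: the node shape ({ tag, children, contentEditable }) and the function collectProcessableText are hypothetical, and the URL/email heuristic is left out.

```javascript
// Hypothetical sketch over a simplified tree; the library itself uses a TreeWalker.
const SKIP_TAGS = new Set([
  'pre', 'code', 'kbd', 'samp', 'script', 'style', 'textarea', 'math', 'svg',
]);

// Nodes are either strings (text) or { tag, children?, contentEditable? }.
function collectProcessableText(node, out = []) {
  if (typeof node === 'string') {
    out.push(node); // each text node is handled on its own
    return out;
  }
  // Skip the whole subtree of excluded or editable elements.
  if (SKIP_TAGS.has(node.tag) || node.contentEditable) return out;
  for (const child of node.children ?? []) {
    collectProcessableText(child, out);
  }
  return out;
}
```

Each collected string is processed separately, which is one way to picture why glue is not applied across inline element boundaries.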

Browser support

  • The polyfill acts only where the computed style is text-wrap: pretty. On browsers without support, it effectively no-ops unless the author explicitly applies an opt-in selector targeting the same blocks.
  • The library itself targets modern evergreen browsers (ES2020, Intl.Segmenter optional).

Contributing language data

  • Language tables live in src/data/languages/<lang>.json.
  • To propose additions:
    1. Add or edit the JSON with minorWords (threshold + list) and lexical categories (labels, honorifics, abbrCompounds).
    2. Add unit tests in tests/engine.spec.ts (or a new spec file) with input → expected output.
    3. Run npm run test.

Tests & benchmarks

  • Run tests: npm run test
  • Run a simple throughput benchmark: npm run bench

Limitations

  • No apostrophe/elision handling (out of scope for now).
  • Does not measure layout; it applies static glues consistent with editorial conventions.
  • Does not cross inline element boundaries; an advanced option for this may be introduced in the future.

Standards context

  • This repo accompanies the explainer (explainer.md) that proposes text-wrap-preferences: minor-words as an additive, language-sensitive author preference for paragraph-aware wrapping. For a consolidated list of safe labels/honorifics used by the polyfill, see docs/LEXICON.md.

License

MIT
