Hi, I'm not sure I understand the proposal here.
xterm currently has three Unicode providers: the default (Unicode 6), 11, and 15-g (15-grapheme). 15-g handles things like 👨‍🌾 (a ZWJ sequence) that can break rendering with the default provider, but it's still parked as experimental.

What I'm offering is a drop-in provider built from Unicode 17 tables (UAX #29 grapheme segmentation plus width tables), with table generation owned by the provider rather than pulled from a third-party dump. There is no clean width standard here, so I aimed for wcwidth-lineage backwards compatibility. The ucs-detect-style runner makes it easy to compare 6/11/15-g/17 in one place, and also against other terminals. Grapheme logic does introduce some interesting corner cases for wcwidth, and there are real diffs vs 15-g, which I'd be happy to review if useful.
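To illustrate what a table-driven width provider looks like, here is a minimal sketch. The `WidthProvider` interface, `makeProvider`, and `SAMPLE_RANGES` are hypothetical names made up for this example, loosely modeled on the idea of a versioned provider; they are not the actual xterm.js or uc-width API, and the table is a tiny hand-picked excerpt, not generated Unicode 17 data.

```typescript
// Hypothetical shapes for illustration only; not the real xterm.js interfaces.
type Width = 0 | 1 | 2;

interface WidthProvider {
  readonly version: string;
  wcwidth(codePoint: number): Width;
}

// Each entry is [first, last, width]. Entries are sorted and non-overlapping.
type WidthRange = readonly [number, number, Width];

// Hand-picked excerpt. A real provider would generate the full table
// from the Unicode Character Database instead of hard-coding it.
const SAMPLE_RANGES: readonly WidthRange[] = [
  [0x0300, 0x036f, 0], // combining diacritical marks
  [0x1100, 0x115f, 2], // Hangul Jamo leading consonants
  [0x200b, 0x200f, 0], // zero width space, ZWNJ, ZWJ, LRM, RLM
  [0x4e00, 0x9fff, 2], // CJK unified ideographs
  [0x1f600, 0x1f64f, 2], // Emoticons block
];

// Binary search over the sorted ranges; anything not listed is treated as narrow.
function lookupWidth(ranges: readonly WidthRange[], codePoint: number): Width {
  let lo = 0;
  let hi = ranges.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >>> 1;
    const [first, last, width] = ranges[mid];
    if (codePoint < first) hi = mid - 1;
    else if (codePoint > last) lo = mid + 1;
    else return width;
  }
  return 1;
}

function makeProvider(version: string, ranges: readonly WidthRange[]): WidthProvider {
  return { version, wcwidth: (codePoint) => lookupWidth(ranges, codePoint) };
}

const provider = makeProvider('17-sample', SAMPLE_RANGES);
```

Because the provider owns its table, swapping in a newer Unicode version under this sketch means regenerating `SAMPLE_RANGES` and nothing else. Grapheme-level rules (such as collapsing a ZWJ sequence into one cluster) would sit on top of this per-code-point lookup and are not shown.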
Hey, hey. I didn't disappear. I'm still working on this, and I wanted to give an update. Here's a small demo that measures an app's Unicode compatibility tables: https://github.com/koal44/xterm.js/tree/share/width-explorer

There are three widths measured here: COL, MOV, and DEL. COL is the traditional column width that xterm.js uses today to stay aligned with the backend. DEL is the number of delete key presses required to eat through a payload, and MOV is the number of cursor movements required to pass over a payload. If we ever want to patch over an app's poor Unicode handling and support atomic grapheme editing at the terminal layer, we need these additional measurements.

The demo can sweep the full Unicode code space of about 1.1 million code points (U+0000 to U+10FFFF). Runtime varies widely, from seconds to hours to days, depending on the shell and which measurements are enabled.

Within a given environment (POSIX or Windows, say), there is strong selective pressure for COL alignment. If column widths did not broadly line up, terminals would constantly break across apps, so that consistency does emerge in practice. MOV and DEL behavior diverges much more, and varies widely even when COL agrees.

My candid opinion is that apps should never have attempted to handle Unicode themselves and should have left it to us. We are now in the awkward position of not being able to query the backend for how it actually behaves, while still being expected to infer its column alignment and related behavior after the fact. I'm not sure yet how we can reliably know which app we are interacting with, or whether we can safely sneak in measurements during an active session, but I'm assuming something along those lines is possible.

I've been working on a proof of concept around this. The basic idea is a new xterm.js add-on that swaps out the current parser's print handler and replaces it with an updated buffer line and cell data model. That work is not finished yet, and I got pulled into building the measurement demo first, but I'm continuing to explore it here:
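Separately, to make the COL/MOV/DEL distinction concrete, here is a minimal sketch of how per-payload measurements from two environments could be compared. The `WidthMeasurement` record, the `editingDivergence` function, and all of the numbers in the sample data are hypothetical; they are invented for illustration and are not output from the width-explorer demo.

```typescript
// Hypothetical record: one row per payload, per measured environment.
interface WidthMeasurement {
  payload: string; // the text that was sent to the app
  col: number; // columns the cursor advanced after printing
  mov: number; // cursor movements needed to pass over the payload
  del: number; // delete presses needed to erase the payload
}

interface Divergence {
  payload: string;
  fields: Array<'mov' | 'del'>;
}

// Report payloads where both environments agree on COL
// but disagree on MOV and/or DEL.
function editingDivergence(
  a: readonly WidthMeasurement[],
  b: readonly WidthMeasurement[],
): Divergence[] {
  const byPayload = new Map(b.map((m) => [m.payload, m] as const));
  const out: Divergence[] = [];
  for (const left of a) {
    const right = byPayload.get(left.payload);
    if (!right || left.col !== right.col) continue; // COL must agree
    const fields: Array<'mov' | 'del'> = [];
    if (left.mov !== right.mov) fields.push('mov');
    if (left.del !== right.del) fields.push('del');
    if (fields.length > 0) out.push({ payload: left.payload, fields });
  }
  return out;
}

// Invented sample values, NOT real measurements.
const farmer = '\u{1F468}\u200D\u{1F33E}'; // man + ZWJ + sheaf of rice
const shellA: WidthMeasurement[] = [
  { payload: 'e\u0301', col: 1, mov: 1, del: 1 },
  { payload: farmer, col: 2, mov: 1, del: 1 },
  { payload: '\u4e2d', col: 2, mov: 1, del: 1 },
];
const shellB: WidthMeasurement[] = [
  { payload: 'e\u0301', col: 1, mov: 2, del: 2 },
  { payload: farmer, col: 2, mov: 3, del: 1 },
  { payload: '\u4e2d', col: 2, mov: 1, del: 1 },
];

const divergent = editingDivergence(shellA, shellB);
```

A report like this separates the cases where alignment already works (COL agrees) from the cases where editing would still feel wrong, which is the gap atomic grapheme editing at the terminal layer would need to close.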
Prototype Unicode addon wired into xterm:
https://github.com/koal44/xterm.js/tree/ucwidth-addon
It depends (temporarily) on:
https://github.com/koal44/uc-width
uc-width includes a ucs-detect-style comparison that runs xterm's existing Unicode providers (6/11/15-grapheme) plus this one, in the same environment.
If you want to take this further, I'll restructure the prototype to match whatever upstream approach you prefer (vendored tables, generated tables, etc.). If anything needs doing to make this fit better with what you want, just ask.
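The side-by-side comparison described above can be sketched roughly as follows. This is only an illustration of the idea of running several width functions over the same code points and reporting disagreements; `diffProviders` and the two toy providers are hypothetical stand-ins, not the uc-width runner or xterm's real providers.

```typescript
type WidthFn = (codePoint: number) => number;

interface WidthDiff {
  codePoint: number;
  widths: Record<string, number>; // provider name -> reported width
}

// Sweep [first, last] and keep every code point where providers disagree.
function diffProviders(
  providers: Record<string, WidthFn>,
  first: number,
  last: number,
): WidthDiff[] {
  const names = Object.keys(providers);
  const diffs: WidthDiff[] = [];
  for (let cp = first; cp <= last; cp++) {
    if (cp >= 0xd800 && cp <= 0xdfff) continue; // surrogates are not scalar values
    const widths: Record<string, number> = {};
    const distinct = new Set<number>();
    for (const name of names) {
      const w = providers[name](cp);
      widths[name] = w;
      distinct.add(w);
    }
    if (distinct.size > 1) diffs.push({ codePoint: cp, widths });
  }
  return diffs;
}

// Toy stand-ins: one treats everything as narrow,
// the other treats the Emoticons block as wide.
const toyProviders: Record<string, WidthFn> = {
  legacy: () => 1,
  emojiAware: (cp) => (cp >= 0x1f600 && cp <= 0x1f64f ? 2 : 1),
};

const diffs = diffProviders(toyProviders, 0x1f5fe, 0x1f601);
for (const d of diffs) {
  console.log(`U+${d.codePoint.toString(16).toUpperCase()}`, d.widths);
}
```

Running every provider in one process, over the same range, keeps environment differences out of the comparison, so any row in the output reflects a table difference rather than a terminal or shell quirk.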