Skip to content

emscripten: add Docker-based one-stop build and polish demo page#6385

Open
ChrisJefferson wants to merge 2 commits into
gap-system:masterfrom
ChrisJefferson:emscripten-polish
Open

emscripten: add Docker-based one-stop build and polish demo page#6385
ChrisJefferson wants to merge 2 commits into
gap-system:masterfrom
ChrisJefferson:emscripten-polish

Conversation

@ChrisJefferson
Copy link
Copy Markdown
Contributor

Reworks the emscripten build to be reproducible inside a pinned emsdk:3.1.23 container, replacing the previous mix of shell, Ruby, and Node helpers. New entry point: etc/emscripten/build-in-docker.sh builds the image, runs the wasm build inside it, and assembles a self-contained web-example/ directory copyable to any static host.

This doesn't change functionality, but it (hopefully) makes it clearer what is going on, makes the created page nicer, and cleans up the docker script. You can now just run etc/emscripten/build-in-docker.sh and get a full finished build ready to deploy.

@wangyenshu
Copy link
Copy Markdown
Contributor

That is great! I have a small change request. Could you let the startup_manifest.json include the manual files, as requested in #6269 (comment)? You could do it by typing the following command in gap wasm website to record it in startup_manifest.json:

SizeScreen([100000, 100000]);
??a

or use external tools to add paths of all .six files.

@fingolfin
Copy link
Copy Markdown
Member

ping @ChrisJefferson

Comment thread etc/emscripten/web-template/index.html Outdated
Comment thread etc/emscripten/web-template/index.html Outdated
Comment thread etc/emscripten/Dockerfile Outdated
@fingolfin
Copy link
Copy Markdown
Member

@ChrisJefferson was AI used to assist in creating this PR? If so it would be good to acknowledge it (and perhaps the AI can be instructed to head our AGENTS.md file in the future)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... the commit message states the following, as if it was a benefit:

startup_manifest.json captured from a real GAP run and committed.

Removes: build_startup_manifest.js

But to me this seems like a step backwards: startup_manifest.json is a generated file, so shouldn't be in the repo. Worse, though, the code for updating it was removed. Surely the list of files in there will evolve over time?

Copy link
Copy Markdown
Contributor Author

@ChrisJefferson ChrisJefferson May 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code to regenerate it was very fragile, and I had trouble making it work. The new code works fine, but requires you cut+paste it into a browser javascript window.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the new startup_manifest.json generation script is integrated into gap-fs.js and the web-template files. I tested this by serving my prebuilt WebAssembly GAP with the new web-templates via serve.py and following the new README. This is the startup_manifest.json I get: startup_manifest.json. It has no .log file and includes the .six manuals.

I just realized that my version of WebAssembly GAP is trying to request these non-existent manual files:
pkg/patternclass/doc/manual.six
pkg/edim/doc/manual.six
pkg/smallantimagmas/doc/manual.six. If there are no big changes to build.sh, this could still be an issue. I haven't figured out why this happened.

The coi-serviceworker.js README notes that it works on both localhost and static hosting via HTTPS. I verified this locally. So the blocks in serve.py that adds the COOP and COEP header are harmless but not needed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've now improved this (let me know what you think)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is great. Thanks for the update.

Comment thread etc/emscripten/startup_manifest.json Outdated
@wangyenshu
Copy link
Copy Markdown
Contributor

I am thinking about enhancing gap-fs.js to support loading custom scripts via a specified path. Suppose the main files are hosted on a CDN such as GitHub or jsDelivr; this would allow users to embed a gap shell directly on their websites to run a custom script. This could be great for running an interactive demo for gap code or packages.

@ChrisJefferson
Copy link
Copy Markdown
Contributor Author

Cleaned up the code a bit, but I thought I'd tell a little story about versions (this discussion was writen by AI, but proof-read by me, by going through the long process of trying to get things updated).

Why this stays on emscripten 3.1.23

The old Dockerfile pinned emsdk:3.1.23 with a comment amounting to "newer emscripten breaks GAP". The symptom was real; the cause was not what the comment implied. Two separate things are going on.

1. A real GASMAN bug (fixed here)

Built against a modern emsdk (I tried 5.0.7), GAP links and starts, then aborts during library loading — RuntimeError: index out of bounds, or with assertions, corrupted its heap memory area (address zero). A write through a stale pointer.

GASMAN scans the stack conservatively, and on wasm reaches register-resident pointers via emscripten_scan_stack/emscripten_scan_registers. Those calls were guarded by #ifdef EMSCRIPTEN. Old emscripten defined the bare EMSCRIPTEN; modern emscripten only defines __EMSCRIPTEN__. So the register scan compiled on 3.1.23 and silently vanished on anything newer — GASMAN missed roots, freed live bags, corrupted the heap. The 3.1.23 pin was masking a miscompiled #ifdef.

Fix is one line in src/gasman.c: EMSCRIPTEN__EMSCRIPTEN__. With it, GAP on 5.0.7 reaches a prompt, survives GC-heavy loads, and passes the test suite. This is a latent correctness bug worth fixing on any toolchain.

2. The actual blocker: the terminal

With GAP itself happy on 5.0.7, the REPL double-echoes every line: GAP's line editor echoes each char (thinks it's raw) while xterm-pty's line discipline also echoes on newline (still cooked). I instrumented syStartraw (src/sysfiles.c) to read termios back after tcsetattr:

after tcsetattr c_lflag=0x8a3b (ECHO=1 ICANON=1) -- want both 0

tcsetattr succeeds but changes nothing: the ioctl never reaches xterm-pty. xterm-pty hooks in via emscriptenHack(), which patches emscripten's internal TTY device — explicitly "dependent on the internal implementation of the Emscripten runtime". Those internals moved, so TCSETS falls through to emscripten's default TTY (which ignores termios).

Bumping xterm-pty doesn't help — the versions line up wrong:

xterm-pty our worker API (emscriptenHack/TtyClient/TtyServer) xterm
0.9.4–0.9.6 yes 4.x
0.10.x removed 4.x
0.11+/0.12 removed 5.x

The worker integration was dropped at 0.10.0, and even 0.12 only claims 3.1.47. There is no xterm-pty that keeps our integration and targets a modern emscripten; moving forward means rewriting the terminal front end (new API, xterm 4→5) onto an untested combination, for little gain on a pinned build-time toolchain producing a static site.

Decision

Stay on emsdk 3.1.23 + xterm-pty 0.9.4 — the only known-good pairing — but now with an accurate Dockerfile explanation, and keep the GASMAN fix (correct on any toolchain).

Also here (toolchain-independent build fixes)

  • zlib --static (newer wasm-ld rejects the shared .so test targets; GAP only needs libz.a).
  • build.sh removes generated src/c_*.c/ffdata.* before the native build so reruns regenerate instead of failing.
  • assemble-website.sh: copy resiliently (dangling .dSYM symlinks in some pkgs aborted the copy → startup 404s), drop pkg/log/, ship gap.worker.js only if produced, assert lib/init.g lands.

AI disclosure

Claude Code (Opus 4.8) diagnosed both issues and made the edits; credited as co-author per AGENTS.md.

ChrisJefferson and others added 2 commits May 31, 2026 16:23
Rework the emscripten build into a reproducible flow inside a pinned
container: etc/emscripten/build-in-docker.sh builds the image, runs the
wasm build, and assembles a self-contained web-example/ that can be copied
to any static host. Replaces the previous mix of shell, Ruby and Node
helpers.

The toolchain is pinned to emsdk 3.1.23 because both halves of the REPL are
sensitive to the exact version: GASMAN's conservative GC, and xterm-pty's
emscriptenHack() patching emscripten's TTY device so tcsetattr() reaches
the line discipline. 3.1.23 is the version that works with the pinned
xterm-pty 0.9.4; a newer emsdk leaves tcsetattr unhonoured and double-
echoes typed input. See the Dockerfile for the full rationale.

Build robustness fixes, independent of the toolchain version:
 - configure zlib with --static (newer wasm-ld rejects its shared .so test
   targets; GAP only needs libz.a);
 - remove the generated src/c_*.c and ffdata.* before the native build so
   reruns regenerate them instead of failing;
 - assemble-website.sh copies the data tree resiliently (some packages
   ship dangling .dSYM symlinks that aborted the copy and left lib/, grp/
   uncopied, causing startup 404s), drops pkg/log/ test logs, ships
   gap.worker.js only if the build produced one, and asserts lib/init.g
   landed in the output.

Also fix the outdated documentation links in the demo page and refresh
startup_manifest.json.

AI assistance: Claude Code (Opus 4.8) was used to make these changes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
GASMAN's emscripten_scan_stack/emscripten_scan_registers calls -- which
spill pointers held in wasm registers so the conservative collector can
see them -- were guarded by `#ifdef EMSCRIPTEN`. Older emscripten defined
the bare `EMSCRIPTEN` macro, so the block compiled and worked; modern
emscripten removed it in favour of `__EMSCRIPTEN__`, which silently
disabled the block. Without the register scan, the fallback stack scan
misses roots that live only in registers, so live bags are freed and the
heap is corrupted (observed as "index out of bounds" / "corrupted its heap
memory area" aborts on a recent emsdk). Update the guard to the modern
macro.

AI assistance: Claude Code (Opus 4.8) diagnosed and made this change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants