Live bootstrap to guix #578

Draft
vxtls wants to merge 258 commits into fosslinux:master from vxtls:live-bootstrap-to-guix

@vxtls vxtls commented Mar 15, 2026

No description provided.

vxtls added 30 commits February 21, 2026 11:07
When booting with --stage0-image, mirror ports can change between runs
(e.g. file:// -> transient SimpleMirror port), but the reused image kept
stale MIRRORS/MIRRORS_LEN values in /steps/bootstrap.cfg.

Update stage0-work image preparation to patch bootstrap.cfg on each run:
- rewrite MIRRORS and MIRRORS_LEN from current CLI mirrors
- keep existing --build-guix-also handoff checks/sync behavior

This fixes guest downloads trying old 10.0.2.2:<stale-port> endpoints
during steps-guix builds.
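
A minimal sketch of that per-run rewrite, in shell. The function name `patch_mirrors` and the assumed layout of `bootstrap.cfg` (one `KEY=value` assignment per line, with `MIRRORS` as a quoted, space-separated list) are illustrative assumptions; the actual change lives in the stage0-work image preparation code and may differ.

```shell
#!/bin/sh
# Hypothetical helper: rewrite MIRRORS and MIRRORS_LEN in a bootstrap.cfg-style
# file so they match the mirrors given for the current run.
# Assumption: the file holds one KEY=value assignment per line.
patch_mirrors() {
    cfg=$1; shift
    tmp=$(mktemp) || return 1
    # Drop the stale entries, keep every other setting untouched.
    sed -e '/^MIRRORS=/d' -e '/^MIRRORS_LEN=/d' "$cfg" > "$tmp" || return 1
    # "$*" joins the remaining arguments (the mirror URLs) with spaces.
    printf 'MIRRORS="%s"\n' "$*" >> "$tmp"
    printf 'MIRRORS_LEN=%s\n' "$#" >> "$tmp"
    # Write back in place so the original file's permissions are kept.
    cat "$tmp" > "$cfg" && rm -f "$tmp"
}
```

It would be called with the config path followed by the current mirror URLs, e.g. `patch_mirrors /path/to/bootstrap.cfg http://10.0.2.2:<port>`.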
…stsuite

argp-standalone pass1 builds in a separate build directory. Its testsuite
compiles sources that include <argp.h>, but without an explicit include path
the header in the source root is not found and the build fails.

Set:
- CPPFLAGS=-I/Users/luoyanpan/CLionProjects/guix/live-bootstrap/..

in src_configure so testsuite objects can resolve argp.h during the normal
 phase.
…al LIBS and setting host/build + kernel-toolchain env
…hs, and disable unused-but-set-variable as error
feat(steps-guix): add libgcrypt-1.12.1 default build with gcc-detected host and pkg-config path
feat(steps-guix): add guile-gcrypt-0.5.0 with dynamic libgcrypt prefix and ld library path
@vxtls
Author

vxtls commented Apr 4, 2026

Managed to reproduce the glibc-headers-mesboot-2.16.0 issue - bootar's xz implementation is failing to decompress glibc's tarball: ERROR: In procedure bytevector-copy!: Value out of range: 975153

Is it likely still a bootstrap Guile issue? Could it be caused by using an i686-linux bootstrap Guix for an x86_64-linux Guix target?

@Googulator
Collaborator

i686-linux Guile is correct for bootstrapping x86_64 Guix - the bootstrap sequence includes an explicit cross-compiler step to move from i686 to x86_64. IMO it's more likely to be either a musl vs. glibc issue, or something caused by our too-new GCC (the original bootstrap-guile was built using GCC 4.7 or 4.8 - it's not entirely clear; both versions are referenced in the binary).

One option would be to build native coreutils & compression tools earlier (maybe also a real Bash), as that not only takes responsibilities off of bootstrap-guile, avoiding future issues like this, but it would also greatly speed up the build, as most of the time is currently spent starting up instances of bootstrap-guile for each invocation of "test" in shell scripts. (Coreutils "test" is 100x faster on my bootstrap rig than its gash-utils reimplementation, and the real Bash has a built-in "test" command which is 100x faster than coreutils, making "test" operations effectively instantaneous.)
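
As a rough illustration of why a built-in "test" is so much cheaper than an external one, the sketch below runs the same loop both ways. The function names are made up for this example, and it assumes an external `test` binary is reachable through `env`; it says nothing about gash-utils specifically, only about the cost of starting a process per invocation.

```shell
#!/bin/sh
# Run "test" N times via the shell built-in; prints the iteration count.
loop_builtin() {
    i=0
    while [ "$i" -lt "$1" ]; do
        test -n "x"          # handled inside the shell, no new process
        i=$((i + 1))
    done
    echo "$i"
}

# Same loop, but go through env so an external "test" binary from PATH is
# executed, which means every iteration pays for a process start.
loop_external() {
    i=0
    while [ "$i" -lt "$1" ]; do
        env test -n "x"
        i=$((i + 1))
    done
    echo "$i"
}
```

In Bash, `time loop_builtin 1000` and `time loop_external 1000` give a feel for the gap on a given machine; the exact ratio will differ from the figures quoted above.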

@vxtls
Author

vxtls commented Apr 4, 2026

You're absolutely right. However, when using the official Guix image, running the same command doesn't seem to cause any issues. Is this really a glibc vs. musl problem? If so, I think we need evidence that the bug is caused by differences in libc. Alternatively, we could just try it: build a glibc for Guile and its dependencies, and if that solves the problem, it proves the libc difference is the cause.

But I should also mention that I encountered the same issue the last time I compiled Guile against glibc by hand, using a 64-bit toolchain, glibc 2.34, and Guile 2.2. I don't know whether switching to an older glibc and a 32-bit toolchain would solve the problem.

The solution at the time was to change the default compression format from xz to gz, which successfully circumvented the issue. However, I don't believe this addressed the root cause; it was just a workaround.

The complete log from that time:
guix-pull-2.log.txt.zip

It's basically the same issue, so I don't think the problem is caused by a libc difference.

So the current evidence does not support the idea that the behavioral differences are caused by libc variations. It might be due to GCC, since I was also using a very recent GCC at the time (GCC 13). I'd still like to know exactly how to solve this problem.

Also, based on the Guix code, it actually uses different Guile binaries for x86_64-linux and i686-linux:

https://alpha.gnu.org/gnu/guix/bootstrap/x86_64-linux/20131110/guile-2.0.9.tar.xz
https://alpha.gnu.org/gnu/guix/bootstrap/i686-linux/20131110/guile-2.0.9.tar.xz

The hash values are different. When I run file on them, one is an x86_64 binary and the other is an 80386 binary. However, I don't think this will cause bootar to fail to extract.

It is worth noting that this package is not downloaded directly from a remote server; instead, after downloading and extracting it, it is repackaged using an existing xz binary (a "patch + repack" process). So is the xz version a key consideration? We are currently using 5.2.4, whereas Guix uses 5.0.4. This seems like a plausible explanation: changes in xz between those versions may produce output that bootar cannot decompress.

roy@guix-test /gnu/store$ ./qc9b01x31ayxb36r0zw5cw28awisdq98-xz --version
xz (XZ Utils) 5.0.4
liblzma 5.0.4
roy@guix-test /gnu/store$

EDIT: I'm trying out xz 5.0.6 to see if it solves the problem.
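
For comparing the xz used for repacking against the one Guix ships, the version can be pulled out of `xz --version` output like the session above. This is only a sketch: `xz_version_from` and `compare_xz` are hypothetical helper names, and the first output line is assumed to have the `xz (XZ Utils) X.Y.Z` form shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the XZ Utils version found in `xz --version`
# output supplied on stdin (first line looks like "xz (XZ Utils) 5.0.4").
xz_version_from() {
    sed -n 's/^xz (XZ Utils) \([0-9][0-9.]*\).*$/\1/p' | head -n 1
}

# Report whether two xz binaries (paths given as arguments) differ in version.
compare_xz() {
    a=$("$1" --version | xz_version_from)
    b=$("$2" --version | xz_version_from)
    if [ "$a" = "$b" ]; then
        echo "same xz version: $a"
    else
        echo "xz versions differ: $a vs $b"
    fi
}
```

`compare_xz` takes the paths of the two binaries, for example the store path above and whichever xz the repack step picks up.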

@Googulator
Collaborator

Googulator commented Apr 6, 2026

32- vs 64-bit bootstrap-guile could be relevant as well, indeed.

Meanwhile, I've managed to get all the way to gcc-cross-boot0 (so far, build still ongoing) with this patch applied to Guix:
build-native-utils-earlier.patch

As a nice side effect, it also makes builds much faster once the new native binaries are built and used.

EDIT: I had to unmount /tmp earlier, as it filled up the RAM on my bootstrap machine during one of the gcc builds.

@Googulator
Collaborator

When building the final gawk in the bootstrap, I encountered this: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=51286

Seems to be a known, sporadic issue. As expected, fixed on a 2nd try.

Suggestion: patch https://codeberg.org/guix/guix/src/tag/v1.5.0/gnu/packages/commencement.scm#L3573 to disable tests.

@Googulator
Collaborator

Got through the bootstrap proper (had to restart once because one of guile-3.0.9's checks locked up - AFAIK this is a known sporadic issue, I seem to remember seeing it before when trying to bootstrap Guix on various platforms), but then immediately after "Computing Guix derivation", as it tries to switch from the "installed" Guix to the one in local-channels, it dies with the familiar system() "No such file or directory" error.

The fix is simple: apply remove-environment-variables-system-call.patch also in guix-daemon-and-pull.sh. With that fixed, the build is now proceeding again.

@Googulator
Collaborator

One more thing: you may want to pass -c ${JOBS} to the guix pull & guix system image commands, so the builds use all cores available. (${JOBS} comes from /steps/bootstrap.cfg.)
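
A sketch of how the core count might be threaded through, assuming `/steps/bootstrap.cfg` can be sourced as shell and sets `JOBS`, as the comment above implies. The helper name `cores_flag` and the fallback of 1 are illustrative assumptions; `-c` is the short form of Guix's `--cores` build option.

```shell
#!/bin/sh
# Hypothetical helper: print "-c N", where N is the JOBS value read from the
# given config file. Falls back to 1 if JOBS is not set (an assumption made
# for this sketch).
cores_flag() {
    cfg=$1
    # Load JOBS from the config file if it is readable.
    [ -r "$cfg" ] && . "$cfg"
    printf '%s %s\n' -c "${JOBS:-1}"
}
```

The output would then be spliced into the existing commands, along the lines of `guix pull $(cores_flag /steps/bootstrap.cfg)`, with the other arguments left as they are in the scripts.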

@vxtls
Author

vxtls commented Apr 6, 2026

> Got through the bootstrap proper (had to restart once because one of guile-3.0.9's checks locked up - AFAIK this is a known sporadic issue, I seem to remember seeing it before when trying to bootstrap Guix on various platforms), but then immediately after "Computing Guix derivation", as it tries to switch from the "installed" Guix to the one in local-channels, it dies with the familiar system() "No such file or directory" error.
>
> The fix is simple: apply remove-environment-variables-system-call.patch also in guix-daemon-and-pull.sh. With that fixed, the build is now proceeding again.

So, has this issue been resolved or not?

@Googulator
Collaborator

Resolved, yes. Now, there's a new issue: one of the tests for util-linux, related to user namespace handling, is failing. The cause: CONFIG_USER_NS disabled in the kernel configuration. I'm gonna set it to Y, and try again.

@Googulator
Collaborator

With CONFIG_USER_NS=y in kconfig, util-linux's tests now pass, and Guix moves on to building Valgrind.
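
For reference, flipping that option in a kernel configuration file could be sketched as below. `kconfig_enable` is a hypothetical helper, and the file is assumed to use the usual kconfig conventions (`OPT=y`, or `# OPT is not set` for disabled options); how live-bootstrap actually stores and edits its kernel config may differ.

```shell
#!/bin/sh
# Hypothetical helper: enable a boolean option in a Linux kconfig-style file.
# Handles the option being absent, disabled as "# OPT is not set", or
# already assigned some value.
kconfig_enable() {
    file=$1 opt=$2
    tmp=$(mktemp) || return 1
    # Remove any existing line for this option, then append OPT=y.
    sed -e "/^# $opt is not set\$/d" -e "/^$opt=/d" "$file" > "$tmp" || return 1
    echo "$opt=y" >> "$tmp"
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

Usage would be `kconfig_enable <config-file> CONFIG_USER_NS`; running it twice leaves a single `CONFIG_USER_NS=y` line.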

@vxtls
Author

vxtls commented Apr 6, 2026

> With CONFIG_USER_NS=y in kconfig, util-linux's tests now pass, and Guix moves on to building Valgrind.

Wait, what stage are you at right now? Are you on guix pull or guix system image?

@Googulator
Collaborator

guix pull still, but past the initial bootstrap AKA commencement.scm

@Googulator
Collaborator

Googulator commented Apr 8, 2026

Ran into and debugged another issue: when building guix-manual, I get the following error:

In unknown file:
           0 (copy-file "/gnu/store/1n4lagn25hylvrn9x9v2qjf0r0dj9sby-doc/os-config-desktop.texi" "./os-config-desktop.texi")

ERROR: In procedure copy-file:
In procedure copy-file: Permission denied

The cause: when this call is executed, ./os-config-desktop.texi already exists in the build directory, with read-only permissions.
That copy is generated from /gnu/store/fcw6rj38k5g3cwhdqqr4yfwlhfjzr81q-examples/desktop.tmpl, which in turn comes from subdirectory gnu/system/examples/ of the local repository created in preparation for calling guix pull, using the contents of the Guix 1.5.0 tarball. The upstream Guix repository also contains this file. So this copy is fine.

However, /gnu/store/1n4lagn25hylvrn9x9v2qjf0r0dj9sby-doc, on a normal Guix installation, contains no os-config-desktop.texi. Normally, the Guix channel being pulled is a fork of Guix's upstream repository in one way or the other, and that repository has an explicit gitignore set for this file. But in our case, we initialize the channel repository with the contents of the tarball, which contains some pregenerated documentation & other files, included for user convenience. And that happens to include a copy of os-config-desktop.texi in doc, which ends up conflicting with the one generated from desktop.tmpl.

Fix, option 1: in guix-daemon-and-pull.sh, insert rm doc/os-config*; rm doc/*.??*.*; rm doc/version*.texi; rm doc/stamp*; rm doc/*.1; rm doc/*.info; rm doc/guix.info*; rm doc/images/*.eps doc/images/*.pdf doc/images/bootstrap-*.png doc/images/coreutils-bag-graph.png doc/images/coreutils-graph.png doc/images/gcc-core-mesboot0-graph.png doc/images/service-graph.png doc/images/shepherd-graph.png before git init. (Ideally, do a more thorough cleanup of pregenerated files.)

Fix, option 2: Use a clone or snapshot of the Guix Git repository, rather than a release tarball intended for human use, to prepare the local channel. (Preferably I would also switch the actual build of the Guix package to be based on a Git repository, although live-bootstrap has a preference for release tarballs.)

For now, I have locally implemented option 1, and guix pull seems to be proceeding fine.
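
Option 1 could be wrapped up roughly as follows. This is a sketch, not the change as applied locally: the function name `clean_pregenerated_docs` is made up, the directory argument is assumed to be the unpacked Guix 1.5.0 tree, and the glob list is copied from the command above.

```shell
#!/bin/sh
# Hypothetical helper: delete pregenerated documentation from an unpacked
# Guix release tarball before it is turned into a Git repository.
# -f keeps rm quiet when a pattern matches nothing.
clean_pregenerated_docs() {
    (
        # Work in a subshell so the caller's working directory is unchanged.
        cd "$1" || exit 1
        rm -f doc/os-config* doc/*.??*.* doc/version*.texi doc/stamp* \
              doc/*.1 doc/*.info doc/guix.info* \
              doc/images/*.eps doc/images/*.pdf \
              doc/images/bootstrap-*.png \
              doc/images/coreutils-bag-graph.png \
              doc/images/coreutils-graph.png \
              doc/images/gcc-core-mesboot0-graph.png \
              doc/images/service-graph.png \
              doc/images/shepherd-graph.png
    )
}
```

It would be called on the source tree right before `git init`. Sources such as `doc/guix.texi` are left alone, since none of the patterns match a file with a single extension other than `.1` or `.info`.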

@Googulator
Collaborator

Googulator commented Apr 8, 2026

Successful guix pull - moving on to ISO build.

EDIT: "disable-authentication: unrecognized option" - See Googulator/guix@ca0114e for a workaround; unfortunately this causes it to rerun the entire bootstrap :(

@Googulator
Collaborator

Before the ISO build, one also needs to cp /var/guix/profiles/per-user/root/current-guix/manifest /usr/manifest - otherwise it won't be able to find the local channel.

@vxtls
Author

vxtls commented Apr 8, 2026

> cp /var/guix/profiles/per-user/root/current-guix/manifest /usr/manifest

Just to make sure: what I need to do is add this before the ISO build in guix-build-iso.sh, copying it to live-bootstrap's /usr dir?
