Commit 762919c

add riscv drop post
1 parent 06f9741 commit 762919c

1 file changed

Lines changed: 128 additions & 0 deletions

@@ -0,0 +1,128 @@
1+
---
2+
title: Dropping RISC-V support
3+
layout: post
4+
excerpt_separator: <!--more-->
5+
---
6+
7+
The next set of images will drop RISC-V support. The builder is
8+
currently still going but within the next few days it will stop,
9+
and the repositories will stay in place but frozen.
10+
11+
Nothing will change in packaging (the build profile will remain,
12+
template support where present will remain, cross-toolchains will
13+
remain) but there will be no more updates to the repo for the
14+
foreseeable future.
15+
16+
<!--more-->
17+
18+
## The situation
19+
20+
The initial plumbing for RISC-V was added in the distro in July 2021
21+
and repos later in the year, i.e. it has been there almost from the
22+
start. During all this time, the builds have been done on an x86_64
23+
machine with `qemu-user` binfmt emulation, coupled with
24+
transparent `cbuild` support for this.
25+
26+
The reason for doing it this way was that there wasn't any hardware
27+
fast enough for us to use; I had obtained a SiFive HiFive
28+
Unmatched board in October 2021 and this proved to be useless for
29+
builds as the performance of this board is similar to Raspberry Pi 3.
30+
Other boards came later, but none of them improved on that front
31+
significantly enough.
32+
33+
This was expected to be a temporary state that would resolve itself
34+
within 2-3 years' time; it is Q1 2025, and the options are the following:
35+
36+
* HiFive P550 that was released recently has performance similar to
37+
Raspberry Pi 4 and is unsuitable for the task; this board was originally
38+
supposed to be released several years ago as part of the SiFive and Intel
39+
collaboration (Horse Creek) but now got released with a Chinese SoC instead
40+
* Milk-V Pioneer is a board with 64 out-of-order cores; it is the only one of
41+
its kind, with the cores being supposedly similar to something like ARM
42+
Cortex-A72. This would be enough in theory, however these boards are hard
43+
to get here (especially with Sophgo having some trouble, new US sanctions,
44+
and Mouser pulling all the Milk-V products) and from the information that
45+
is available to me, it is rather unstable, receives very little support,
46+
and is riddled with various hardware problems.
47+
* Things based on Spacemit K1 (e.g. Milk-V Jupiter) have an 8-core SoC that
48+
is technically an out-of-order design, but in practice the per-core
49+
performance is reportedly even worse than the JH7110, so it is unsuitable.
50+
* Boards based on JH7110 (e.g. VisionFive 2, the new Framework board etc.)
51+
utilize 4 U74 cores (same configuration as my HiFive Unmatched) that are
52+
simple in-order designs and therefore are unsuitable (similar to RPi3).
53+
* My HiFive Unmatched, which is the same situation as above.
54+
* Other available cores are usually much worse than any of the above.
55+
56+
The promising option (Milk-V Oasis with 16 SiFive P670 cores) that was
57+
first announced in 2023 ultimately ended up being canned due to issues
58+
the SoC vendor has, and nobody has ever seen a single production chip,
59+
let alone a board. As far as I can tell, no other options are coming up.
60+
61+
It is unsustainable to stick with the current situation with the emulator.
62+
Doing so has numerous problems:
63+
64+
* We could never actually run tests on the packages being built, because
65+
the emulator is unreliable and will result in false positive failures.
66+
Disabling tests conditionally for RISC-V is not a viable option because
67+
the failures are not RISC-V issues and will always happen with emulation, so
68+
all the RISC-V packages were being built without tests.
69+
* It is very slow, being by far the slowest builder in our fleet. It is
70+
still several times faster than e.g. the JH7110 would build things. The
71+
performance is actually rather variable; things that can parallelize
72+
really well run at a fairly reasonable speed due to being able to spawn
73+
many emulators, while things like configure scripts that are
73+
single-threaded and fork a lot run very slowly. Either way, overall, it is much
75+
slower than any of the other builders, despite RISC-V being, until the
76+
introduction of LoongArch64 builds, the only architecture with no LTO.
77+
* Most importantly, it is unreliable. The `qemu` emulator likes to hang
78+
during various workloads, with the emulator going into sleep state and
79+
remaining there forever. When that happens, the builds have to be
80+
manually canceled and restarted (it is not deterministic). This used
81+
to be worse before some fixes, but even with the latest version of
82+
the emulator it still happens, particularly during Go builds (since
83+
we rebuild every Go program upon toolchain updates for secfixes,
84+
any such rebuild can require many manual cancelations and restarts).
85+
* It burns a ton of power for how slow it is, because it fully loads
86+
a beefy x86 machine, and I'm not happy at all about that.
87+
88+
At this point, to have a relatively sustainable base, we'd need a board
89+
that is at least as powerful as Raspberry Pi 5. This would still be
90+
the slowest builder in the fleet, but it would likely be faster than
91+
the current emulation arrangement while also being more reliable.
92+
93+
However, the industry does not seem to be interested in producing such
94+
machines and for the most part focuses on embedded (low-end) as well as
95+
things entirely irrelevant to a distro (AI/NPU etc.) that do not help
96+
at all; at this point I don't think we can wait any longer, especially
97+
as no remedy has been announced.
98+
99+
We have no such problem with the other architectures; obviously x86 and
100+
ARM are at this point mainstream and this does not surprise anyone, but
101+
even the likes of LoongArch have perfectly acceptable hardware (not the
102+
fastest, but also not a bottleneck) that performs reliably.
103+
104+
## Will RISC-V support be reintroduced?
105+
106+
If acceptable build hardware is released and is reasonably available to
107+
us, the architecture will be reintroduced.
108+
109+
If that happens, the repositories will be rebuilt from scratch, as if
110+
a new architecture, with a process similar to how it was recently done
111+
with LoongArch64. It will be a tier-2 architecture with enforced tests
112+
and without LTO just like LoongArch64.
113+
114+
However, whether or when that will happen is currently a big unknown
115+
due to such hardware not existing and nothing being even announced.
116+
117+
Nothing will change in support for the other architectures. The new tier
118+
list will be:
119+
120+
* Tier 1 for `aarch64`, `ppc64le`, and `x86_64`
121+
* Tier 2 for `loongarch64`
122+
* Tier 3 for `ppc64` and `ppc`
123+
124+
There is also some chance of ARMv7 and ARMv6 32-bit repositories being
125+
introduced in the next few months' timeframe, as we may be moving to
126+
an oversized Ampere Altra machine for all ARM builds (right now AArch64
127+
is served by a Hetzner Cloud VM and can't take any more load). This is
128+
not yet set in stone, however.