Commit 812ff6a

fix(install): set DM_DISABLE_UDEV=1 to prevent dm semaphore deadlock
`bootc install to-disk --block-setup tpm2-luks` hangs at `cryptsetup luksOpen` due to a libdevmapper udev cookie semaphore deadlock. libdevmapper creates SysV semaphores to synchronize with udevd, but the container's isolated IPC namespace prevents udevd (on the host) from seeing the semaphore. luksOpen blocks forever on semop().

Set DM_DISABLE_UDEV=1 before LUKS operations to skip udev synchronization for device-mapper. This is safe during installation -- there are no concurrent dm operations, and the kernel still creates device nodes without udev involvement. Partition device nodes and udev_settle() are unaffected.

Confirmed with three independent tests:
- Stock bootc, default podman: HANG
- Stock bootc + DM_DISABLE_UDEV=1: PASS
- Stock bootc + --ipc=host: PASS

Fixes: #2089
Related: #421, #477
Signed-off-by: Andrew Dunn <andrew@dunn.dev>
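The fix relies on ordinary environment inheritance: a variable set in the installer process is visible to every child it spawns afterwards. A minimal sketch of that behaviour follows. It assumes a POSIX `sh` on `PATH`, and the helper name `child_sees_dm_disable_udev` is hypothetical, not part of bootc.

```rust
use std::process::Command;

/// Set DM_DISABLE_UDEV=1 process-wide, then report what a child process sees.
/// Hypothetical helper for illustration only; it spawns `sh`, not cryptsetup.
#[allow(unused_unsafe)] // `set_var` is only an `unsafe fn` from the 2024 edition on
fn child_sees_dm_disable_udev() -> std::io::Result<String> {
    // SAFETY (assumption): no other thread reads or writes the environment
    // while this runs, mirroring the single-threaded argument in the commit.
    unsafe {
        std::env::set_var("DM_DISABLE_UDEV", "1");
    }

    // The child inherits the parent's environment by default.
    let output = Command::new("sh")
        .args(["-c", "printf %s \"$DM_DISABLE_UDEV\""])
        .output()?;

    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

Calling `child_sees_dm_disable_udev()` returns whatever value the child printed for the variable. In the real installer the children are `cryptsetup` and `systemd-cryptenroll`, where libdevmapper reads the variable.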
1 parent adab93e commit 812ff6a

1 file changed

Lines changed: 32 additions & 0 deletions

crates/lib/src/install/baseline.rs

@@ -149,6 +149,29 @@ pub(crate) fn wipefs(dev: &Utf8Path) -> Result<()> {
         .run_inherited_with_cmd_context()
 }
 
+/// Disable libdevmapper's udev synchronization for device-mapper operations.
+///
+/// libdevmapper uses SysV semaphores ("udev cookies") to coordinate with udevd.
+/// Inside a container with an isolated IPC namespace (the podman/docker default),
+/// udevd on the host cannot see the container's semaphores, causing `cryptsetup
+/// luksOpen` and `luksClose` to deadlock on `semop()`.
+///
+/// Only affects device-mapper operations (LUKS open/close). Partition device
+/// nodes created by the kernel and managed by udev are not affected, so
+/// `udev_settle()` for partition discovery continues to work normally.
+///
+/// This is safe during installation as there are no concurrent device-mapper
+/// consumers.
+#[allow(unsafe_code)]
+fn disable_dm_udev_sync() {
+    // SAFETY: bootc uses a single-threaded tokio runtime (new_current_thread).
+    // No other threads can race on environment reads. The variable is only
+    // inherited by child processes (cryptsetup, systemd-cryptenroll).
+    unsafe {
+        std::env::set_var("DM_DISABLE_UDEV", "1");
+    }
+}
+
 pub(crate) fn udev_settle() -> Result<()> {
     // There's a potential window after rereading the partition table where
     // udevd hasn't yet received updates from the kernel, settle will return
@@ -350,6 +373,15 @@ pub(crate) fn install_create_rootfs(
     let (rootdev_path, root_blockdev_kargs) = match block_setup {
         BlockSetup::Direct => (root_device.path(), None),
         BlockSetup::Tpm2Luks => {
+            // Disable libdevmapper's udev synchronization. libdevmapper uses
+            // SysV semaphores ("udev cookies") to coordinate device-mapper
+            // operations with udevd. Inside a container with an isolated IPC
+            // namespace (the podman/docker default), udevd on the host cannot
+            // see the container's semaphores, causing cryptsetup luksOpen and
+            // luksClose to deadlock on semop(). This is safe during
+            // installation as there are no concurrent device-mapper consumers.
+            disable_dm_udev_sync();
+
             let uuid = uuid::Uuid::new_v4().to_string();
             // This will be replaced via --wipe-slot=all when binding to tpm below
             let dummy_passphrase = uuid::Uuid::new_v4().to_string();
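
The change above sets the variable process-wide, which is why it needs an `unsafe` block and a SAFETY argument. One possible alternative, sketched here with plain `std::process::Command` rather than bootc's own command helpers (whose API is not shown in this diff), is to scope the variable to each child. The function names are hypothetical.

```rust
use std::ffi::OsStr;
use std::process::Command;

/// Build a `cryptsetup` command that carries DM_DISABLE_UDEV=1 in its own
/// environment, leaving the parent process environment untouched.
/// Hypothetical helper; the caller still has to add arguments and run it.
fn cryptsetup_without_dm_udev_sync() -> Command {
    let mut cmd = Command::new("cryptsetup");
    cmd.env("DM_DISABLE_UDEV", "1");
    cmd
}

/// Report whether a command explicitly sets DM_DISABLE_UDEV=1.
fn sets_dm_disable_udev(cmd: &Command) -> bool {
    cmd.get_envs().any(|(key, value)| {
        key == OsStr::new("DM_DISABLE_UDEV") && value == Some(OsStr::new("1"))
    })
}
```

The trade-off is that every device-mapper child (the diff names `cryptsetup` and `systemd-cryptenroll`) would need the variable set individually, whereas the committed approach covers all later children with a single call.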
