CrashLoopBackOff on Kubernetes: DuckDB WAL replay fails on fresh database initialization #36

@marcochiodo

Description
Environment

  • Liwan version: 1.4 (image ghcr.io/explodingcamera/liwan:1.4, SHA sha256:80c696af40b84abb1a008ee307e297a7722c178e654ab88dee5e953e2f93f661)
  • DuckDB version: 1.5.0 (bundled in liwan 1.4)
  • Kubernetes: k3s v1.33.6+k3s1
  • Container runtime: containerd 2.1.5-k3s1.33
  • Host OS: Debian GNU/Linux 13 (trixie)
  • Kernel: 6.12.38+deb13-cloud-amd64
  • CPU: AMD EPYC with AVX2 support
  • Filesystem: ext4
  • Storage: local-path provisioner (k3s default), bind-mount into pod

Problem

Liwan crashes immediately (<1 second) on every startup inside a Kubernetes pod,
even on a completely clean /data directory. The error is:

WARN liwan::app::db: Failed to create DuckDB connection. If you've just upgraded
to Liwan 1.2, please downgrade to version 1.1.1 first, start and stop the server,
and then upgrade to 1.2 again.
Error: Failed to create DuckDB connection: INTERNAL Error: Failure while replaying
WAL file "/data/liwan-events.duckdb.wal": Calling DatabaseManager::GetDefaultDatabase
with no default database set
This error signals an assertion failure within DuckDB.

DuckDB creates the WAL file on fresh init, then immediately fails to replay it as
part of the initial checkpoint. The result is a CrashLoopBackOff with a deterministic
4494-byte WAL file created every time.

What We Tested

| Scenario | Result |
| --- | --- |
| `ctr run` with overlay filesystem (no volume) | ✅ Works |
| `ctr run` with bind-mount to the same PVC path (clean dir) | ✅ Works |
| Kubernetes pod, clean PVC, default security context | ❌ Crashes |
| Kubernetes pod + `seccompProfile: Unconfined` | ❌ Crashes |
| Kubernetes pod + `capabilities: add: ["ALL"]` | ❌ Crashes |
| Kubernetes pod + `runAsUser: 0` | ❌ Crashes |
| Kubernetes pod + memory limit 512Mi | ❌ Crashes |
| Kubernetes pod + `LIWAN_DUCKDB_THREADS=1` | ❌ Crashes |

The crash is deterministic and always produces the same 4494-byte WAL file before failing.
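For reference, a minimal pod spec approximating the failing "clean PVC, default security context" case (the pod and PVC names here are illustrative placeholders, not taken from the actual deployment):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: liwan-repro            # hypothetical name for the repro pod
spec:
  containers:
    - name: liwan
      image: ghcr.io/explodingcamera/liwan:1.4
      volumeMounts:
        - name: data
          mountPath: /data     # liwan writes liwan-events.duckdb(.wal) here
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: liwan-data  # hypothetical PVC backed by the k3s local-path provisioner
```

No securityContext tweaks are needed to trigger the crash; per the table above, loosening seccomp, capabilities, or the user ID made no difference.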

Additional Findings

  • LIWAN_LISTEN cannot be set via environment variable: doing so causes Error: duplicate field 'listen', which suggests the distroless image ships a TOML config with listen already set.
  • LIWAN_BASE_URL must point to a resolvable domain before the pod starts, otherwise liwan
    panics with failed to lookup address information: Name does not resolve (src/web/mod.rs:143).
  • DuckDB 1.5.1 release notes mention a fix for "WAL corruption related to
    MarkBlockAsCheckpointed on fresh database initialization", which matches this
    exact failure. Liwan 1.4 bundles DuckDB 1.5.0.
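To illustrate the environment-variable constraints above, the container env block ends up looking roughly like this (the domain is a placeholder):

```yaml
env:
  - name: LIWAN_BASE_URL
    # Must resolve via DNS before the pod starts, or liwan panics with
    # "failed to lookup address information: Name does not resolve"
    # (src/web/mod.rs:143).
    value: "https://liwan.example.com"
  # LIWAN_LISTEN is intentionally omitted: setting it triggers
  # "Error: duplicate field 'listen'", since the image's bundled TOML
  # config already defines `listen`.
```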

Suspected Root Cause

This appears to be a DuckDB 1.5.0 bug in fresh database initialization that was fixed in
DuckDB 1.5.1. It surfaces specifically in the Kubernetes pod execution context (possibly
related to cgroup constraints or subtle runtime differences versus bare ctr run). Upgrading
the bundled DuckDB to ≥1.5.1 would likely fix the issue.

Workaround

None found. The application is currently not usable on Kubernetes with Liwan 1.4.
