Describe the bug
Fluent Bit emits incomplete (split) log records during container log file rotation managed by containerd.
When containerd splits a log record across two files at rotation time,
Fluent Bit forwards each fragment as a separate log record instead of joining them.
The result is malformed JSON records with missing fields that arrive at the destination silently broken.
The bug appears only under high load, because only then does containerd split a log record across two files.
Over 1 hour of testing at 10k logs/sec from 2 Pods, Fluent Bit produced 34 split records.
To Reproduce
- Clone the benchmark repository:
git clone https://github.com/VictoriaMetrics/log-collectors-benchmark
cd log-collectors-benchmark
- Create a kind Kubernetes cluster (requires kubectl, kind, helm, docker, make):
kind create cluster --name log-collectors-bench
- Install VictoriaLogs as the log storage backend:
helm repo add vm https://victoriametrics.github.io/helm-charts/
helm install vls vm/victoria-logs-single --namespace logging --create-namespace
- Configure Fluent Bit to write to VictoriaLogs:
make set-endpoint VLS_HOST='vls-victoria-logs-single-server.logging.svc.cluster.local' VLS_PORT=9428
- Deploy Fluent Bit:
- Start the load generator:
make bench-up-generator GENERATOR_REPLICAS=1 LOGS_PER_SECOND=10000 RAMP_UP=false
You can increase the number of load generator replicas (GENERATOR_REPLICAS) if your machine is fast enough.
This increases the load and the chance of reproducing the bug.
- Forward the VictoriaLogs port to your local machine:
kubectl port-forward -n logging vls-victoria-logs-single-server-0 9428:9428
- Wait approximately 30 minutes (the bug is intermittent and appears only under sustained load).
- Query VictoriaLogs for malformed records using the expression sequence_id:"". This finds all records missing the sequence_id field, which are the split fragments.
- Clean up:
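For the query step above, the same check can be scripted against the VictoriaLogs HTTP API. The following is a minimal sketch, assuming the port-forward from the earlier step is still active on localhost:9428 and that the /select/logsql/query endpoint returns one JSON object per line. The helper names (build_query_url, parse_json_lines, fetch_split_fragments) are hypothetical and are not part of the benchmark repository.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, logsql, limit=100):
    """Build a URL for the VictoriaLogs LogsQL query endpoint."""
    params = urlencode({"query": logsql, "limit": limit})
    return f"{base_url}/select/logsql/query?{params}"


def parse_json_lines(body):
    """Parse a newline-delimited JSON response into a list of dicts."""
    return [json.loads(line) for line in body.splitlines() if line.strip()]


def fetch_split_fragments(base_url="http://localhost:9428", limit=100):
    """Fetch records without a sequence_id field (assumed to be split fragments)."""
    url = build_query_url(base_url, 'sequence_id:""', limit)
    with urlopen(url) as resp:  # requires the port-forward to be running
        return parse_json_lines(resp.read().decode("utf-8"))
```

Calling fetch_split_fragments() while the port-forward is running should return the malformed records, if any have been ingested so far.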
Expected behavior
Fluent Bit should detect that a log record was split at a file rotation boundary and reconstruct the complete record before forwarding it.
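To illustrate the expected behavior, here is a minimal Python sketch of joining CRI fragments across a rotation boundary. It is not Fluent Bit's implementation. It assumes the collector reads the rotated file to its end and then continues with the new file, and that every line follows the CRI layout of timestamp, stream, P or F tag, then message. The function name join_cri_records is hypothetical.

```python
from itertools import chain


def join_cri_records(lines):
    """Yield complete messages from CRI lines, joining P (partial) fragments.

    Fragments are buffered per stream until a line tagged F (full) arrives.
    """
    pending = {}  # stream name -> list of buffered message fragments
    for line in lines:
        # CRI line layout: "<timestamp> <stream> <P|F> <message>"
        parts = line.rstrip("\n").split(" ", 3)
        if len(parts) < 3:
            continue  # not a CRI line; skipped in this sketch
        stream, tag = parts[1], parts[2]
        message = parts[3] if len(parts) == 4 else ""
        pending.setdefault(stream, []).append(message)
        if tag == "F":  # final fragment: the record is complete
            yield "".join(pending.pop(stream))


# Usage: feed the tail of the rotated file followed by the head of the new file.
old_file = ['2024-01-01T00:00:00.000000001Z stdout P {"sequence_id":2,"me']
new_file = ['2024-01-01T00:00:00.000000002Z stdout F ssage":"hi"}\n']
records = list(join_cri_records(chain(old_file, new_file)))
```

The key point is that the buffer of pending fragments outlives the file boundary, so a P fragment at the end of the old file is held until its continuation is read from the new one.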
Screenshots
Additional context
The root cause appears to be specific to the last log record of a file at rotation time.
The record is split across two files by containerd and is marked with the partial flag (P in CRI format),
even though its size does not exceed the standard 16 KiB threshold at which containerd normally splits long lines.
Fluent Bit forwards each part as a separate record instead of waiting for and joining the continuation from the new file.
We modified our own collector to verify that the issue is rotation-specific. Other collectors do not exhibit this behavior, which confirms that the application writes its logs correctly and is not the source of the truncated or partial log lines.
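The rotation-specific condition described above can be checked offline on a rotated log file. The sketch below is a hypothetical helper, not taken from any collector. It reports whether the last line of a file is tagged P while its message is shorter than the 16 KiB threshold, which would point to a rotation split rather than a long-line split. The CRI field layout (timestamp, stream, tag, message) is assumed.

```python
MAX_CRI_LINE = 16 * 1024  # 16 KiB long-line threshold mentioned above


def ends_with_rotation_split(lines):
    """Return True if the last non-empty line is a short P (partial) fragment."""
    last = None
    for line in lines:
        if line.strip():
            last = line
    if last is None:
        return False  # empty file
    # CRI line layout: "<timestamp> <stream> <P|F> <message>"
    parts = last.rstrip("\n").split(" ", 3)
    if len(parts) < 3 or parts[2] != "P":
        return False
    message = parts[3] if len(parts) == 4 else ""
    # A fragment at or above the threshold looks like an ordinary long-line split.
    return len(message.encode("utf-8")) < MAX_CRI_LINE
```

It can be applied to the lines of an uncompressed rotated file, for example ends_with_rotation_split(open(path, encoding="utf-8")), where path is a placeholder for the rotated log file.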
Your Environment
- kind v0.31.0, single-node cluster
- n2-highcpu-32 (32 vCPU, 32 GiB RAM, local SSD)
- tail input plugin reading from /var/log/pods, JSON parser