msmtpd health checks creating runaway processes? #575

Description

@uktricky

Support guidelines

I've found a bug and checked that ...

  • ... the documentation does not mention anything about my problem
  • ... there are no open or closed issues that are related to my problem

Description

I've been running LibreNMS in Docker on my Raspberry Pi without issue for a number of months (years), but every 5-6 weeks the Pi starts to have issues and needs a reboot (I have a number of containers running on an 8 GB Pi 4). While trying to diagnose the problem, I noticed that the LibreNMS msmtpd healthchecks are OK after a restart, but after a few days (3 or 4) they suddenly start failing and I end up with loads of leftover processes.

The healthcheck command succeeds when run manually:

docker exec librenms_msmtpd sh -c "echo EHLO localhost | nc 127.0.0.1 2500"

Output:
220 localhost ESMTP msmtpd
250 localhost

However, Docker reports repeated healthcheck timeouts and leaves nc/grep processes behind until the container is restarted.
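
To see what Docker itself records for the healthcheck, the health state can be queried directly. The following is a minimal sketch: the helper names `health_summary` and `count_timeouts` are made up for illustration, and only the `docker inspect --format` templates are standard Docker CLI usage.

```shell
# Print the health status and failing streak for a container.
# Requires the docker CLI; the function name is hypothetical.
health_summary() {
  docker inspect --format '{{.State.Health.Status}} {{.State.Health.FailingStreak}}' "$1"
}

# Count "exceeded timeout" entries in health-log JSON read from stdin.
# Hypothetical helper; it only does a text match, not real JSON parsing.
count_timeouts() {
  grep -o 'Health check exceeded timeout' | wc -l | tr -d '[:space:]'
}

# Usage on the Docker host (not executed here):
#   health_summary librenms_msmtpd
#   docker inspect --format '{{json .State.Health}}' librenms_msmtpd | count_timeouts
```

Docker keeps only the most recent healthcheck results in this log, so the count is a recent sample rather than a total; `FailingStreak` is the better indicator of how long the check has been failing.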

Some logs:
ps -fp $(pgrep -d, nc | head -n1)
UID          PID    PPID  C STIME TTY          TIME CMD
root           6       2  0 Jun22 ?        00:00:00 [kworker/R-sync_wq]
root          84       2  0 Jun22 ?        00:00:00 [vchiq-sync/0]
root      487136    2115  0 20:51 ?        00:00:00 nc 127.0.0.1 2500
root      487251    2115  0 20:51 ?        00:00:00 nc 127.0.0.1 2500
root      487523    2115  0 20:52 ?        00:00:00 nc 127.0.0.1 2500
root      487693    2115  0 20:52 ?        00:00:00 nc 127.0.0.1 2500

docker ps -q | xargs docker inspect --format '{{.State.Pid}} {{.Name}}'
2115 /librenms_msmtpd
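
The two outputs above tie the leftover `nc` processes to the container: their PPID (2115) is the PID Docker reports for `librenms_msmtpd`. The kernel threads in the `ps` listing appear only because `pgrep nc` matches substrings such as "sync"; `pgrep -x nc` would match the exact name. A small sketch for counting the leftovers per container follows; the function name `count_nc_children` is hypothetical.

```shell
# Count processes named exactly "nc" whose parent PID equals $1.
# Reads "PID PPID COMM" rows on stdin, e.g. from: ps -eo pid=,ppid=,comm=
# (hypothetical helper, not part of any image or tool)
count_nc_children() {
  awk -v parent="$1" '$2 == parent && $3 == "nc" { n++ } END { print n + 0 }'
}

# Usage on the Docker host (not executed here):
#   pid=$(docker inspect --format '{{.State.Pid}}' librenms_msmtpd)
#   ps -eo pid=,ppid=,comm= | count_nc_children "$pid"
```

Running this periodically would show whether the count grows with each failed healthcheck interval.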

docker inspect librenms_msmtpd | grep -A10 -B5 -i health
"Pid": 2115,
"ExitCode": 0,
"Error": "",
"StartedAt": "2026-06-22T06:21:25.073620786Z",
"FinishedAt": "2026-06-22T06:21:18.872846142Z",
"Health": {
    "Status": "unhealthy",
    "FailingStreak": 3487,
    "Log": [
        {
            "Start": "2026-06-25T20:56:44.247729763+01:00",
            "End": "2026-06-25T20:56:49.361025818+01:00",
            "ExitCode": -1,
            "Output": "Health check exceeded timeout (5s)"
        },
        {
            "Start": "2026-06-25T20:56:59.362295911+01:00",
            "End": "2026-06-25T20:57:04.46256419+01:00",
            "ExitCode": -1,
            "Output": "Health check exceeded timeout (5s)"
        },
        {
            "Start": "2026-06-25T20:57:14.464451349+01:00",
            "End": "2026-06-25T20:57:19.572463971+01:00",
            "ExitCode": -1,
            "Output": "Health check exceeded timeout (5s)"
        },

I've disabled the health checks via the compose yaml file for now:
healthcheck:
  disable: true
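
As an alternative to disabling the check, a probe that bounds its own runtime might avoid leaving processes behind when Docker's 5s timeout fires. This is only a sketch under assumptions: the function name `smtp_probe` is hypothetical, it is not the image's actual healthcheck, and it assumes the container's `timeout` accepts a plain seconds argument and its `nc` supports `-w`. Sending `QUIT` is meant to make the server close the connection so `nc` can exit on its own.

```shell
# Probe an SMTP listener and succeed only if a 250 reply is seen.
# Hypothetical helper; `timeout` and `nc -w` bound how long nc can linger.
smtp_probe() {
  host=$1
  port=$2
  printf 'EHLO localhost\r\nQUIT\r\n' \
    | timeout 3 nc -w 2 "$host" "$port" 2>/dev/null \
    | grep -q '^250'
}

# Usage inside the container (not executed here):
#   smtp_probe 127.0.0.1 2500 || exit 1
```

The limits are deliberately shorter than the 5s healthcheck timeout shown in the inspect output, so the probe should give up by itself before Docker does.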

Expected behaviour

The healthcheck should not leave processes behind.

Actual behaviour

The healthcheck times out and leaves nc/grep processes running.

Steps to reproduce

No specific steps; the problem recurs a few days after each restart.

Docker info

docker info
Client: Docker Engine - Community
 Version:    29.1.3
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.30.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v5.0.0
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 28
  Running: 27
  Paused: 0
  Stopped: 1
 Images: 39
 Server Version: 29.1.3
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 CDI spec directories:
  /etc/cdi
  /var/run/cdi
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 1c4457e00facac03ce1d75f7b6777a7a851e5c41
 runc version: v1.3.4-0-gd6d73eb8
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.12.47+rpt-rpi-v8
 Operating System: Debian GNU/Linux 12 (bookworm)
 OSType: linux
 Architecture: aarch64
 CPUs: 4
 Total Memory: 7.637GiB
 Name: PI4-DC1
 ID: ec8204ce-8d3e-4ea2-bf4f-efe901fad221
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
  ::1/128
 Live Restore Enabled: false
 Firewall Backend: iptables

WARNING: No memory limit support
WARNING: No swap limit support

Docker Compose config

docker compose config
name: librenms
services:
  db:
    command:
      - mysqld
      - --innodb-file-per-table=1
      - --lower-case-table-names=0
      - --character-set-server=utf8mb4
      - --collation-server=utf8mb4_unicode_ci
    container_name: librenms_db
    environment:
      MARIADB_RANDOM_ROOT_PASSWORD: "yes"
      MYSQL_DATABASE: xxxxxx
      MYSQL_PASSWORD: xxxxxx
      MYSQL_USER: xxxxxx
      TZ: Europe/Paris
    image: mariadb:10
    networks:
      default: null
    restart: always
    volumes:
      - type: bind
        source: /home/tricky/docker/librenms/db
        target: /var/lib/mysql
        bind: {}
  dispatcher:
    cap_add:
      - NET_ADMIN
      - NET_RAW
    container_name: librenms_dispatcher
    depends_on:
      librenms:
        condition: service_started
        required: true
      redis:
        condition: service_started
        required: true
    environment:
      CACHE_DRIVER: redis
      DB_HOST: db
      DB_NAME: xxxxxxx
      DB_PASSWORD: xxxxxx
      DB_TIMEOUT: "60"
      DB_USER: xxxxxx
      DISPATCHER_NODE_ID: dispatcher1
      LIBRENMS_SNMP_COMMUNITY: librenmsdocker
      LIBRENMS_WEATHERMAP: "false"
      LIBRENMS_WEATHERMAP_SCHEDULE: '*/5 * * * *'
      LOG_IP_VAR: remote_addr
      MAX_INPUT_VARS: "1000"
      MEMORY_LIMIT: 256M
      OPCACHE_MEM_SIZE: "128"
      PGID: "1000"
      PUID: "1000"
      REAL_IP_FROM: 0.0.0.0/32
      REAL_IP_HEADER: X-Forwarded-For
      REDIS_HOST: redis
      SESSION_DRIVER: redis
      SIDECAR_DISPATCHER: "1"
      TZ: Europe/Paris
      UPLOAD_MAX_SIZE: 16M
    hostname: librenms-dispatcher
    image: librenms/librenms:latest
    networks:
      default: null
    restart: always
    volumes:
      - type: bind
        source: /home/tricky/docker/librenms
        target: /data
        bind: {}
  librenms:
    cap_add:
      - NET_ADMIN
      - NET_RAW
    container_name: librenms
    depends_on:
      db:
        condition: service_started
        required: true
      msmtpd:
        condition: service_started
        required: true
      redis:
        condition: service_started
        required: true
    environment:
      CACHE_DRIVER: redis
      DB_HOST: db
      DB_NAME: xxxxxx
      DB_PASSWORD: xxxxxxx
      DB_TIMEOUT: "60"
      DB_USER: xxxxxxx
      LIBRENMS_SNMP_COMMUNITY: librenmsdocker
      LIBRENMS_WEATHERMAP: "false"
      LIBRENMS_WEATHERMAP_SCHEDULE: '*/5 * * * *'
      LOG_IP_VAR: remote_addr
      MAX_INPUT_VARS: "1000"
      MEMORY_LIMIT: 256M
      OPCACHE_MEM_SIZE: "128"
      PGID: "1000"
      PUID: "1000"
      REAL_IP_FROM: 0.0.0.0/32
      REAL_IP_HEADER: X-Forwarded-For
      REDIS_HOST: redis
      SESSION_DRIVER: redis
      TZ: Europe/Paris
      UPLOAD_MAX_SIZE: 16M
    hostname: librenms
    image: librenms/librenms:latest
    networks:
      default: null
    ports:
      - mode: ingress
        target: 8000
        published: "8011"
        protocol: tcp
    restart: always
    volumes:
      - type: bind
        source: /home/tricky/docker/librenms
        target: /data
        bind: {}
  msmtpd:
    container_name: librenms_msmtpd
    environment:
      SMTP_AUTH: "on"
      SMTP_FROM: foo@gmail.com
      SMTP_HOST: smtp.gmail.com
      SMTP_PASSWORD: bar
      SMTP_PORT: "587"
      SMTP_STARTTLS: "on"
      SMTP_TLS: "on"
      SMTP_TLS_CHECKCERT: "on"
      SMTP_USER: foo
    healthcheck:
      disable: true
    image: crazymax/msmtpd:latest
    networks:
      default: null
    restart: always
  redis:
    container_name: librenms_redis
    environment:
      TZ: Europe/Paris
    image: redis:7.2-alpine
    networks:
      default: null
    restart: always
  snmptrapd:
    cap_add:
      - NET_ADMIN
      - NET_RAW
    container_name: librenms_snmptrapd
    depends_on:
      librenms:
        condition: service_started
        required: true
      redis:
        condition: service_started
        required: true
    environment:
      CACHE_DRIVER: redis
      DB_HOST: db
      DB_NAME: xxxxxx
      DB_PASSWORD: xxxxxxx
      DB_TIMEOUT: "60"
      DB_USER: xxxxxxx
      LIBRENMS_SNMP_COMMUNITY: librenmsdocker
      LIBRENMS_WEATHERMAP: "false"
      LIBRENMS_WEATHERMAP_SCHEDULE: '*/5 * * * *'
      LOG_IP_VAR: remote_addr
      MAX_INPUT_VARS: "1000"
      MEMORY_LIMIT: 256M
      OPCACHE_MEM_SIZE: "128"
      PGID: "1000"
      PUID: "1000"
      REAL_IP_FROM: 0.0.0.0/32
      REAL_IP_HEADER: X-Forwarded-For
      REDIS_HOST: redis
      SESSION_DRIVER: redis
      SIDECAR_SNMPTRAPD: "1"
      TZ: Europe/Paris
      UPLOAD_MAX_SIZE: 16M
    hostname: librenms-snmptrapd
    image: librenms/librenms:latest
    networks:
      default: null
    ports:
      - mode: ingress
        target: 162
        published: "162"
        protocol: tcp
      - mode: ingress
        target: 162
        published: "162"
        protocol: udp
    restart: always
    volumes:
      - type: bind
        source: /home/tricky/docker/librenms
        target: /data
        bind: {}
  syslogng:
    cap_add:
      - NET_ADMIN
      - NET_RAW
    container_name: librenms_syslogng
    depends_on:
      librenms:
        condition: service_started
        required: true
      redis:
        condition: service_started
        required: true
    environment:
      CACHE_DRIVER: redis
      DB_HOST: db
      DB_NAME: xxxxxx
      DB_PASSWORD: xxxxxx
      DB_TIMEOUT: "60"
      DB_USER: xxxxxx
      LIBRENMS_SNMP_COMMUNITY: librenmsdocker
      LIBRENMS_WEATHERMAP: "false"
      LIBRENMS_WEATHERMAP_SCHEDULE: '*/5 * * * *'
      LOG_IP_VAR: remote_addr
      MAX_INPUT_VARS: "1000"
      MEMORY_LIMIT: 256M
      OPCACHE_MEM_SIZE: "128"
      PGID: "1000"
      PUID: "1000"
      REAL_IP_FROM: 0.0.0.0/32
      REAL_IP_HEADER: X-Forwarded-For
      REDIS_HOST: redis
      SESSION_DRIVER: redis
      SIDECAR_SYSLOGNG: "1"
      TZ: Europe/Paris
      UPLOAD_MAX_SIZE: 16M
    hostname: librenms-syslogng
    image: librenms/librenms:latest
    networks:
      default: null
    ports:
      - mode: ingress
        target: 514
        published: "514"
        protocol: tcp
      - mode: ingress
        target: 514
        published: "514"
        protocol: udp
    restart: always
    volumes:
      - type: bind
        source: /home/tricky/docker/librenms
        target: /data
        bind: {}
networks:
  default:
    name: librenms_default

Logs

None

Additional info

Image

This is the process-count graph after rebooting the Pi following the issue early on Monday morning, along with the graph for the last year:

Image
