https://testgrid.k8s.io/presubmits-node-problem-detector#pull-npd-e2e-test has recently started failing with the following errors:
[1] NPD should export Prometheus metrics. When OOM kills and docker hung happen
[1] NPD should update problem_counter and problem_gauge
[1] /home/prow/go/src/k8s.io/node-problem-detector/test/e2e/metriconly/metrics_test.go:158
[2] error dialing prow@35.184.209.153:22: 'ssh: handshake failed: read tcp 10.32.2.7:54804->35.184.209.153:22: read: connection reset by peer', retrying
[2] error dialing prow@35.184.209.153:22: 'ssh: handshake failed: read tcp 10.32.2.7:52980->35.184.209.153:22: read: connection reset by peer', retrying
[2] error dialing prow@35.184.209.153:22: 'ssh: handshake failed: read tcp 10.32.2.7:53002->35.184.209.153:22: read: connection reset by peer', retrying
[2] error dialing prow@35.184.209.153:22: 'ssh: handshake failed: read tcp 10.32.2.7:44696->35.184.209.153:22: read: connection reset by peer', retrying
[2] Error storing debugging data to test artifacts: [Error running command: {prow 35.184.209.153 curl http://localhost:20257/metrics 0 error getting SSH client to prow@35.184.209.153:22: 'ssh: handshake failed: read tcp 10.32.2.7:52990->35.184.209.153:22: read: connection reset by peer'}
[2] Error running command: {prow 35.184.209.153 sudo journalctl -u node-problem-detector.service 0 error getting SSH client to prow@35.184.209.153:22: 'ssh: handshake failed: read tcp 10.32.2.7:44688->35.184.209.153:22: read: connection reset by peer'}
[2] Error running command: {prow 35.184.209.153 sudo journalctl -k 0 error getting SSH client to prow@35.184.209.153:22: 'ssh: handshake failed: read tcp 10.32.2.7:44708->35.184.209.153:22: read: connection reset by peer'}
[2] ]
This is affecting several different PRs: #955, #961, #969.