Repro: https://gist.github.com/oleg-vinted/97ee727e8d0b1050f40fc3ec9f940281
Even with the process group change, in some cases processes can still be orphaned.
In that repro, the process hierarchy is `overman -> service -> stubborn-process -> sleep`.
What I think is happening:
- On Ctrl-C, SIGINT is sent only to `overman`, because its children are running in a different process group.
- `overman` sends SIGTERM to the child process group correctly, impacting `service`, `stubborn-process`, and `sleep`.
- `service` and `sleep` terminate upon receiving SIGTERM.
- `stubborn-process` ignores SIGTERM, spawns another instance of `sleep`, and keeps running.
- `overman` sees that its immediate child (`service`) has exited and removes its PID from the running list: through here, here and here.
- `overman` thinks its children have exited, so it does not send SIGKILL to the process group, even though SIGKILL is still necessary because `stubborn-process` is still running. (Edit: not true, sending a SIGKILL to the process group doesn't work at this point. We need some other way to detect if children have exited.)
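
One possible way to detect whether anything in the child process group is still running is to probe the group with signal 0, which checks for existence without delivering a signal. The sketch below is in Python only for illustration; it is not overman's code, and `group_alive` and `wait_for_group_exit` are hypothetical helper names. It assumes the group ID is still known and that `stubborn-process` has not moved itself into a different process group. I have not verified that this resolves the case described in the edit above.

```python
import os
import time


def group_alive(pgid: int) -> bool:
    """Return True if at least one process still belongs to process group pgid."""
    try:
        # Signal 0 performs only the existence and permission checks;
        # nothing is delivered to the processes in the group.
        os.killpg(pgid, 0)
    except ProcessLookupError:
        # ESRCH: no process is left in this group.
        return False
    except PermissionError:
        # EPERM: the group exists, but we are not allowed to signal it.
        return True
    return True


def wait_for_group_exit(pgid: int, timeout: float, poll: float = 0.05) -> bool:
    """Poll until the group is empty. Return False if the timeout expires first."""
    deadline = time.monotonic() + timeout
    while group_alive(pgid):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True
```

With something like this, the decision to escalate to SIGKILL could be based on `wait_for_group_exit` returning `False` instead of on the immediate child's PID leaving the running list. One caveat: a direct child that has exited but has not been reaped yet may still count as a group member, so the supervisor has to keep waiting on its own children while polling.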