Kill untracked process early

Untracked processes are processes whose process group is not led by
any service. Such processes can exist because a process belongs to the
process group of its parent by default, but can decide to join a
different process group (via setpgid) or even become the leader of a
new one.

Previously, such untracked processes were killed (by sysrq) quite late
in the shutdown sequence, after init had already spent a considerable
amount of time (5-6 sec) reaping the tracked processes. So, by the time
the untracked processes were killed, we were often very close to the
shutdown timeout or already past it. init then doesn't wait for these
processes to be completely reaped, and goes on to the next step of
unmounting partitions.

One consequence of this incomplete reaping is that the unmounts could
fail because some of the untracked processes may still have files open
on the partitions. init tries its best by retrying the unmount several
times, hoping that the processes will eventually be reaped. But once
the timeout expires, init has no choice but to give up and forcibly
shut the filesystem down, which can manifest as a ramdump.

This change fixes the issue by killing those untracked processes early;
specifically, along with the tracked services.

Bug: 433451565
Test: run a microdroid VM, and `setprop sys.powerctl reboot`.
Observe kernel log:

[ 1278.258925] init: Received sys.powerctl='reboot' from pid: 2918 (setprop)
[ 1278.259826] init: Starting RebootMonitorThread
...
[ 1278.378265] init: Stopping untracked process group 2915 by sending SIGTERM
[ 1278.378416] init: Stopping untracked process group 2913 by sending SIGTERM
[ 1278.378526] init: Stopping untracked process group 2851 by sending SIGTERM
[ 1278.378692] init: Stopping untracked process group 2847 by sending SIGTERM
...
[ 1278.477114] init: Untracked process (pid: 2851 name: (binder:2851_2) ppid: 1 pgrp: 2851 state: Z) received SIGTERM
[ 1278.477219] init: Untracked process (pid: 2851 name: (binder:2851_2) ppid: 1 pgrp: 2851 state: Z) did not have an associated service entry and will not be reaped
[ 1278.514514] init: Untracked process (pid: 2874 name: (crosvm_VmRunApp) ppid: 1 pgrp: 2851 state: Z) received SIGTERM
[ 1278.514675] init: Untracked process (pid: 2874 name: (crosvm_VmRunApp) ppid: 1 pgrp: 2851 state: Z) did not have an associated service entry and will not be reaped

See the time difference between the start of the reboot thread and the
killing of the crosvm process. Previously they were 5-6 sec apart. Now
the gap is ~100ms.

Flag: EXEMPT bug fix
Change-Id: Icb05171a71365acef0a692fe15ef78e62bc29d3b
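
For readers unfamiliar with how a process ends up in a group that no service leads, the following is a minimal sketch of the POSIX mechanism only. It is not init's code; the helper name `child_pgid` is hypothetical. It forks a child that either stays in its parent's process group or calls setpgid to become the leader of a new one, and reports the resulting group ID back through a pipe.

```python
import os


def child_pgid(detach: bool) -> tuple[int, int]:
    """Fork a child and return (child_pid, child_process_group_id).

    Hypothetical helper for illustration. If `detach` is true, the child
    calls setpgid(0, 0), which makes it the leader of a new process group
    whose ID equals its own PID.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: optionally leave the parent's group, then report our group.
        try:
            os.close(read_fd)
            if detach:
                os.setpgid(0, 0)
            os.write(write_fd, str(os.getpgrp()).encode())
        finally:
            os._exit(0)
    # Parent: read the child's report until EOF, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        reported = reader.read()
    os.waitpid(pid, 0)
    return pid, int(reported)


if __name__ == "__main__":
    pid, pgid = child_pgid(detach=False)
    print(f"inherited: pid={pid} pgrp={pgid} parent pgrp={os.getpgrp()}")
    pid, pgid = child_pgid(detach=True)
    print(f"detached:  pid={pid} pgrp={pgid}")
```

In the detached case the child's process group no longer matches that of the process that spawned it, which is the situation the message above calls untracked when the spawner is a service started by init.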