Kill and reap all processes before unmounting
During shutdown, init stops services by sending SIGTERM or SIGKILL to the process groups they lead. By default, the child processes of a service belong to the service's process group, so stopping the service kills everything under it. However, this assumption does not always hold: a process can create a new process group by calling setpgid(3), and it can then outlive its parent (the service) during shutdown. Such processes are killed only at the very last moment, when init issues `echo i > /proc/sysrq-trigger`. To make things worse, init doesn't reap those rogue processes, presumably because it is on a somewhat emergency path.

As a result, the following catastrophic sequence can occur right before the kernel enters a reboot:

1. The rogue process has issued an I/O, and the kernel is in the process of handling it.
2. init sends SIGKILL (via `echo i > /proc/sysrq-trigger`), but because of #1 the process is not killed until the I/O is done.
3. init, without waiting for the I/O to complete, tries to unmount the partitions and fails (as expected).
4. init has no choice but to shut the underlying hardware down by issuing F2FS_IOC_SHUTDOWN, and then jumps to the kernel.
5. The kernel may still see #1 ongoing, e.g. a lock may still be held, ...

This change tries to prevent that sequence by ensuring that all processes, even those in new process groups, are killed and reaped before the partitions are unmounted.

Bug: 420528003
Test: follow the repro steps in the bug
Flag: EXEMPT bug fix
(cherry picked from https://googleplex-android-review.googlesource.com/q/commit:ae8ff5b26e40802d63705402cab38c7b3c40d3ac)
Merged-In: I2988dd17844e25900dacbda1c293d4e3c269eb12
Change-Id: I2988dd17844e25900dacbda1c293d4e3c269eb12