Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Skip to content
Commit c486e622 authored by Lorenzo Colitti's avatar Lorenzo Colitti
Browse files

Don't start iptables and ip6tables on parallel threads.

This still ensures that the child processes initialize in
parallel (which is the slow part of starting them, since
forkAndExec itself is sub-millisecond), but it does not fork them
in parallel.

Forking them in parallel is incorrect because forkAndExec first
creates pipes, then forks, then closes the pipes. If this is done
in parallel on two threads, the following can happen:

1. Create pipes to subprocess A.
2. Create pipes to subprocess B.
3. Fork process A.
4. Fork process B.
5. Close child end of pipes to subprocess A.
6. Close child end of pipes to subprocess B.

If this happens, then the subprocess B will inherit all the fds
of the pipes to subprocess A in addition to its own. This
includes the read end of the pipe that netd uses to send commands
to subprocess A, and the write ends of the pipes that netd uses
to receive subprocess A's stdout and stderr. Subprocess B never
reads or writes to these fds because it never dup2()s them to
stdin/stdout/stderr before calling exec(). However, it also never
closes them. Therefore, if process A dies (e.g., due to an
invalid command), then netd will never receive a POLLHUP on the
read ends or a POLLERR on the write end of these pipes.

This means that:

1. The invalid command to subprocess A will time out (imposing a
   5-second delay) instead of failing fast.
2. Even though the timeout causes us to call stop() on subprocess
   A, and stop() causes processTerminated to be set to true,
   sendCommand does not check processTerminated, it only checks
   outputReady. Therefore, when we get told to send another
   command to subprocess A, if subprocess B is still around,
   outputReady will return true. The result is that we do not
   fork a new subprocess to replace subprocess A.

These effects caused flakiness of TestRestartOnMalformedCommand
and a latency hit to the first invalid command sent after boot.

(cherry picked from commit f9bae84e)

Bug: 28362720
Bug: 64687442
Test: bullhead builds, boots
Test: netd_{unit,integration}_test pass
Test: TestRestartOnMalformedCommand passes 100 times in a row
Change-Id: I6d1bd4587a98732c2159bfc9bb66716aace3a1f8
Merged-In: I50bb3e987ee5da68812fc3348c8050f588402780
parent 83f6e26c
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please register or to comment