Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Commit 1f4c4503 authored by Christopher Tate's avatar Christopher Tate
Browse files

Fix deadlock when full data backup times out

The code was attempting to let a reported error in the app <-> engine
surface take precedence over apparent success at the engine <-> transport
handoff surface.  However, in the case of timeout, this is inappropriate.
It was leading to deadlock because the engine runs free, with socket-closed
as its shutdown signals for determinism.  In this case that means that
having accidentally asked it to finish and report the final result, we
locked up forever since the data it was writing dutifully to the engine
was no longer being consumed, and the actual teardown signals were never
sent.

The fix is to properly express the error-state hierarchy: only when the
engine <-> transport layer is not issuing its own abort is the app-data-
moving layer consulted about errors detected at that surface.

Bug 22348852

Change-Id: I8987be0c4f708116dfeb08098d7222241ed317f3
parent 001269ff
Loading
Loading
Loading
Loading
+22 −8
Original line number Diff line number Diff line
@@ -2918,9 +2918,15 @@ public class BackupManagerService {
                mBackupRunning = false;
                if (mStatus == BackupTransport.TRANSPORT_NOT_INITIALIZED) {
                    // Make sure we back up everything and perform the one-time init
                    clearMetadata();
                    if (MORE_DEBUG) Slog.d(TAG, "Server requires init; rerunning");
                    addBackupTrace("init required; rerunning");
                    try {
                        mPendingInits.add(mTransport.transportDirName());
                    } catch (Exception e) {
                        Slog.w(TAG, "Failed to query transport name heading for init", e);
                        // swallow it and proceed; we don't rely on this
                    }
                    clearMetadata();
                    backupNow();
                }
            }
@@ -4451,14 +4457,22 @@ public class BackupManagerService {
                            }
                        }

                        // TRANSPORT_ERROR here means that we've hit an error that the runner
                        // doesn't know about, so it's still moving data but we're pulling the
                        // rug out from under it.  Don't ask for its result:  we already know better
                        // and we'll hang if we block waiting for it, since it relies on us to
                        // read back the data it's writing into the engine.  Just proceed with
                        // a graceful failure.  The runner/engine mechanism will tear itself
                        // down cleanly when we close the pipes from this end.
                        if (backupPackageStatus != BackupTransport.TRANSPORT_ERROR) {
                            // We still could fail in backup runner thread, getting result from there.
                            int backupRunnerResult = backupRunner.getBackupResultBlocking();
                        if (backupPackageStatus != BackupTransport.TRANSPORT_ERROR
                                && backupRunnerResult != BackupTransport.TRANSPORT_OK) {
                            if (backupRunnerResult != BackupTransport.TRANSPORT_OK) {
                                // If there was an error in runner thread and
                                // not TRANSPORT_ERROR here, overwrite it.
                                backupPackageStatus = backupRunnerResult;
                            }
                        }

                        if (MORE_DEBUG) {
                            Slog.i(TAG, "Done trying to send backup data: result="