SurfaceFlinger/LayerRenderArea: Hold lock while modifying hierarchy
Multiple stability reports from OEMs are reporting a crash in LayerRenderArea. In particular this occurs when destroying the screenshotParentLayer container layer used by LayerRenderArea. This code executes on the main thread without a lock. Our rules for locking in SurfaceFlinger are like this: 1. The main thread will hold the lock when writing to the state, and be the only writer 2. Anyone can read the state if they hold the lock 3. Main thread can read from the state without the lock since it is the only writer. LayerRenderArea's reparentForDrawing operation violates these rules by updating mDrawingParent without holding the lock. If there were some other binder thread which was accessing the hierarchy (say calling getAlpha on a layer in the hierarchy descending from our screenshot layer), we could have a sequence like this: 1. Other thread calls mDrawingParent.promote(), on the wp we provided constructing ReparentForDrawing 2. At the same time we destroy ReparentForDrawing and overwrite the wp 3. This causes wp::~wp and wp::promote to interleave 4. While the wp counters themselves are atomic, remember the object itself is not locked, and so after ~wp the counters that we think are stored in our ~wp could be anything and so we could promote without properly incremeting the reference ccount 5. When the sp returned from promote is destroyed, it could then destroy the layer, and when our original sp from LayerRenderArea is destroyed, we then crash (which is what we see in the traces). It's hard to say what this other thread is. Over time we have tried to move usage of the Layer hierarchy off the Binder threads to achieve a totally lock-free model. At the moment, I can't find spots in tip-of-tree which would race with mDrawingParent in this fashion. The reports are from S, and from OEMs only so it's possible it relates to an OEM modification, or an issue we have already fixed. Despite this we should have a consistent locking model, and until we are ready to get rid of mStateLock, apply it's usage consistently with our three rules of locking. Bug: 220176775 Bug: 223069308 Bug: 223081111 Change-Id: I89c5add585f96b7de1f1b21f76b6ea0c6b0de127
Loading
Please register or sign in to comment