xfs: push buffer of flush locked dquot to avoid quotacheck deadlock
Reclaim during quotacheck can lead to deadlocks on the dquot flush lock: - Quotacheck populates a local delwri queue with the physical dquot buffers. - Quotacheck performs the xfs_qm_dqusage_adjust() bulkstat and dirties all of the dquots. - Reclaim kicks in and attempts to flush a dquot whose buffer is already queud on the quotacheck queue. The flush succeeds but queueing to the reclaim delwri queue fails as the backing buffer is already queued. The flush unlock is now deferred to I/O completion of the buffer from the quotacheck queue. - The dqadjust bulkstat continues and dirties the recently flushed dquot once again. - Quotacheck proceeds to the xfs_qm_flush_one() walk which requires the flush lock to update the backing buffers with the in-core recalculated values. It deadlocks on the redirtied dquot as the flush lock was already acquired by reclaim, but the buffer resides on the local delwri queue which isn't submitted until the end of quotacheck. This is reproduced by running quotacheck on a filesystem with a couple million inodes in low memory (512MB-1GB) situations. This is a regression as of commit 43ff2122 ("xfs: on-stack delayed write buffer lists"), which removed a trylock and buffer I/O submission from the quotacheck dquot flush sequence. Quotacheck first resets and collects the physical dquot buffers in a delwri queue. Then, it traverses the filesystem inodes via bulkstat, updates the in-core dquots, flushes the corrected dquots to the backing buffers and finally submits the delwri queue for I/O. Since the backing buffers are queued across the entire quotacheck operation, dquot reclaim cannot possibly complete a dquot flush before quotacheck completes. Therefore, quotacheck must submit the buffer for I/O in order to cycle the flush lock and flush the dirty in-core dquot to the buffer. Add a delwri queue buffer push mechanism to submit an individual buffer for I/O without losing the delwri queue status and use it from quotacheck to avoid the deadlock. This restores quotacheck behavior to as before the regression was introduced. Reported-by:Martin Svec <martin.svec@zoner.cz> Signed-off-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
Loading
Please register or sign in to comment
