Re: [PATCH 8/8] xfs: Fix CIL throttle hang when CIL space used going backwards
From: Dave Chinner <david@fromorbit.com>
Date: 2021-02-24 22:05:54
On Wed, Feb 24, 2021 at 01:18:10PM -0800, Darrick J. Wong wrote:
On Tue, Feb 23, 2021 at 02:34:42PM +1100, Dave Chinner wrote:
From: Dave Chinner <redacted>

A hang with tasks stuck on the CIL hard throttle was reported and largely diagnosed by Donald Buczek, who discovered that it was a result of the CIL context space usage decrementing in committed transactions once the hard throttle limit had been hit and processes were already blocked. This resulted in the CIL push not waking up those waiters because the CIL context was no longer over the hard throttle limit.

The surprising aspect of this was the CIL space usage going backwards regularly enough to trigger this situation. Assumptions had been made in design that the relogging process would only increase the size of the objects in the CIL, and so that space would only increase.

This change and commit message fixes the issue and documents the result of an audit of the triggers that can cause the CIL space to go backwards, how large the backwards steps tend to be, the frequency in which they occur, and what the impact on the CIL accounting code is.
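The failure mode described in the commit message can be modelled in a few lines of C. This is only a sketch for illustration, not the kernel code: `struct cil_model`, `model_commit()`, `model_push_buggy()`, `model_push_fixed()` and `model_scenario()` are hypothetical names, the limit of 100 and the delta of -8 are made-up numbers, and sleeping tasks are reduced to a counter. The "fixed" variant shows one plausible shape of a fix (wake whenever anyone is waiting); the message above does not spell out the actual change, so treat that part as an assumption.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of a CIL context; not the kernel structure. */
struct cil_model {
	int space_used;	/* space accounted to the current context */
	int hard_limit;	/* hard (blocking) throttle limit */
	int nr_waiters;	/* tasks asleep on the hard throttle */
};

/*
 * A commit adjusts the accounted space by delta. The delta can be
 * negative when a relogged object shrinks. A committer that finds the
 * context at or over the hard limit goes to sleep on the throttle.
 */
static void model_commit(struct cil_model *cil, int delta)
{
	cil->space_used += delta;
	if (cil->space_used >= cil->hard_limit)
		cil->nr_waiters++;
}

/* Buggy push: wakes waiters only if the context is still over the limit. */
static void model_push_buggy(struct cil_model *cil)
{
	if (cil->space_used >= cil->hard_limit)
		cil->nr_waiters = 0;
	cil->space_used = 0;	/* a new, empty context starts here */
}

/* Assumed fix: wake whenever there are waiters, whatever the space says. */
static void model_push_fixed(struct cil_model *cil)
{
	if (cil->nr_waiters > 0)
		cil->nr_waiters = 0;
	cil->space_used = 0;
}

/* Returns the number of tasks still blocked after the push. */
static int model_scenario(bool fixed)
{
	struct cil_model cil = { .space_used = 0, .hard_limit = 100, .nr_waiters = 0 };

	model_commit(&cil, 100);	/* hits the limit: one task blocks */
	model_commit(&cil, -8);		/* usage goes backwards, below the limit */

	if (fixed)
		model_push_fixed(&cil);
	else
		model_push_buggy(&cil);
	return cil.nr_waiters;
}
```

In this model, `model_scenario(false)` leaves the blocked task asleep because the push sees usage below the limit, while `model_scenario(true)` wakes it.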
....
Does this whole series fix Donald's problem?
No, just this patch is needed to fix that problem.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com