Mysterious ENOSPC [was: XFS fallocate implementation incorrectly reports ENOSPC]
From: Chris Dunlop <hidden>
Date: 2021-08-28 00:21:45
On Sat, Aug 28, 2021 at 08:03:43AM +1000, Dave Chinner wrote:
> On Fri, Aug 27, 2021 at 04:53:47PM +1000, Chris Dunlop wrote:
>> On 8/25/21 9:06 PM, Chris Dunlop wrote:
>>> Background: I'm chasing a mysterious ENOSPC error on an XFS filesystem with way more space than the app should be asking for. There are no quotas on the fs. Unfortunately it's a third party app and I can't tell what sequence is producing the error, but this fallocate issue is a possibility.
>>
>> Oh, another reference: this is extensive reflinking happening on this filesystem.
>
> Ah. Details that are likely extremely important. The workload, layout problems and ephemeral ENOSPC symptoms match the description of the problem that was fixed by the series of commits that went into 5.13 that ended in this one:
>
> commit fd43cf600cf61c66ae0a1021aca2f636115c7fcb
> Author: Brian Foster [off-list ref]
> Date:   Wed Apr 28 15:06:05 2021 -0700
>
>     xfs: set aside allocation btree blocks from block reservation
Oh wow. Yes, sounds like a candidate. Is there some easy(-ish?) way of seeing if this fs is likely to be suffering from this particular issue, or is it a matter of installing an appropriate kernel and seeing if the problem goes away?

The job getting this ENOSPC error is one of 45 similar jobs, and it's the only one getting the error. There doesn't seem to be anything special about this job: its main file, where the writes are going, is the 9th largest (up to 1.8T), and it has a lot of extents (842G split into 750M extents) but not as many as some others (e.g. 809G split into 1G extents). That said, the app works in mysterious ways, so this particular job may be a special snowflake in some unobvious manner.

Cheers,

Chris