Re: [PATCH 0/2] btrfs: eliminate a deadlock when allocating system chunks and rework chunk allocation
From: Johannes Thumshirn <hidden>
Date: 2021-06-30 14:50:07
On 30/06/2021 15:12, David Sterba wrote:
> On Tue, Jun 29, 2021 at 02:43:04PM +0100, fdmanana@kernel.org wrote:
> > From: Filipe Manana <redacted>
> >
> > The first patch eliminates a deadlock when multiple tasks need to allocate a system chunk. It reverts a previous fix for a problem that resulted in exhausting the system chunk array and in a transaction abort when there are many tasks allocating chunks in parallel. Since there is not a simple and short fix for the deadlock that does not bring back the system array exhaustion problem, and the deadlock is relatively easy to trigger on zoned filesystems while the exhaustion problem is not so common, this first patch just reverts that previous fix.
> >
> > The second patch reworks a bit of the chunk allocation code so that we don't hold onto reserved system space from phase 1 to phase 2 of chunk allocation, which is what leads to system chunk array exhaustion when there's a bunch of tasks doing chunk allocations in parallel (initially observed on PowerPC, with a 64K node size, when running the fallocate tests from stress-ng). The diff of this patch is quite big, but about half of it is just comments.
>
> The description of the chunk allocation process is great, thanks. Patches added to misc-next.

I also have a first positive response from Shinichiro that he can't reproduce the hangs in a quick run. He'll probably respond with his 'Tested-by' once the complete tests are done.