Thread: 13 messages, 2 authors, 2021-07-13

Re: [PATCH/rfc v2] NFS: introduce NFS namespaces.

From: NeilBrown <hidden>
Date: 2021-07-07 01:12:54

On Wed, 07 Jul 2021, Daire Byrne wrote:
On Sun, 4 Jul 2021 at 00:03, NeilBrown [off-list ref] wrote:
[  360.481824] ------------[ cut here ]------------
[  360.483141] kernel BUG at mm/slub.c:4205!

Thanks for testing!

I misunderstood the use of kfree_const().  It doesn't work for
constants in modules, only constants in vmlinux.  So I guess you built
nfs as a module.
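
As a rough illustration of why this only bites when nfs is a module: as
far as I understand it, kfree_const() skips the free only when the
pointer lies inside vmlinux's read-only data section, and a string
constant in a module lives outside that range, so it is handed to
kfree().  The sketch below is a toy model in Python, not kernel code;
the address ranges and the names Region, VMLINUX_RODATA, MODULE_RODATA
and kfree_const_action are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Half-open address range [start, end)."""
    start: int
    end: int

    def contains(self, addr: int) -> bool:
        return self.start <= addr < self.end


# Hypothetical addresses, chosen only to make the example concrete.
VMLINUX_RODATA = Region(0xFFFFFFFF82000000, 0xFFFFFFFF82800000)
MODULE_RODATA = Region(0xFFFFFFFFC0A00000, 0xFFFFFFFFC0A10000)


def kfree_const_action(addr: int, rodata: Region = VMLINUX_RODATA) -> str:
    """Model the decision: skip the free only for vmlinux rodata pointers."""
    return "skip" if rodata.contains(addr) else "kfree"


# A constant built into vmlinux is recognised and left alone.
builtin_const = VMLINUX_RODATA.start + 0x100
# A constant in a module's rodata is not recognised, so it would be freed.
module_const = MODULE_RODATA.start + 0x10

print(kfree_const_action(builtin_const))
print(kfree_const_action(module_const))
```

In this model the module constant takes the "kfree" path, which is the
kind of call that would trip a BUG in the slab allocator.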

This version should fix that.

Thanks,
NeilBrown

Yep, that was the issue and the latest patch certainly helped. I ran a
few load tests and everything seemed to be working fine.

However, once I tried mounting the same server again using a different
namespace, I got a different looking crash under moderate load. I am
pretty sure I applied your latest patch correctly, but I'll double
check. I should probably remove some of the other patches I have
applied too.

# mount -o vers=4.2 server:/srv/export /mnt/server1
# mount -o vers=4.2,namespace=server2 server:/srv/export /mnt/server2

[ 3626.638077] general protection fault, probably for non-canonical
address 0x375f656c6966ff00: 0000 [#1] SMP PTI
[ 3626.640538] CPU: 9 PID: 12053 Comm: ls Not tainted 5.13.0-1.dneg.x86_64 #1
[ 3626.642270] Hardware name: Red Hat dneg, BIOS
1.11.1-4.module_el8.2.0+320+13f867d7 04/01/2014
[ 3626.644443] RIP: 0010:__kmalloc_track_caller+0xfa/0x480
[ 3626.646138] Code: 65 4c 03 05 28 4d d5 69 49 83 78 10 00 4d 8b 20
0f 84 4c 03 00 00 4d 85 e4 0f 84 43 03 00 00 41 8b 47 28 49 8b 3f 48
8d 4a 01 <49> 8b 1c 04 4c 89 e0 65 48 0f c7 0f 0f 94 c0 84 c0 74 bb 41
8b 47
[ 3626.650253] RSP: 0018:ffffaadecf2afb90 EFLAGS: 00010206
[ 3626.651747] RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000003d41
[ 3626.653479] RDX: 0000000000003d40 RSI: 0000000000000cc0 RDI: 000000000002fbe0
[ 3626.655293] RBP: ffffaadecf2afbd0 R08: ffff985aabc6fbe0 R09: ffff985689c76b20
[ 3626.657034] R10: ffff9858408a0000 R11: ffff985966e69ec0 R12: 375f656c6966ff00
[ 3626.658794] R13: 0000000000000000 R14: 0000000000000cc0 R15: ffff985680042200

The Code: line above shows the crash happens at

  2a:*	49 8b 1c 04          	mov    (%r12,%rax,1),%rbx		<-- trapping instruction
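
For anyone who wants to check the disassembly by hand, here is a
minimal sketch that decodes the four bytes starting at the <49> marker.
It handles only this one instruction form (REX.W, opcode 8b, mod=00
with a SIB byte) and is not a general decoder; the helper name
decode_mov_load is made up for illustration.

```python
# 64-bit register names indexed by their 4-bit encoding.
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
        "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def decode_mov_load(code: bytes) -> str:
    """Decode 'REX.W 8B /r' with mod=00 and a SIB byte into AT&T syntax."""
    rex, opcode, modrm, sib = code
    if rex & 0xF8 != 0x48 or opcode != 0x8B:
        raise ValueError("not a REX.W MOV r64, r/m64")
    rex_r = (rex >> 2) & 1   # extends the ModRM.reg field
    rex_x = (rex >> 1) & 1   # extends the SIB.index field
    rex_b = rex & 1          # extends the SIB.base field

    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0 or rm != 4:
        raise ValueError("only mod=00 with a SIB byte is handled")

    scale = 1 << (sib >> 6)
    index = ((sib >> 3) & 7) | (rex_x << 3)
    base = (sib & 7) | (rex_b << 3)
    if index == 4 or (sib & 7) == 5:
        raise ValueError("no-index and disp32 forms are not handled")

    dest = REGS[reg | (rex_r << 3)]
    return f"mov (%{REGS[base]},%{REGS[index]},{scale}),%{dest}"


trapping = bytes.fromhex("49 8b 1c 04")
print(decode_mov_load(trapping))
```

The REX.B bit in 0x49 is what turns base field 100 into %r12, which is
why a bad value in %r12 faults here.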

and %r12 (which should be a memory address) is 375f656c6966ff00, which,
read as little-endian bytes, contains ASCII "file_7".
So my guess is that a file name was copied into a buffer that had
already been freed.
This could be caused by a malloc bug somewhere else, but as the crash
was in readdir code, and shows evidence of a file name, it seems likely
that the bug is nearby.  Do you have patches to anything that works
with file names?
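
The reading of %r12 above can be reproduced with a few lines of Python.
This is only a sketch: it assumes 48-bit virtual addresses (4-level
paging) for the canonical check, and the helper names is_canonical and
register_as_text are made up for illustration.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """True if bits 63 down to (va_bits - 1) are all equal."""
    top = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


def register_as_text(value: int) -> str:
    """Show a 64-bit register as the bytes it would occupy in memory.

    x86-64 is little-endian, so the low byte comes first.
    Non-printable bytes are shown as '.'.
    """
    raw = value.to_bytes(8, "little")
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


r12 = 0x375F656C6966FF00   # value of %r12 from the oops
r15 = 0xFFFF985680042200   # %r15 from the same oops, for comparison

print(is_canonical(r12), register_as_text(r12))
print(is_canonical(r15))
```

The first two bytes (00 ff) are not printable; the remaining six spell
file_7, and the value as a whole is not a canonical address, which
matches the general protection fault in the log.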

NeilBrown