Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work

[PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Dai Ngo <dai.ngo@oracle.com> · 2023-01-10
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Jeff Layton <jlayton@kernel.org> · 2023-01-10
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · dai.ngo@oracle.com · 2023-01-10
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Chuck Lever III <chuck.lever@oracle.com> · 2023-01-10
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Jeff Layton <jlayton@kernel.org> · 2023-01-10
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · dai.ngo@oracle.com · 2023-01-10
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Jeff Layton <jlayton@kernel.org> · 2023-01-10
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · dai.ngo@oracle.com · 2023-01-10
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Mike Galbraith <hidden> · 2023-01-11
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Jeff Layton <jlayton@kernel.org> · 2023-01-11
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Jeff Layton <jlayton@kernel.org> · 2023-01-11
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Mike Galbraith <hidden> · 2023-01-11
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · dai.ngo@oracle.com · 2023-01-11
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Mike Galbraith <hidden> · 2023-01-11
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Jeff Layton <jlayton@kernel.org> · 2023-01-11
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Jeff Layton <jlayton@kernel.org> · 2023-01-11
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Mike Galbraith <hidden> · 2023-01-11
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Jeff Layton <jlayton@kernel.org> · 2023-01-11
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Mike Galbraith <hidden> · 2023-01-11
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Jeff Layton <jlayton@kernel.org> · 2023-01-11
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Jeff Layton <jlayton@kernel.org> · 2023-01-11
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · dai.ngo@oracle.com · 2023-01-10
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Chuck Lever III <chuck.lever@oracle.com> · 2023-01-10
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · dai.ngo@oracle.com · 2023-01-10
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Jeff Layton <jlayton@kernel.org> · 2023-01-10
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Jeff Layton <jlayton@kernel.org> · 2023-01-10
Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work · Chuck Lever III <chuck.lever@oracle.com> · 2023-01-10

From: Mike Galbraith <hidden>
Date: 2023-01-11 11:21:42

On Wed, 2023-01-11 at 05:55 -0500, Jeff Layton wrote:

> > crash> delayed_work ffff8881601fab48
> > struct delayed_work {
> >   work = {
> >     data = {
> >       counter = 1
> >     },
> >     entry = {
> >       next = 0x0,
> >       prev = 0x0
> >     },
> >     func = 0x0
> >   },
> >   timer = {
> >     entry = {
> >       next = 0x0,
> >       pprev = 0x0
> >     },
> >     expires = 0,
> >     function = 0x0,
> >     flags = 0
> >   },
> >   wq = 0x0,
> >   cpu = 0
> > }
>
> That looks more like a memory scribble or UAF. Merely having multiple
> tasks calling queue_work at the same time wouldn't be enough to trigger
> this, IMO. It's more likely that the extra locking is changing the
> timing of your reproducer somehow.
>
> It might be interesting to turn up KASAN if you're able.

I can try that.

> If you still have this vmcore, it might be interesting to do the pointer
> math and find the nfsd_net structure that contains the above
> delayed_work. Does the rest of it also seem to be corrupt? My guess is
> that the corrupted structure extends beyond just the delayed_work above.
>
> Also, it might be helpful to do this:
>
>      kmem -s ffff8881601fab48
>
> ...which should tell us whether and what part of the slab this object is
> now a part of. That said, net-namespace object allocations are somewhat
> weird, and I'm not 100% sure they come out of the slab.

I tossed the vmcore, but can generate another.  I had done kmem sans -s
previously, still have that.

crash> kmem ffff8881601fab48
CACHE             OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE  NAME
kmem: kmalloc-1k: partial list slab: ffffea0005b20c08 invalid page.inuse: -1
ffff888100041840     1024       2329      2432     76    32k  kmalloc-1k
  SLAB              MEMORY            NODE  TOTAL  ALLOCATED  FREE
  ffffea0005807e00  ffff8881601f8000     0     32         32     0
  FREE / [ALLOCATED]
  [ffff8881601fa800]

      PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
ffffea0005807e80 1601fa000 dead000000000400        0  0 200000000000000
crash
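
For readers following the "pointer math" suggestion, here is a minimal userspace sketch of the arithmetic, using only the numbers in the kmem output above (slab memory base ffff8881601f8000, object size 1024, delayed_work at ffff8881601fab48). The struct names `fake_delayed_work` and `fake_container` are hypothetical stand-ins, not the real kernel layouts, and whether the enclosing kmalloc-1k object really is the nfsd_net is an assumption the thread itself leaves open. The `container_of` macro here is a simplified version of the kernel idiom.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the kmem output in the message above. */
#define SLAB_BASE  0xffff8881601f8000ULL  /* MEMORY column of the slab */
#define OBJ_SIZE   1024ULL                /* OBJSIZE of kmalloc-1k */
#define DWORK_ADDR 0xffff8881601fab48ULL  /* address of the delayed_work */

/* Index of the fixed-size slab object that contains addr. */
static uint64_t slab_object_index(uint64_t base, uint64_t objsize, uint64_t addr)
{
    return (addr - base) / objsize;
}

/* Start address of the slab object that contains addr. */
static uint64_t slab_object_start(uint64_t base, uint64_t objsize, uint64_t addr)
{
    return base + slab_object_index(base, objsize, addr) * objsize;
}

/* Byte offset of addr inside its slab object. */
static uint64_t slab_object_offset(uint64_t base, uint64_t objsize, uint64_t addr)
{
    return addr - slab_object_start(base, objsize, addr);
}

/*
 * Hypothetical stand-ins for illustration only. The padding is chosen so
 * that the member sits at the offset computed from the kmem output; the
 * real structure layout must be read from the vmcore's debuginfo.
 */
struct fake_delayed_work {
    unsigned long data;
    void *func;
};

struct fake_container {
    char pad[0x348];
    struct fake_delayed_work dwork;
};

/* Simplified form of the kernel's container_of(): member pointer -> parent. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
```

With these inputs the delayed_work falls in object index 10, which starts at ffff8881601fa800 (the same address kmem reports as [ALLOCATED]), at offset 0x348 into that object. If the object is the containing structure, that offset is what `offsetof()` would yield for the delayed_work member, and subtracting it from the member address is the step `container_of()` performs.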
