Thread (39 messages), 7 authors, 2018-06-18

Re: general protection fault in wb_workfn (2)

From: Dmitry Vyukov <dvyukov@google.com>
Date: 2018-06-08 16:53:59
Also in: linux-fsdevel, lkml

On Fri, Jun 8, 2018 at 5:16 PM, Dmitry Vyukov [off-list ref] wrote:
quoted
On Fri, Jun 8, 2018 at 4:31 AM, Tetsuo Handa
[off-list ref] wrote:
quoted
Dmitry Vyukov wrote:
quoted
On Tue, Jun 5, 2018 at 3:45 PM, Tetsuo Handa
[off-list ref] wrote:
quoted
Dmitry, can you assign VM resources for a git tree for this bug? This bug wants to fight
against https://github.com/google/syzkaller/blob/master/docs/syzbot.md#no-custom-patches ...
Hi Tetsuo,

Most of the reasons for not doing it still stand. A syzkaller instance
will produce not just this bug, it will produce hundreds of different
bugs. Then the question is: what to do with these bugs? Report all to
mailing lists?
Is it possible to add the linux-next.git tree as a target for fuzzing? If yes,
we can try debug patches easily, in addition to finding bugs earlier than now.
syzbot tested linux-next and mmotm initially, but they were removed at
the request of kernel developers. See:
https://groups.google.com/d/msg/syzkaller/0H0LHW_ayR8/dsK5qGB_AQAJ
and:
https://groups.google.com/d/msg/syzkaller-bugs/FeAgni6Atlk/U0JGoR0AAwAJ
Indeed, linux-next produces around 50 assorted one-off unexplainable
bug reports.

quoted
quoted
I think the solution here is just to run a syzkaller instance locally.
It's just a program; anybody can run it on any kernel with any custom
patches. Moreover, for a local instance it's also possible to limit the set
of tested syscalls to increase the probability of hitting this bug and at
the same time filter out most of the other bugs.
If this bug is reproducible with VM resources an individual developer can afford...

Since my Linux development environment is VMware guests on a Windows PC, I can't
run a VM instance that needs KVM acceleration. Also, due to security policy, I can't
utilize external VM resources available on the Internet, nor can I use the ssh
and git protocols. Speaking of this bug, even with a lot of VM instances, syzbot can
reproduce this bug only once or twice per day. Thus, the question for me boils
down to whether I can reproduce this bug using one VMware guest instance with 4GB
of memory. Effectively, I don't have access to environments for running a syzkaller
instance...
Well, I don't know what to say, it does require some resources.
quoted
quoted
Do we have any idea about the guilty subsystem? You mentioned
bdi_unregister, why? What would be the set of syscalls to concentrate
on?
I will do a custom run when I get around to it, if nobody else beats me to it.
Because bdi_unregister() does "bdi->dev = NULL;", and wb_workfn() is hitting
a NULL pointer dereference on it.
Right, wb_workfn is not a generic function, it's an fs-specific function.

Trying to reproduce this locally now.

No luck so far.

Trying to look from a different angle: is it possible that bdi->dev is
not set yet, rather than already reset?

I was able to reproduce this once locally by running the syz-crush utility
to replay one of the crash logs. Now running with Tetsuo's patch.

I can say we are hunting a very subtle race condition with a short
inconsistency window, perhaps a few instructions.
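
To make the two hypotheses concrete ("not set yet" versus "already reset"), here is a minimal userspace sketch in C. It is only a model under stated assumptions, not kernel code and not Tetsuo's patch: the model_* structs and functions and the "wb-" name prefix are hypothetical stand-ins invented for illustration. It shows that a reader of the pointer sees NULL in both states and cannot tell them apart, and that a worker which reads the pointer once and checks it avoids dereferencing NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; NOT the kernel's real definitions. */
struct model_device {
    const char *name;
};

struct model_bdi {
    /* NULL both before registration and after unregistration. */
    struct model_device *dev;
};

/* Models registration publishing the device pointer. */
void model_bdi_register(struct model_bdi *bdi, struct model_device *dev)
{
    bdi->dev = dev;
}

/* Models the "bdi->dev = NULL;" step quoted in this thread. */
void model_bdi_unregister(struct model_bdi *bdi)
{
    bdi->dev = NULL;
}

/*
 * Models a worker that wants the device name. The pointer is read once
 * into a local and checked before use. Returns 1 if a device name was
 * written to buf, 0 if the pointer was NULL (either state).
 */
int model_worker_desc(const struct model_bdi *bdi, char *buf, size_t len)
{
    const struct model_device *dev = bdi->dev; /* single read */

    assert(buf != NULL && len > 0);
    if (dev == NULL) {
        snprintf(buf, len, "wb-(no dev)");
        return 0;
    }
    snprintf(buf, len, "wb-%s", dev->name);
    return 1;
}
```

Calling model_worker_desc() on a freshly zeroed model_bdi and again after model_bdi_unregister() returns 0 in both cases, which is why the crash alone does not say which state was hit. The model is single-threaded, so it does not close the race itself: in real concurrent code the read would also need proper synchronization against the writer, and the object the pointer refers to would have to stay alive while it is used.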