Thread: 12 messages, 4 authors, 2016-10-05

Re: Frequent ext4 oopses with 4.4.0 on Intel NUC6i3SYB

From: Johannes Bauer <hidden>
Date: 2016-10-04 21:54:30
Also in: linux-mm

On 04.10.2016 22:17, Andrey Korolyov wrote:
> I'm super puzzled right now :-(
> Three strawman ideas off the top of my head, in order of increasing
> naivety:
> - the disk controller corrupts the DMA chunks themselves; this could be
> tested against a USB stick/SD card with the same fs, or by switching the
> disk controller to a legacy mode if possible, but the cascading failure
> shown previously would be rather unusual for this,
I'll check out if this is possible somehow tomorrow.
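One rough way to run that comparison is sketched here in Python: write a deterministic pattern to a file on the filesystem under test, re-read it several times, and count the passes whose SHA-256 differs from the digest computed at write time. The function names and the default file size are made up for illustration and are not from this thread. Note the assumption that re-reads actually hit the device: without dropping the page cache between passes (as root, `echo 3 > /proc/sys/vm/drop_caches`), repeated reads are likely served from RAM and would exercise memory rather than the disk controller.

```python
import hashlib
import os
import tempfile


def write_pattern(path, size_mib=64):
    """Write size_mib MiB of deterministic data; return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    state = b"seed"  # arbitrary starting value, hypothetical
    with open(path, "wb") as f:
        for _ in range(size_mib):
            # Chain hashes so every 1 MiB block differs but is reproducible.
            state = hashlib.sha256(state).digest()
            block = state * 32768  # 32 bytes * 32768 = 1 MiB
            f.write(block)
            digest.update(block)
        f.flush()
        os.fsync(f.fileno())  # push the data out to the device
    return digest.hexdigest()


def hash_file(path):
    """Re-read the file in 1 MiB chunks and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def count_mismatches(path, expected, passes=5):
    """Return how many of `passes` re-reads differ from the expected digest."""
    return sum(1 for _ in range(passes) if hash_file(path) != expected)
```

Calling `write_pattern()` on a path on the internal disk and again on a path on a USB stick, then `count_mismatches()` on each, would give a crude comparison; a non-zero count on one device only would point toward that storage path, while mismatches on both would point elsewhere.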
> - SMP could be partially broken in such a manner that it would cause
> overlapped accesses under certain conditions; this may be checked with
> 'nosmp',
Unfortunately not; the build still fails with 'nosmp':

  CC [M]  drivers/infiniband/core/multicast.o
  CC [M]  drivers/infiniband/core/mad.o
drivers/infiniband/core/mad.c: In function ‘ib_mad_port_close’:
drivers/infiniband/core/mad.c:3252:1: internal compiler error: Bus error
 }
 ^

nuc [~]: cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-4.8.0 root=UUID=f6a792b3-3027-4293-a118-f0df1de9b25c ro ip=:::::eno1:dhcp nosmp
> - disk accesses and the corresponding power spikes are causing a partial
> undervoltage condition somewhere where bits flip relatively freely
> on paths without parity checking, though this could be attributed
> only to an onboard power distributor, not to the power source
> itself.
Huh, that sounds like "defective hardware" to me, doesn't it?

Cheers and thank you for your help,
Johannes