Thread (6 messages), 2 authors, 2015-08-03

Re: crash: kernel bug in bch_generic_make_request

From: Jens-U. Mozdzen <hidden>
Date: 2015-08-03 10:02:25

When it rains, it pours.

Quoting "Jens-U. Mozdzen" [off-list ref]:
> Quoting "Jens-U. Mozdzen" [off-list ref]:
>> Hi everybody,
>>
>> we experience reproducible server crashes (reboots) when creating
>> new file systems on bcache'd devices.
> update: If I set cache_mode to "writethrough", I can successfully
> create the file system.

Over the weekend, I received multiple successive reports about disks
failing in the RAID, on both servers. This is a rather new hardware
setup; no disk is older than 6 months, so I did not actually believe
failing disks to be the reason.

In syslog I noticed reports from the upper layers (SCST, which is
using files on the ext4->DRBD->LVM->bcache->MD-RAID chain, indicating
stalling disk access; DRBD reporting stalling updates from the remote
server), both indicating some kind of locking condition inside the
kernel. The RAID failures occurred right after these incidents.

With identical workload, but having turned caching back to  
"writeback", no more "RAID failures" were reported.

We received end-user reports that system access sometimes hangs for a  
few seconds, which now makes me believe that the actual cause lies  
somewhere within the software stack on our servers, most probably  
within the bcache layer (as no such problems were reported before  
introducing bcache to our setup).
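
For anyone comparing setups, here is a minimal sketch of how the active
cache mode could be read and switched from a script. It assumes the usual
bcache sysfs attribute at /sys/block/<dev>/bcache/cache_mode, whose content
lists all modes with the active one in square brackets. The device name
"bcache0" and all function names are placeholders, not taken from the
servers described above.

```python
from pathlib import Path

# Assumption: "bcache0" is a placeholder; substitute the real bcache device.
CACHE_MODE_PATH = Path("/sys/block/bcache0/bcache/cache_mode")


def parse_cache_mode(text: str) -> str:
    """Return the active mode from a line such as
    'writethrough [writeback] writearound none'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode marked in {text!r}")


def current_cache_mode(path: Path = CACHE_MODE_PATH) -> str:
    """Read the sysfs attribute and return the active mode."""
    return parse_cache_mode(path.read_text())


def set_cache_mode(mode: str, path: Path = CACHE_MODE_PATH) -> None:
    """Write a new mode to the sysfs attribute (needs root)."""
    path.write_text(mode + "\n")
```

Logging current_cache_mode() together with a timestamp before and after
each switch would make it easier to line up mode changes with the stalls
seen in syslog.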

Is there anyone out there that is running bcache on MD-RAIDs (both for  
data and cache device) with significant I/O volume, successfully? I'd  
like to compare setups then :)

Regards,
Jens