Thread (2 messages), 2 authors, 2011-11-16

Re: possible bug - bitmap dirty pages status

From: CoolCold <hidden>
Date: 2011-11-16 09:36:02

On Wed, Nov 16, 2011 at 7:07 AM, NeilBrown [off-list ref] wrote:
On Wed, 16 Nov 2011 03:13:51 +0400 CoolCold [off-list ref] wrote:
quoted
As promised, I was collecting data, but I forgot to come back to this
problem; the thread bump reminded me ;)
So, data was collected for almost a month, from 31 August to 26 September:
root@gamma2:/root# grep -A 1 dirty component_examine.txt |head
          Bitmap : 44054 bits (chunks), 190 dirty (0.4%)
Wed Aug 31 17:32:16 MSD 2011

root@gamma2:/root# grep -A 1 dirty component_examine.txt |tail -n 2
          Bitmap : 44054 bits (chunks), 1 dirty (0.0%)
Mon Sep 26 00:28:33 MSD 2011

As far as I can tell from that dump, it is the bitmap examination (-X key)
of component /dev/sdc3 of raid /dev/md3.
The count did decrease, though only after some increase on 23 September, and
the first decrease to 0 happened on 24 September (line number 436418).
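
For anyone who wants to pull the dirty-count history out of a log like this, here is a minimal sketch in Python. It assumes the layout seen in the grep output above: a "Bitmap :" line from mdadm -X followed directly by a timestamp line. The function names dirty_history and first_drop_to are made up for this illustration.

```python
import re

# Matches lines such as "Bitmap : 44054 bits (chunks), 190 dirty (0.4%)".
BITMAP_RE = re.compile(r"Bitmap\s*:\s*(\d+) bits \(chunks\),\s*(\d+) dirty")


def dirty_history(lines):
    """Yield (timestamp_text, total_bits, dirty_bits) for each Bitmap line.

    Assumption: the line right after each Bitmap line is the timestamp
    of that sample, as in the grep -A 1 output quoted above.
    """
    lines = iter(lines)
    for line in lines:
        match = BITMAP_RE.search(line)
        if match is None:
            continue  # blank lines, "--" separators, other mdadm output
        stamp = next(lines, "").strip()
        yield stamp, int(match.group(1)), int(match.group(2))


def first_drop_to(samples, target=0):
    """Return the timestamp of the first sample whose dirty count equals target."""
    for stamp, _total, dirty in samples:
        if dirty == target:
            return stamp
    return None


# The two samples quoted above, used here as stand-in input.
SAMPLE = [
    "          Bitmap : 44054 bits (chunks), 190 dirty (0.4%)",
    "Wed Aug 31 17:32:16 MSD 2011",
    "--",
    "          Bitmap : 44054 bits (chunks), 1 dirty (0.0%)",
    "Mon Sep 26 00:28:33 MSD 2011",
]

for stamp, total, dirty in dirty_history(SAMPLE):
    print(stamp, total, dirty)
```

On the real log, something like dirty_history(open("component_examine.txt")) should work the same way, provided the file really keeps the timestamp on the line after each Bitmap line.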

So for almost a month, the dirty count was not decreasing!
I'm attaching the log; maybe it will help somehow.
Thanks a lot.
Any idea what happened on Fri Sep 23?
Between 6:23am and midnight the number of dirty bits dropped from 180 to 2.
I have no idea, sorry. Log rotation is scheduled in cron for 6:25 am, but
nothing specific runs at 6:23.

But the changes (dirty count increase) began at 2:30 AM, which
corresponds to a cron-run script that does data import and
database update; the database lives on that LVMed md array.
This does seem to suggest that md is just losing track of some of the pages
of bits and once they are modified again md remembers to flush them and write
them out - which is a fairly safe way to fail.

The one issue I have found is that set_page_attr uses a non-atomic __set_bit
because it should always be called under a spinlock.  But bitmap_write_all()
- which is called when a spare is added - calls it without the spinlock so
that could corrupt some of the bits.
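
To illustrate why a non-atomic bit set without the lock can lose bits, here is a toy model in Python. It is not the kernel code: it only assumes that a non-atomic set is a separate load, modify and store on a shared word, and the names set_bit_steps, interleaved and serialized are invented for the sketch.

```python
def set_bit_steps(word, bit):
    """Model a non-atomic bit set as three separate steps on word[0]."""
    value = word[0]        # step 1: load the shared word
    yield
    value |= 1 << bit      # step 2: set the bit in the private copy
    yield
    word[0] = value        # step 3: store the copy back


def interleaved(bits):
    """Run all setters round-robin, one step at a time (no lock held)."""
    word = [0]
    pending = [set_bit_steps(word, b) for b in bits]
    while pending:
        for stepper in list(pending):
            try:
                next(stepper)
            except StopIteration:
                pending.remove(stepper)
    return word[0]


def serialized(bits):
    """Run each setter to completion before the next (as if under a lock)."""
    word = [0]
    for b in bits:
        for _ in set_bit_steps(word, b):
            pass
    return word[0]


print(bin(interleaved([0, 1])))  # 0b10: bit 0 was overwritten by the second store
print(bin(serialized([0, 1])))   # 0b11: both bits survive
```

In the interleaved run both setters load the old word before either stores, so the later store discards the earlier bit. That is the kind of corruption an unlocked caller could cause.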

Thanks,
NeilBrown


-- 
Best regards,
[COOLCOLD-RIPN]
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html