Thread: 2 messages, 2 authors, 2009-06-29

Re: kerneloops.org report for the week

From: Ingo Molnar <hidden>
Date: 2009-06-29 03:18:47
Also in: lkml

* Arjan van de Ven [off-list ref] wrote:
Few "highlights" this week
* mem_cgroup_add_lru_list (rank 2) is a high rising issue;
  it's list corruption, question is why this is new
* rank 13 (memcmp in the raid code) is also new
* the warning in get_free_pages that has been discussed on lkml is dropping
  from the ranks again


This week, a total of 15273 oopses and warnings have been reported,
compared to 13384 reports in the previous week.


Rank 2: mem_cgroup_add_lru_list (warn)
	Reported 1554 times (1622 total reports)
	List corruption in the VM code
	This oops was last seen in version 2.6.30-git19, and first seen in 2.6.29.
	More info: http://www.kerneloops.org/searchweek.php?search=mem_cgroup_add_lru_list

At least one list corruption bug was fixed by:

   cb4cbcf: mm: fix incorrect page removal from LRU

Rank 3: getnstimeofday (warning)
	Reported 1319 times (4893 total reports)
	[suspend resume] getnstimeofday() is called before timekeeping is resumed
	This oops was last seen in version 2.6.30, and first seen in 2.6.24.
	More info: http://www.kerneloops.org/searchweek.php?search=getnstimeofday

Probably caused by some buggy driver callback?

Rank 7: hres_timers_resume (warning)
	Reported 763 times (2368 total reports)
	[suspend resume] hres_timers_resume() is incorrectly called with interrupts on
	This warning was last seen in version 2.6.30, and first seen in 2.6.24.7.
	More info: http://www.kerneloops.org/searchweek.php?search=hres_timers_resume

This is probably a driver incorrectly enabling irqs in a resume
callback. This should be easier and more specific to debug with the
lockdep based annotation i suggested for the suspend code in various
mails.

Rank 8: generic_get_mtrr (warning)
	Reported 544 times (2061 total reports)
	BIOS bug where the MTRRs are not set up correctly
	This warning was last seen in version 2.6.30, and first seen in 2.6.25.3.
	More info: http://www.kerneloops.org/searchweek.php?search=generic_get_mtrr

I think this calls for enabling the x86 MTRR sanitizer by default -
500 out of 15000 reports suggests a significant proportion of Linux 
systems is affected by MTRR setup problems.

I.e. we should change:

config MTRR_SANITIZER_ENABLE_DEFAULT
        int "MTRR cleanup enable value (0-1)"
        range 0 1
        default "0"

To 'default "1"'. Any objections?
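
For illustration, a sketch of how the entry would read with that
change applied, assuming nothing else in the stanza is touched. Only
the lines quoted above are shown; any other lines of the entry are
omitted here:

config MTRR_SANITIZER_ENABLE_DEFAULT
        int "MTRR cleanup enable value (0-1)"
        range 0 1
        default "1"

Only the 'default' line differs from the current entry, so the "0-1"
range stays as it is and the cleanup can still be switched off by
setting the value back to 0 in a given configuration.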

If the MTRR sanitizer is enabled then i think the above warning in 
generic_get_mtrr() should never trigger.

	Ingo