Thread: 25 messages, 11 authors, 2006-11-29

Re: 2.6.19-rc6 : Spontaneous reboots, stack overflows - seems to implicate xfs, scsi, networking, SMP

From: Jesper Juhl <hidden>
Date: 2006-11-21 23:51:46
Also in: linux-scsi, lkml

On 22/11/06, David Chinner [off-list ref] wrote:
On Tue, Nov 21, 2006 at 11:02:23PM +0100, Jesper Juhl wrote:
quoted
On 21/11/06, David Chatterton [off-list ref] wrote:
...
quoted
quoted
Audits have been done in the past and will again be done in the future to
try to identify areas where XFS could use less stack space by
reducing/avoiding large local variables. Reducing the code path is far
more difficult.
I realize that fixing the problem may be difficult. I just wanted to
make sure that people were informed that there is an actual problem
and provide as much info as possible so that perhaps in the future it
can be fixed... :)
I've got one that prevents gcc from inlining single use functions in XFS
that I need to finish off, and that results in some significant stack
usage reductions in some XFS functions.
That sounds good. I'll be keeping an eye out for that one :)
However, XFS is only one part of the picture - when you put NFS on top,
DM+md then scsi/FC below and then you nest a soft irq that might go
20 functions deep as well - then 4k stacks simply aren't big enough.
True, there are a lot of players involved here, although XFS seems (to
me) to be the biggest one.
quoted
I'm reading through the XFS code myself at the moment and I'll be sure
to submit patches if I spot something that could help reduce stack
usage.
Most of the low hanging fruit is already gone. The problem we are
facing now for further reductions in stack usage is the fact that we
need to factor code. That is a major undertaking and has a _lot_ of
risk associated with it....
I'll try to spot some of the remaining low hanging fruit ;)

quoted
quoted
There is active discussion about reducing inlining:
http://bugzilla.kernel.org/show_bug.cgi?id=7364
Thanks, I'll check that out.
That's one of the few remaining low hanging fruit, and that's fixed
in the patches I already have.
Nice. Will be good to get that in.

quoted
quoted
Thanks for traces, I've captured this information.
You are welcome. If you want/need more traces then I've got ~2.1G
worth of traces that you can have :)
Well, we don't need that many, but it would be nice to have a
set of unique traces that lead to overflows - could you process
them in some way to extract just the unique XFS traces that
occur?
I'll try to extract a copy of each unique trace that involves xfs,
sometime tomorrow or the day after, and then send you the result.


-- 
Jesper Juhl [off-list ref]
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please      http://www.expita.com/nomime.html