Thread: 129 messages, 41 authors, 2007-11-19

Re: [BUG] New Kernel Bugs

From: Andrew Morton <akpm@linux-foundation.org>
Date: 2007-11-14 02:12:26
Also in: alsa-devel, linux-ide, lkml, netdev

On Tue, 13 Nov 2007 17:11:36 -0800 Stephen Hemminger [off-list ref] wrote:
On Tue, 13 Nov 2007 19:52:17 -0500
Chuck Ebbert [off-list ref] wrote:
On 11/13/2007 04:12 PM, Alan Cox wrote:
Bug fixing is not about finding someone to blame, it's about getting the 
bug fixed.
Partly - it's also about understanding why the bug occurred and making it
not happen again.
Very few people think about that part.
Why does the kernel have very few useful tests?
Tests would of course be nice, but they aren't very useful(!)

Looking at this list which Natalie has generated I see around thirty which
are dependent on the right hardware and ten which are not.  This ratio is
typical, I think.  In fact I'd say that more than 75% of reported bugs are
dependent on hardware.

So the best test of all for the kernel is "run it on a different machine". 
This is why we are sooooo dependent upon our volunteer testers/reporters to
be able to do kernel development.
 Lack of interest? resources? expertise?
Ideally each new feature would just be a small add on to an existing test.
Sure.  For system-call-visible features it would be good to do that.

But this tends not to be where bugs get exposed.  Because the original
developer can 100% exercise such code.  That isn't the case with
driver/arch/platform changes.
Unlike developing new features, which seems to grow well with more developers,
bug fixing also seems to be a scarcity process. There often seem to be
very few people who understand the problem well enough or have the necessary
hardware to reproduce and fix the problem.
We're 100% dead if "having the hardware" is a prerequisite to fixing a bug.
The terminal state there is that the kernel runs on about 200 machines
worldwide.  We have to work with reporters via email to fix these sorts of
things.  As we of course do.
Recent changes like tickless and scheduler rework were well thought out and caused
very little impact to 90% of the users. The problem is the 10% who do have problems.
Worse, the developers often only hear about a small sample of those.
Yes.  An unknown number of people just shrug and go back to an old kernel.