Re: [PATCH] irq: Add node_affinity CPU masks for smarter irqbalance hints
From: Peter Zijlstra <peterz@infradead.org>
Date: 2009-11-24 08:38:27
Also in: lkml
On Mon, 2009-11-23 at 15:32 -0800, Waskiewicz Jr, Peter P wrote:
> Unfortunately, a driver can't. The irq_set_affinity() function isn't exported. I proposed a patch on netdev to export it, and then to tie down an interrupt using IRQF_NOBALANCING, so irqbalance won't touch it. That was rejected, since the driver is enforcing policy of the interrupt balancing, not irqbalance.
Why would a patch touching the irq subsystem go to netdev? What is wrong with exporting irq_set_affinity(), and wtf do you need IRQF_NOBALANCING for?
> I and Jesse Brandeburg had a meeting with Arjan about this. What we came up with was this interface, so drivers can set what they'd like to see, if irqbalance decides to honor it. That way interrupt affinity policies are set only by irqbalance, but this interface gives us a mechanism to hint to irqbalance what we'd like it to do.
If all you want is to expose policy to userspace then you don't need any of this, simply expose the NIC's home node through a sysfs device thingy (I was under the impression it's already there somewhere, but I can't ever find anything in /sys). No need whatsoever to poke at the IRQ subsystem.
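For illustration only, here is a rough sketch of what the userspace side of that suggestion could look like: a balancer reading a NIC's home node from sysfs. It assumes the attribute is reachable as /sys/class/net/<iface>/device/numa_node; that path, the helper name nic_home_node() and the interface name are assumptions made for this sketch, not something confirmed in this thread.

```python
from pathlib import Path


def nic_home_node(iface, sysfs_root="/sys"):
    """Return the NUMA node of a network interface, or None if unknown.

    The attribute path below is an assumption for this sketch.
    """
    path = Path(sysfs_root) / "class" / "net" / iface / "device" / "numa_node"
    try:
        node = int(path.read_text().strip())
    except (OSError, ValueError):
        # Attribute missing, unreadable, or not a number.
        return None
    # A negative value is treated here as "no node associated".
    return node if node >= 0 else None


if __name__ == "__main__":
    # "eth0" is a placeholder interface name.
    print(nic_home_node("eth0"))
```

With something like this, the policy decision stays in userspace and the kernel only exports a fact about the hardware.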
> Also, if you use the /proc interface to change smp_affinity on an interrupt without any of these changes, irqbalance will override it on its next poll interval. This also is not desirable.
This all sounds backwards. We've got a perfectly functional interface for affinity -- which people object to being used for some reason. So you add another interface on top, and that is ok? All the while not CC'ing the IRQ folks. Brilliant approach.
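For reference, a minimal sketch of driving the existing affinity interface from userspace through the /proc smp_affinity file mentioned in the quoted text. The helper names and the IRQ number are hypothetical, and the mask layout (hex, comma-separated 32-bit words, most significant first) is what this sketch assumes the file accepts. As the quoted text notes, a running irqbalance may overwrite the value on its next poll.

```python
def cpus_to_smp_affinity(cpus):
    """Format CPU numbers as a hex bitmask string for smp_affinity."""
    cpus = list(cpus)
    if not cpus:
        raise ValueError("at least one CPU is required")
    if any(cpu < 0 for cpu in cpus):
        raise ValueError("CPU numbers must be non-negative")

    # Set one bit per CPU number.
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu

    # Split into 32-bit words, emitted most significant word first.
    words = []
    while mask:
        words.append(mask & 0xFFFFFFFF)
        mask >>= 32
    return ",".join(f"{word:08x}" for word in reversed(words))


def set_irq_affinity(irq, cpus, proc_root="/proc"):
    """Write the mask for `irq`; needs root on a real system."""
    with open(f"{proc_root}/irq/{irq}/smp_affinity", "w") as f:
        f.write(cpus_to_smp_affinity(cpus))


if __name__ == "__main__":
    # Only formats the mask; nothing is written to /proc here.
    print(cpus_to_smp_affinity([0, 1]))
```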