Re: [PATCH net-next v2] declance: Remove IRQF_ONESHOT

From: "Maciej W. Rozycki" <macro@orcam.me.uk>
Date: 2026-01-27 18:35:47
Also in: linux-mips

On Tue, 27 Jan 2026, Sebastian Andrzej Siewior wrote:

quoted

 Umm, are they?  It's been a while, but have I missed anything?  As I 
recall ->irq_mask() is called exactly so that interrupts can be re-enabled 
at the CPU level right way so as not to block other sources which may have 
a low latency requirement while a hardirq handler is running.

A hardirq is always serviced with disabled interrupts from CPU point of
view. There were exceptions in the IDE department as far as I remember
where it was possible to enable interrupts from CPUs point of view while
the interrupt was serviced. That was 2.2/2.4 time frame and edge
interrupts made it possible.

 Fair enough.

It is not required to mask the interrupt while the handler is invoked
unless it is required to properly ACK the interrupt from device's and
IRQ-chip point of view.
The sole purpose of IRQF_ONESHOT was to disable the interrupt source
while the threaded interrupt is running. This driver has none.

 Ack.

quoted

 Since this is a level-triggered interrupt unmasking it before the EOI 
will make it retrigger right away and loop forever.  And as the 
description of commit 0fabe1021f8b ("MIPS: DECstation I/O ASIC DMA 
interrupt classes") says the interrupt must not be acked before EOI, as 
the problematic transfer would restart while the IRQ handler is still 
running:

If it is a level interrupt and MASK/ACK in the irq-chip then the device
will issue the interrupt again if the source of the interrupt (i.e. the
error) has not been solved. Since the handler does only a printk() it
should trigger again.

 It won't retrigger, as a memory read error event is transient, e.g. a bus 
timeout or an uncorrected ECC error, and the resulting interrupt is only 
recorded internally in the IOASIC.  IOW, the interrupt is edge-triggered 
and the interrupt status bit is *the* origin of the outgoing interrupt to 
the CPU.  Once cleared the IRQ is gone until the next error happens.  But 
it cannot be cleared early as with ordinary edge-triggered interrupts, 
because of the dual function of the interrupt status bit, as described in 
the comment referring IRQF_ONESHOT in arch/mips/dec/ioasic-irq.c.

 FWIW it seems at the very least the handler ought to report the offending 
address and rate-limit the message.  Since this is a networking driver the 
resulting data corruption is probably OK as that can happen to transferred 
data anywhere en route and it's up to the high-level receiver to cope with 
that.  Given that it is outgoing data there's little that can be done too.  
Maybe the whole data transfer can be aborted, but I'm not familiar enough 
with LANCE hardware to know off-hand how to do that.  As I say, this is 
exceedingly rare an event, so I guess taking the path of least resistance 
make sense.

quoted

16 R/W0C LANCE DMA memory read error
         This bit is set to 1 and DMA disabled, when the LANCE DMA
         encounters a memory read error. The LANCE then times out and
         interrupts the processor, which handles the problem. The LPR can
         read the address of the error. The bit may be cleared by writing a 0;
         writing a 1 has no effect.

so a combined ACK-EOI is the only correct way to handle it.

Make sense. So first the driver needs to set this bit and then IRQ-chip
would need to see the EOI after the handler run. You would still need to
cancel/ tear down the transfer before that.

 This is R/W0C, that is read/write-zero-to-clear.  Only hardware can set 
the bit to 1 and only software can set the bit to 0; any attempt made by 
software to flip the bit from 0 to 1 is ignored.  This guarantees write 
atomicity: you can ack any DMA interrupt(s) by just a single 32-bit write 
to the register with no need to read it first.

 Upon a DMA error the hardware writes 1, which asserts the IRQ and stops 
DMA.  Only having handled the situation the handler is supposed to write 
0, which deasserts the IRQ and resumes DMA.

 FAOD the IOASIC has IRQ and (third-party) DMA controller features both on 
chip, providing for the double function of the DMA interrupt status bits; 
they are all handled internally.  Non-DMA interrupts work the usual way: 
when the originating external device has deasserted its IRQ line, the 
corresponding interrupt status bit, which is R/O rather than R/W0C, gets 
cleared accordingly, deasserting the IOASIC's IRQ output for the CPU.

quoted

 Yes, the handler is sort of a placeholder, but the structure of handling 
ought to be correct regardless.

 One issue here is this is a recovery handler for an error scenario that 
is not supposed to happen with a correctly operating system.  I've never 
seen it actually fire, which is also why there's no actual recovery action 
implemented.  Perhaps such an error could be induced for verification 
purposes, I don't know.  All in all this code may have to rely solely on 
hw specs.

 I need more data to conclude whether this is the right change to make I'm 
afraid.  Thank you for looking into it though.

Fair enough. Would it work for you if we scratch this from net-next and
you route this or something else via the mips tree?

 No need to, I think I understand the situation now.  Surely the comment 
referring IRQF_ONESHOT in arch/mips/dec/ioasic-irq.c needs to be removed, 
but otherwise this is:

Acked-by: Maciej W. Rozycki <macro@orcam.me.uk>

Thank you for clarifying this to me, and doing the clean-up in the first 
place!

  Maciej

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help