Re: [PATCH net-next v2] declance: Remove IRQF_ONESHOT
From: "Maciej W. Rozycki" <macro@orcam.me.uk>
Date: 2026-01-27 18:35:47
Also in:
linux-mips
On Tue, 27 Jan 2026, Sebastian Andrzej Siewior wrote:
quoted
Umm, are they? It's been a while, but have I missed anything? As I recall ->irq_mask() is called exactly so that interrupts can be re-enabled at the CPU level right way so as not to block other sources which may have a low latency requirement while a hardirq handler is running.A hardirq is always serviced with disabled interrupts from CPU point of view. There were exceptions in the IDE department as far as I remember where it was possible to enable interrupts from CPUs point of view while the interrupt was serviced. That was 2.2/2.4 time frame and edge interrupts made it possible.
Fair enough.
It is not required to mask the interrupt while the handler is invoked unless it is required to properly ACK the interrupt from device's and IRQ-chip point of view. The sole purpose of IRQF_ONESHOT was to disable the interrupt source while the threaded interrupt is running. This driver has none.
Ack.
quoted
Since this is a level-triggered interrupt unmasking it before the EOI will make it retrigger right away and loop forever. And as the description of commit 0fabe1021f8b ("MIPS: DECstation I/O ASIC DMA interrupt classes") says the interrupt must not be acked before EOI, as the problematic transfer would restart while the IRQ handler is still running:If it is a level interrupt and MASK/ACK in the irq-chip then the device will issue the interrupt again if the source of the interrupt (i.e. the error) has not been solved. Since the handler does only a printk() it should trigger again.
It won't retrigger, as a memory read error event is transient, e.g. a bus timeout or an uncorrected ECC error, and the resulting interrupt is only recorded internally in the IOASIC. IOW, the interrupt is edge-triggered and the interrupt status bit is *the* origin of the outgoing interrupt to the CPU. Once cleared the IRQ is gone until the next error happens. But it cannot be cleared early as with ordinary edge-triggered interrupts, because of the dual function of the interrupt status bit, as described in the comment referring IRQF_ONESHOT in arch/mips/dec/ioasic-irq.c. FWIW it seems at the very least the handler ought to report the offending address and rate-limit the message. Since this is a networking driver the resulting data corruption is probably OK as that can happen to transferred data anywhere en route and it's up to the high-level receiver to cope with that. Given that it is outgoing data there's little that can be done too. Maybe the whole data transfer can be aborted, but I'm not familiar enough with LANCE hardware to know off-hand how to do that. As I say, this is exceedingly rare an event, so I guess taking the path of least resistance make sense.
quoted
16 R/W0C LANCE DMA memory read error This bit is set to 1 and DMA disabled, when the LANCE DMA encounters a memory read error. The LANCE then times out and interrupts the processor, which handles the problem. The LPR can read the address of the error. The bit may be cleared by writing a 0; writing a 1 has no effect. so a combined ACK-EOI is the only correct way to handle it.Make sense. So first the driver needs to set this bit and then IRQ-chip would need to see the EOI after the handler run. You would still need to cancel/ tear down the transfer before that.
This is R/W0C, that is read/write-zero-to-clear. Only hardware can set the bit to 1 and only software can set the bit to 0; any attempt made by software to flip the bit from 0 to 1 is ignored. This guarantees write atomicity: you can ack any DMA interrupt(s) by just a single 32-bit write to the register with no need to read it first. Upon a DMA error the hardware writes 1, which asserts the IRQ and stops DMA. Only having handled the situation the handler is supposed to write 0, which deasserts the IRQ and resumes DMA. FAOD the IOASIC has IRQ and (third-party) DMA controller features both on chip, providing for the double function of the DMA interrupt status bits; they are all handled internally. Non-DMA interrupts work the usual way: when the originating external device has deasserted its IRQ line, the corresponding interrupt status bit, which is R/O rather than R/W0C, gets cleared accordingly, deasserting the IOASIC's IRQ output for the CPU.
quoted
Yes, the handler is sort of a placeholder, but the structure of handling ought to be correct regardless. One issue here is this is a recovery handler for an error scenario that is not supposed to happen with a correctly operating system. I've never seen it actually fire, which is also why there's no actual recovery action implemented. Perhaps such an error could be induced for verification purposes, I don't know. All in all this code may have to rely solely on hw specs. I need more data to conclude whether this is the right change to make I'm afraid. Thank you for looking into it though.Fair enough. Would it work for you if we scratch this from net-next and you route this or something else via the mips tree?
No need to, I think I understand the situation now. Surely the comment referring IRQF_ONESHOT in arch/mips/dec/ioasic-irq.c needs to be removed, but otherwise this is: Acked-by: Maciej W. Rozycki <macro@orcam.me.uk> Thank you for clarifying this to me, and doing the clean-up in the first place! Maciej