Re: CONFIG_PREEMPT_RT local_softirq_pending warning when ISR blocks
From: Brian Silverman <hidden>
Date: 2015-03-10 00:37:06
Also in:
lkml
On Mon, Mar 9, 2015 at 12:08 PM, Sebastian Andrzej Siewior [off-list ref] wrote:
* Brian Silverman | 2015-03-05 00:16:20 [-0500]:quoted
Beforehand, 000 is spending most of its time in interrupts, but bash is doing something related to memory management on it in between. bash-14721 [000] ......1 6854.629126: rt_spin_lock <-free_pcppages_bulk bash-14721 [000] ....1.1 6854.629126: mm_page_pcpu_drain: page=ffffea000fa1aa40 pfn=4097705 order=0 migratetype=1 bash-14721 [000] ......1 6854.629127: get_pfnblock_flags_mask <-free_pcppages_bulk bash-14721 [000] ......1 6854.629127: __mod_zone_page_state <-free_pcppages_bulk bash-14721 [000] ....1.1 6854.629127: mm_page_pcpu_drain: page=ffffea000f572ac0 pfn=4021419 order=0 migratetype=0 bash-14721 [000] ......1 6854.629128: get_pfnblock_flags_mask <-free_pcppages_bulk bash-14721 [000] ......1 6854.629128: __mod_zone_page_state <-free_pcppages_bulk ... # lots more virtually identical repetitions of those last 3 lines bash-14721 [000] ....1.1 6854.629139: mm_page_pcpu_drain: page=ffffea000f481a80 pfn=4005994 order=0 migratetype=1 bash-14721 [000] ......1 6854.629139: get_pfnblock_flags_mask <-free_pcppages_bulkYou free memory and hold the zone->lockquoted
bash-14721 [000] d.....1 6854.629139: do_IRQ <-ret_from_intr bash-14721 [000] d.....1 6854.629139: irq_enter <-do_IRQ ... # wakes up the can1 ISR thread on 001 and the can0 one on 000 (same physical IRQ line) bash-14721 [000] d...3.1 6854.629261: sched_switch: prev_comm=bash prev_pid=14721 prev_prio=120 prev_state=R+ ==> next_comm=irq/18-can0 next_pid=2015 next_prio=28I would assume that this one raises NET_RX softirq. But at the bottom you have the irq handler on the other CPU which confuses me…
There wasn't actually any traffic on can0 for this test, so it didn't. The can0 ISR only makes a few reads/writes to the device, never calling netif_rx. The can1 handler (which actually raises a NET_RX softirq) runs on 001 because it's pinned there.
quoted
... # runs the can0 ISR irq/18-can0-2015 [000] d...3.. 6854.629283: sched_switch: prev_comm=irq/18-can0 prev_pid=2015 prev_prio=28 prev_state=S ==> next_comm=ksoftirqd/0 next_pid=3 next_prio=98 ... ksoftirqd/0-3 [000] ....1.1 6854.629291: softirq_entry: vec=1 [action=TIMER] ... ksoftirqd/0-3 [000] ....1.1 6854.629293: softirq_exit: vec=1 [action=TIMER]only the timer since nobody raised NET_RX
Correct. I included that as context for what 000 spent its time doing with the lock held, not implying that it's incorrect.
quoted
... ksoftirqd/0-3 [000] ....... 6854.629298: schedule <-smpboot_thread_fn ... ksoftirqd/0-3 [000] d...3.. 6854.629304: sched_switch: prev_comm=ksoftirqd/0 prev_pid=3 prev_prio=98 prev_state=S ==> next_comm=bash next_pid=14721 next_prio=28 ... bash-14721 [000] d...1.1 6854.629308: smp_trace_reschedule_interrupt <-trace_reschedule_interrupt # Actually unnecessary schedule IPI from 001? bash-14721 [000] d...1.1 6854.629309: irq_enter <-smp_trace_reschedule_interrupt ... bash-14721 [000] ....1.1 6854.629316: __tick_nohz_task_switch <-__schedule bash-14721 [000] ......1 6854.629316: __mod_zone_page_state <-free_pcppages_bulk bash-14721 [000] ....1.1 6854.629317: mm_page_pcpu_drain: page=ffffea000f57a900 pfn=4021924 order=0 migratetype=0 bash-14721 [000] ......1 6854.629317: get_pfnblock_flags_mask <-free_pcppages_bulk bash-14721 [000] ......1 6854.629317: __mod_zone_page_state <-free_pcppages_bulkand it continues cleaning up memory.quoted
... # more of this like it was doing before I don't see it unlocking the problematic mutex before the trace ends. The relevant part for 001 starts while it's running the can1 ISR. irq/18-can1-7228 [001] ....1.1 6854.629275: netif_rx: dev=can1 skbaddr=ffff880412d8fc00 len=16 irq/18-can1-7228 [001] ......1 6854.629275: migrate_disable <-netif_rx_internal irq/18-can1-7228 [001] ......2 6854.629275: enqueue_to_backlog <-netif_rx_internalenqueue_to_backlog() looks like packet reception so this should be handled in napi so I assume we run in NET_RX softirq
enqueue_to_backlog is the napi function that actually raises the NET_RX softirq.
quoted
irq/18-can1-7228 [001] d.....2 6854.629276: _raw_spin_lock <-enqueue_to_backlog irq/18-can1-7228 [001] d...1.2 6854.629276: __raise_softirq_irqoff <-enqueue_to_backlog irq/18-can1-7228 [001] d...1.2 6854.629276: do_raise_softirq_irqoff <-__raise_softirq_irqoff irq/18-can1-7228 [001] d...2.2 6854.629276: softirq_raise: vec=3 [action=NET_RX]
This is where the softirq gets raised.
quoted
... # continues handling the can1 interrupt irq/18-can1-7228 [001] ......6 6854.629286: rt_spin_lock <-get_page_from_freelist irq/18-can1-7228 [001] ......6 6854.629287: rt_spin_lock_slowlock <-get_page_from_freelist irq/18-can1-7228 [001] ......6 6854.629287: _raw_spin_lock <-rt_spin_lock_slowlock irq/18-can1-7228 [001] ....1.6 6854.629287: __try_to_take_rt_mutex <-rt_spin_lock_slowlock irq/18-can1-7228 [001] ....1.6 6854.629287: _raw_spin_lock_irq <-rt_spin_lock_slowlock irq/18-can1-7228 [001] d...2.6 6854.629288: _raw_spin_unlock_irq <-rt_spin_lock_slowlock irq/18-can1-7228 [001] ....1.6 6854.629288: task_blocks_on_rt_mutex <-rt_spin_lock_slowlockit might be zone->lock it goes after. It boosts the bash process which seems to free memory so it would make sense.quoted
... # rt_mutex/scheduling stuff irq/18-can1-7228 [001] d...4.6 6854.629291: sched_pi_setprio: comm=bash pid=14721 oldprio=120 newprio=28 ... # more scheduler stuff irq/18-can1-7228 [001] d...3.6 6854.629299: native_smp_send_reschedule <-rt_mutex_setprio ... # more scheduler stuff irq/18-can1-7228 [001] d...2.6 6854.629307: pick_next_task_fair <-__schedule irq/18-can1-7228 [001] d...2.6 6854.629307: pick_next_task_stop <-__schedule irq/18-can1-7228 [001] d...2.6 6854.629307: pick_next_task_dl <-__schedule irq/18-can1-7228 [001] d...2.6 6854.629307: pick_next_task_rt <-__schedule irq/18-can1-7228 [001] d...2.6 6854.629307: pick_next_task_fair <-__schedule irq/18-can1-7228 [001] d...2.6 6854.629308: pick_next_task_idle <-__schedule irq/18-can1-7228 [001] d...3.6 6854.629308: sched_switch: prev_comm=irq/18-can1 prev_pid=7228 prev_prio=28 prev_state=D ==> next_comm=swapper/1 next_pid=0 next_prio=120 ... <idle>-0 [001] d...1.. 6854.629319: softirq_check_pending_idle <-tick_nohz_idle_enter My tracing_off() call is in softirq_check_pending_idle, so that's it.It looks like your softirq for net_rx is getting a packet and then after raising NET_RX (again?) it blocks on a lock. In order to get this lock it boosts and schedules bash. It gets runable but on the other CPU. On CPU1 there is nothig going is nothing going and the only runable task is the idle thread. And this is probably where the warning is written because we go to idle while we should process a softirq instead.
That sounds like the issue. Doing the softirq instead of going idle in this situation seems like it means calling thread_do_softirq() from __schedule, but I don't know where the right place is. Can anybody give me some help on where exactly to check for softirqs from? Thanks, Brian