Re: [PATCH 5/9] HWPoison: add memory_failure_queue()

From: Ingo Molnar <hidden>
Date: 2011-05-23 11:02:15
Also in: lkml

* Huang Ying [off-list ref] wrote:

That's where 'active filters' come into the picture - see my other mail 
(that was in the context of unidentified NMI errors/events) where i 
outlined how they would work in this case and elsewhere. Via active filters 
we could share most of the code, gain access to the events and still have 
kernel driven policy action.

Is that something like the following?

- The NMI handler runs for the hardware error; hardware error information
  is collected and put into the perf ring buffer as an 'event'.

Correct.

Note that for MCE errors we want the 'persistent event' framework Boris has 
posted: we want these events to be buffered up to a point even if there is no 
tool listening in on them:

 - this gives us boot-time MCE error coverage

 - this protects us against a logging daemon being restarted and events
   getting lost

- Some 'active filters' are run for each 'event' in NMI context.

Yeah. Whether it's a human-ASCII space 'filter' or really just a callback you 
register with that event is secondary - both would work.

- Some operations cannot be done in the NMI handler, so they are deferred
  to an IRQ handler (this can be done with something like irq_work).

Yes.

- Some other 'active filters' are run for each 'event' in IRQ context.
  (For memory errors, we can call memory_failure_queue() here.)

Correct.

Where some 'active filters' are kernel built-in, some 'active filters' can be 
customized via kernel command line or by user space.

Yes.
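
The two-stage flow agreed on above can be sketched in userspace C. This is
only an illustration under stated assumptions, not the perf or APEI
implementation: `hw_err_event`, `event_ring`, `nmi_stage()` and `irq_stage()`
are hypothetical names, `fake_memory_failure_queue()` is a stand-in that
merely records the page frame number, and the single-threaded ring ignores
the lock-less requirements of real NMI code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event record; the field names are illustrative only. */
enum hw_err_severity { HW_ERR_CORRECTED, HW_ERR_RECOVERABLE };

struct hw_err_event {
    enum hw_err_severity severity;
    unsigned long pfn;              /* page frame hit by the error */
};

#define RING_SIZE 8u                /* power of two, so wrap-around is safe */

struct event_ring {
    struct hw_err_event slots[RING_SIZE];
    unsigned int head;              /* next slot to write (NMI stage) */
    unsigned int tail;              /* next slot to read (IRQ stage) */
    bool irq_work_pending;          /* models "IRQ-context work was requested" */
};

/* Stand-in for memory_failure_queue(): it only records the pfn. */
#define MAX_QUEUED 8
static unsigned long queued_pfns[MAX_QUEUED];
static size_t nr_queued;

void fake_memory_failure_queue(unsigned long pfn)
{
    assert(nr_queued < MAX_QUEUED);
    queued_pfns[nr_queued++] = pfn;
}

/* Stage 1 ("NMI context"): log the event and request deferred work. */
bool nmi_stage(struct event_ring *ring, struct hw_err_event ev)
{
    if (ring->head - ring->tail == RING_SIZE)
        return false;               /* ring full: the event is not logged */
    ring->slots[ring->head % RING_SIZE] = ev;
    ring->head++;
    ring->irq_work_pending = true;
    return true;
}

/* Stage 2 ("IRQ context"): drain the ring and run the deferred filter. */
unsigned int irq_stage(struct event_ring *ring)
{
    unsigned int handled = 0;

    ring->irq_work_pending = false;
    while (ring->tail != ring->head) {
        const struct hw_err_event *ev = &ring->slots[ring->tail % RING_SIZE];

        /* Built-in 'active filter': only recoverable errors need recovery. */
        if (ev->severity == HW_ERR_RECOVERABLE)
            fake_memory_failure_queue(ev->pfn);
        ring->tail++;
        handled++;
    }
    return handled;
}
```

Every event is logged in the first stage, but only recoverable errors reach
the recovery stand-in, and only once the deferred stage runs.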

If my understanding above is correct, I think this is a general and 
complex solution.  It is a little hard for the user to understand which 
'active filters' are in effect.  He may need some runtime assistance to find 
out (maybe /sys/events/active_filters, which lists all filters currently in 
effect), because that is hard to work out only by reading the source code.  
Anyway, this is a design style choice.

I don't think it's complex: the built-in rules are in plain sight (can be in 
the source code or can even be explicitly registered callbacks), the 
configuration/tooling installed rules will be as complex as the admin or tool 
wants them to be.

There are still some issues I don't know how to solve in the above framework.

- If two processes request the same type of hardware error events, one 
  hardware error event will be copied to two ring buffers (one for each 
  process), but the 'active filters' should be run only once for each 
  hardware error event.

With persistent events 'active filters' should only be attached to the central 
persistent event.
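
One way to picture this is a single central event that owns the filter and
fans records out to the listeners. The sketch below is a userspace
illustration only, not the persistent event patches or any perf interface:
`persistent_event`, `listener_buf`, `persistent_event_emit()` and
`count_filter()` are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LISTENERS 4
#define LISTENER_BUF  16

struct err_record {
    unsigned long pfn;
};

/* Hypothetical per-process buffer that receives a copy of each record. */
struct listener_buf {
    struct err_record recs[LISTENER_BUF];
    size_t count;
};

/* Hypothetical central event: the filter is attached here, exactly once. */
struct persistent_event {
    void (*filter)(const struct err_record *rec, void *ctx);
    void *filter_ctx;
    struct listener_buf *listeners[MAX_LISTENERS];
    size_t nr_listeners;
};

void persistent_event_emit(struct persistent_event *pev, struct err_record rec)
{
    assert(pev != NULL && pev->nr_listeners <= MAX_LISTENERS);

    /* The active filter runs once per hardware event ... */
    if (pev->filter)
        pev->filter(&rec, pev->filter_ctx);

    /* ... while the record itself is copied to every listener. */
    for (size_t i = 0; i < pev->nr_listeners; i++) {
        struct listener_buf *lb = pev->listeners[i];

        if (lb->count < LISTENER_BUF)
            lb->recs[lb->count++] = rec;
    }
}

/* Example filter: counts how many times it was invoked. */
void count_filter(const struct err_record *rec, void *ctx)
{
    (void)rec;
    (*(unsigned int *)ctx)++;
}
```

With two listeners attached, one emitted record lands in both buffers while
the filter's counter advances by one.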

- How to deal with ring-buffer overflow?  For example, the ring buffer is 
  full of corrected memory errors, and now a recoverable memory error 
  occurs but cannot be put into the perf ring buffer because of the 
  overflow.  How do we deal with the recoverable memory error?

The solution is to make it large enough. With *every* queueing solution there 
will be some sort of queue size limit.

Thanks,

	Ingo
