Thread (2 messages), 2 authors, 2019-10-26

Re: loop nesting in alignment exception and machine check

From: Christophe Leroy <hidden>
Date: 2019-10-26 11:20:16
Also in: linux-arch, lkml

Hi,

On 26/10/2019 at 09:23, Wangshaobo (bobo) wrote:
Hi,

I encountered a problem in which exceptions nest in a loop: an alignment
exception is raised inside the machine check handler. The trigger is as
follows:

problem:

machine check or critical interrupt -> … -> kbox_write [for recording
last words] -> memcpy(irremap_addr, src, size): _GLOBAL(memcpy) …

When we enter memcpy, the instruction 'dcbz r11,r6' causes an alignment
exception. In this situation r11 holds the ioremap address, which leads
to the alignment exception,
You can't use memcpy() on anything other than memory.

For an ioremapped area, you have to use memcpy_toio().
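
As a rough illustration of why the distinction matters (a userspace sketch, not the kernel implementation): the memcpy() path quoted above uses dcbz on the destination, which is only valid on cacheable memory, whereas an I/O-safe copy sticks to plain stores. The helper name io_copy_sketch below is hypothetical; in kernel code the call would simply be memcpy_toio(dst, src, n) from <linux/io.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace sketch only. io_copy_sketch is a hypothetical name; it mimics
 * the idea behind memcpy_toio(): treat the destination as device memory
 * and write it with ordinary stores, never with cache-block operations
 * such as dcbz.
 */
static void io_copy_sketch(volatile void *dst, const void *src, size_t n)
{
    volatile uint8_t *d = dst;
    const uint8_t *s = src;

    assert(dst != NULL && src != NULL);

    /* One plain byte store per iteration; volatile keeps each store explicit. */
    while (n--)
        *d++ = *s++;
}
```

In the kbox_write path this would correspond to replacing memcpy(irremap_addr, src, size) with memcpy_toio(irremap_addr, src, size), assuming irremap_addr really comes from ioremap().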

Christophe
then the instruction cannot complete successfully, as we are still in the
machine check handler. In the end, it triggers a new machine check
interrupt inside the interrupt handler function, and the nesting loop
begins.

analysis:

We have analysed this a lot but still cannot come up with a reasonable
explanation. Normally, an alignment exception triggered in machine check
context can still be recorded in the kbox after the alignment exception
has been handled by its handler function. But how can a machine check be
triggered inside the handler function at all? We printed the relevant
registers on first entry to the machine check and alignment exception
handler functions, as follows:

                 machine check    alignment exception
          MSR:   0x2              0x0
          SRR1:  0x2              0x21002

But the manual says SRR1 should be set to the MSR value (0x2). Why did
that happen?

Then a branch in the handler function copies SRR1 to MSR, which enables
MSR[ME] and MSR[CE], and the system collapses.
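
A small sketch for decoding the two values printed above, assuming the Book-E MSR bit layout used by the Linux headers (MSR[CE] = 1 << 17, MSR[ME] = 1 << 12). The macro and function names are made up for illustration, and other cores may use a different layout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed Book-E bit positions; names are hypothetical, not kernel macros. */
#define SKETCH_MSR_CE 0x00020000u  /* critical interrupt enable */
#define SKETCH_MSR_ME 0x00001000u  /* machine check enable */

static int msr_has_ce(uint32_t v) { return (v & SKETCH_MSR_CE) != 0; }
static int msr_has_me(uint32_t v) { return (v & SKETCH_MSR_ME) != 0; }

/* Bits left over once CE and ME are masked out. */
static uint32_t msr_other_bits(uint32_t v)
{
    return v & ~(SKETCH_MSR_CE | SKETCH_MSR_ME);
}

/* Print one register value with the decoded CE/ME flags. */
static void msr_describe(const char *name, uint32_t v)
{
    assert(name != NULL);
    printf("%s=0x%x CE=%d ME=%d other=0x%x\n",
           name, (unsigned)v, msr_has_ce(v), msr_has_me(v),
           (unsigned)msr_other_bits(v));
}
```

Under these assumed masks, 0x21002 has both CE and ME set plus a remaining low bit 0x2, while 0x2 has neither, which matches the observation that copying SRR1 into MSR re-enables both interrupt classes.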

Conclusion:

          1)  Why can the alignment exception not be handled within the
machine check?

          2)  Besides memcpy, can any other function cause the alignment
exception?

We can still reproduce it; the sequence is as follows:

          CPU deadlock -> watch log -> trigger fiq -> kbox_write ->
memcpy -> alignment exception -> print last words.

          But for the problem below, what the kbox printed is empty.

------------------kbox restart:[   10.147594]----------------

kbox verify fs magic fail

kbox mem mabye destroyed, format it

kbox: load OK

lock-task: major[249] minor[0]

-----start show_destroyed_kbox_mem_head----

00000000: 00000000 00000000 00000000 00000000  ................

00000010: 00000000 00000000 00000000 00000000  ................

00000020: 00000000 00000000 00000000 00000000  ................

00000030: 00000000 00000000 00000000 00000000  ................

00000040: 00000000 00000000 00000000 00000000  ................

00000050: 00000000 00000000 00000000 00000000  ................

00000060: 00000000 00000000 00000000 00000000  ................

00000070: 00000000 00000000 00000000 00000000  ................

00000080: 00000000 00000000 00000000 00000000  ................

00000090: 00000000 00000000 00000000 00000000  ................