Thread: 14 messages, 3 authors, 2016-05-06

Re: [PATCH 0/9] FP/VEC/VSX switching optimisations

From: Naveen N. Rao <hidden>
Date: 2016-05-06 10:48:51

On 2016/05/05 05:32PM, Naveen N Rao wrote:
On 2016/02/29 05:53PM, Cyril Bur wrote:
Cover-letter for V1 of the series is at
https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-November/136350.html

Cover-letter for V2 of the series is at
https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-January/138054.html

Changes in V3:
Addressed review comments from Michael Neuling
 - Made commit message in 4/9 better reflect the patch
 - Removed overuse of #ifdef blocks and redundant condition in 5/9
 - Split 6/8 in two to better prepare for 7,8,9
 - Removed #ifdefs in 6/9

Changes in V4:
 - Addressed non ABI compliant ASM macros in 1/9
 - Fixed build breakage due to changing #ifdefs in V3 (6/9)
 - Reordered some conditions in if statements

Changes in V5:
 - Enhanced basic-asm.h to provide ABI independent macro as pointed out by
   Naveen Rao.
   - Tested for both BE and LE builds. Had to disable -flto from the
     selftests/powerpc Makefile as it didn't play well with the custom ASM.
 - Added some extra debugging output to the vmx_signal testcase
 - Fixed comments in testing code
 - Updated VSX test code to use GCC Altivec macros

Changes in V6:
 - Removed recursive definition of CFLAGS in math/Makefile
 - Corrected the use of the word param in favour of doubleword
 - Reordered some code in basic-asm.h and neatened some comments

This series results in a kernel crash with one of the perf tests.
To reproduce, build perf and run the breakpoint overflow signal handler
test:

# ./perf test -v 17
17: Test breakpoint overflow signal handler                  :
--- start ---
test child forked, pid 3753
failed opening event 0
failed opening event 0
cpu 0xd: Vector: 600 (Alignment) at [c0000000edd738c0]
    pc: c00000000000a818: save_fpu+0xa8/0x2ac
    lr: c00000000001568c: __giveup_fpu+0x2c/0x90
    sp: c0000000edd73b40
   msr: 800000000280b033
   dar: c0000000edc436e0
 dsisr: 42000000
  current = 0xc0000000edc42c00
  paca    = 0xc000000007e82700	 softe: 0	 irq_happened: 0x01
    pid   = 3753, comm = perf
Linux version 4.6.0-rc3-nnr+ (root@rhel71le) (gcc version 4.8.3 20140911 (Red Hat 4.8.3-8) (GCC) ) #93 SMP Wed May 4 22:01:06 IST 2016
enter ? for help
[link register   ] c00000000001568c __giveup_fpu+0x2c/0x90
[c0000000edd73b40] 0000000000000000 (unreliable)
[c0000000edd73b70] c000000000015730 giveup_fpu+0x40/0xa0
[c0000000edd73ba0] c000000000015810 flush_fp_to_thread+0x80/0x90
[c0000000edd73bd0] c000000000026b3c setup_sigcontext.constprop.3+0xbc/0x1f0
[c0000000edd73c30] c0000000000274c4 handle_rt_signal64+0x3b4/0x7c0
[c0000000edd73d10] c000000000017ee0 do_signal+0x150/0x2b0
[c0000000edd73e00] c000000000018220 do_notify_resume+0xd0/0x110
[c0000000edd73e30] c000000000009844 ret_from_except_lite+0x70/0x74
--- Exception: 900 (Decrementer) at 00000000100b3c88
SP (3fffd08cfb20) is in userspace
d:mon> ls save_fpu
save_fpu: c00000000000a770
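
The offsets in the dump above can be cross-checked with a little
arithmetic. This is only a sketch using values copied from the xmon
output; the variable names are illustrative, and reading the
dar-minus-current offset as a location inside the task's saved FP/VSX
state is an assumption, since the dump does not say so:

```python
# Values copied verbatim from the xmon output above.
pc = 0xc00000000000a818
save_fpu = 0xc00000000000a770   # from "ls save_fpu"
dar = 0xc0000000edc436e0
current = 0xc0000000edc42c00

# Offset of the faulting instruction inside save_fpu.
# xmon reports this as save_fpu+0xa8.
pc_offset = pc - save_fpu

# Distance from the "current" pointer to the faulting data address.
# Assumption: a small positive offset hints that the store targeted the
# task's own saved register state, but nothing in the dump confirms it.
dar_offset = dar - current

# Remainder of the data address modulo 16 (the size of one VSX register).
dar_mod_16 = dar % 16

print(f"pc offset : save_fpu+{pc_offset:#x}")
print(f"dar offset: current+{dar_offset:#x}")
print(f"dar % 16  : {dar_mod_16}")
```

The pc offset agrees with the save_fpu+0xa8 that xmon printed, and the
data address is a multiple of 16, so the address itself does not appear
to be misaligned.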

With v4.5, the test would fail too, but without triggering what looks to
be an alignment exception.

xmon couldn't decode the instructions around the faulting pc:

d:mon>
c00000000000a810  38800000  li  r4,0
c00000000000a814  f0000250  .long 0xfffffffff0000250
c00000000000a818  7c062798  .long 0x7c062798
c00000000000a81c  f0000250  .long 0xfffffffff0000250
c00000000000a820  38800010  li  r4,16
c00000000000a824  f0210a50  .long 0xfffffffff0210a50
c00000000000a828  7c262798  .long 0x7c262798
c00000000000a82c  f0210a50  .long 0xfffffffff0210a50
c00000000000a830  38800020  li  r4,32
c00000000000a834  f0421250  .long 0xfffffffff0421250

However, with objdump, the instructions look to be ok:

c00000000000aa10 <save_fpu+0x2a0>
c00000000000a810:   00 00 80 38     li      r4,0
c00000000000a814:   50 02 00 f0     xxswapd vs0,vs0
c00000000000a818:   98 27 06 7c     stxvd2x vs0,r6,r4
c00000000000a81c:   50 02 00 f0     xxswapd vs0,vs0
c00000000000a820:   10 00 80 38     li      r4,16
c00000000000a824:   50 0a 21 f0     xxswapd vs1,vs1
c00000000000a828:   98 27 26 7c     stxvd2x vs1,r6,r4
c00000000000a82c:   50 0a 21 f0     xxswapd vs1,vs1
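
The two dumps can be reconciled by hand: objdump prints the bytes in
memory order (little-endian here), while xmon prints each instruction as
a 32-bit word. A rough sketch of that check follows; the helper names
are made up for illustration, and the field split is the generic
X-form-style layout with no mnemonic lookup:

```python
def word_from_le_bytes(hex_bytes):
    """Assemble a 32-bit word from objdump's little-endian byte dump."""
    return int.from_bytes(bytes.fromhex(hex_bytes), "little")


def decode_fields(word):
    """Split an instruction word into X-form-style fields."""
    return {
        "primary_opcode": (word >> 26) & 0x3F,
        "t_or_s": (word >> 21) & 0x1F,
        "ra": (word >> 16) & 0x1F,
        "rb": (word >> 11) & 0x1F,
        "xo": (word >> 1) & 0x3FF,
        "low_bit": word & 0x1,
    }


# Bytes copied from the objdump lines at ...a814 and ...a818.
swap_word = word_from_le_bytes("50 02 00 f0")
store_word = word_from_le_bytes("98 27 06 7c")
store_fields = decode_fields(store_word)

print(hex(swap_word), hex(store_word))
print(store_fields)
```

The words come out as 0xf0000250 and 0x7c062798, the same values xmon
shows as .long, so both tools are looking at the same code. The faulting
word has primary opcode 31 and extended opcode 972 with RA=6 and RB=4,
consistent with objdump's "stxvd2x vs0,r6,r4".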

I saw this on an LE VM on Power7, and LE looks to be the trigger, since a
BE VM does not show this. I'm attaching the .config in case it helps.
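
For reference, the msr value from the crash can be unpacked to confirm
the mode the kernel was in. A small sketch, assuming the bit positions
of the MSR_*_LG constants in arch/powerpc/include/asm/reg.h; only a
subset of bits is listed and the helper name is illustrative:

```python
# Subset of MSR bit positions (assumed from the MSR_*_LG constants in
# arch/powerpc/include/asm/reg.h).
MSR_BITS = {
    "SF": 63, "VEC": 25, "VSX": 23, "EE": 15, "PR": 14, "FP": 13,
    "ME": 12, "IR": 5, "DR": 4, "RI": 1, "LE": 0,
}


def decode_msr(msr):
    """Return the names of the listed MSR bits that are set."""
    return [name for name, bit in MSR_BITS.items() if msr & (1 << bit)]


# msr value copied from the xmon output.
msr = 0x800000000280b033
flags = decode_msr(msr)

# Any bits set in msr that the table above does not name.
known_mask = sum(1 << bit for bit in MSR_BITS.values())
unnamed_bits = msr & ~known_mask

print(flags)
print(hex(unnamed_bits))
```

Under those assumptions LE is set along with FP, VEC and VSX, and PR is
clear, which matches a little-endian kernel faulting in save_fpu with
the FP/VSX facilities enabled.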


- Naveen
