Re: [PATCH 0/9] FP/VEC/VSX switching optimisations
From: Naveen N. Rao <hidden>
Date: 2016-05-06 10:48:51
On 2016/05/05 05:32PM, Naveen N Rao wrote:
On 2016/02/29 05:53PM, Cyril Bur wrote:
Cover-letter for V1 of the series is at
https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-November/136350.html

Cover-letter for V2 of the series is at
https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-January/138054.html

Changes in V3:
Addressed review comments from Michael Neuling
- Made commit message in 4/9 better reflect the patch
- Removed overuse of #ifdef blocks and redundant condition in 5/9
- Split 6/8 in two to better prepare for 7,8,9
- Removed #ifdefs in 6/9

Changes in V4:
- Addressed non ABI compliant ASM macros in 1/9
- Fixed build breakage due to changing #ifdefs in V3 (6/9)
- Reordered some conditions in if statements

Changes in V5:
- Enhanced basic-asm.h to provide ABI independent macro as pointed out by
  Naveen Rao.
- Tested for both BE and LE builds. Had to disable -flto from the
  selftests/powerpc Makefile as it didn't play well with the custom ASM.
- Added some extra debugging output to the vmx_signal testcase
- Fixed comments in testing code
- Updated VSX test code to use GCC Altivec macros

Changes in V6:
- Removed recursive definition of CFLAGS in math/Makefile
- Corrected the use of the word param in favour of doubleword
- Reordered some code in basic-asm.h and neatened some comments

This series is resulting in a kernel crash with one of the perf tests. To
reproduce, build perf and run the test for breakpoint overflow signal
handler.
# ./perf test -v 17
17: Test breakpoint overflow signal handler :
--- start ---
test child forked, pid 3753
failed opening event 0
failed opening event 0
cpu 0xd: Vector: 600 (Alignment) at [c0000000edd738c0]
    pc: c00000000000a818: save_fpu+0xa8/0x2ac
    lr: c00000000001568c: __giveup_fpu+0x2c/0x90
    sp: c0000000edd73b40
   msr: 800000000280b033
   dar: c0000000edc436e0
 dsisr: 42000000
  current = 0xc0000000edc42c00
  paca    = 0xc000000007e82700   softe: 0   irq_happened: 0x01
    pid   = 3753, comm = perf
Linux version 4.6.0-rc3-nnr+ (root@rhel71le) (gcc version 4.8.3 20140911 (Red Hat 4.8.3-8) (GCC) ) #93 SMP Wed May 4 22:01:06 IST 2016
enter ? for help
[link register   ] c00000000001568c __giveup_fpu+0x2c/0x90
[c0000000edd73b40] 0000000000000000 (unreliable)
[c0000000edd73b70] c000000000015730 giveup_fpu+0x40/0xa0
[c0000000edd73ba0] c000000000015810 flush_fp_to_thread+0x80/0x90
[c0000000edd73bd0] c000000000026b3c setup_sigcontext.constprop.3+0xbc/0x1f0
[c0000000edd73c30] c0000000000274c4 handle_rt_signal64+0x3b4/0x7c0
[c0000000edd73d10] c000000000017ee0 do_signal+0x150/0x2b0
[c0000000edd73e00] c000000000018220 do_notify_resume+0xd0/0x110
[c0000000edd73e30] c000000000009844 ret_from_except_lite+0x70/0x74
--- Exception: 900 (Decrementer) at 00000000100b3c88
SP (3fffd08cfb20) is in userspace
d:mon> ls save_fpu
save_fpu: c00000000000a770

With v4.5, the test would fail, but not cause what looks to be an
alignment exception.
xmon couldn't decode the instructions:
d:mon> c00000000000a810  38800000  li      r4,0
c00000000000a814  f0000250  .long 0xfffffffff0000250
c00000000000a818  7c062798  .long 0x7c062798
c00000000000a81c  f0000250  .long 0xfffffffff0000250
c00000000000a820  38800010  li      r4,16
c00000000000a824  f0210a50  .long 0xfffffffff0210a50
c00000000000a828  7c262798  .long 0x7c262798
c00000000000a82c  f0210a50  .long 0xfffffffff0210a50
c00000000000a830  38800020  li      r4,32
c00000000000a834  f0421250  .long 0xfffffffff0421250

However, with objdump, the instructions look to be ok:
c00000000000aa10 <save_fpu+0x2a0>
c00000000000a810:  00 00 80 38  li      r4,0
c00000000000a814:  50 02 00 f0  xxswapd vs0,vs0
c00000000000a818:  98 27 06 7c  stxvd2x vs0,r6,r4
c00000000000a81c:  50 02 00 f0  xxswapd vs0,vs0
c00000000000a820:  10 00 80 38  li      r4,16
c00000000000a824:  50 0a 21 f0  xxswapd vs1,vs1
c00000000000a828:  98 27 26 7c  stxvd2x vs1,r6,r4
c00000000000a82c:  50 0a 21 f0  xxswapd vs1,vs1

I saw this on a LE vm on Power7 and that looks to be the issue, since a
BE vm does not show this. I'm attaching the .config in case it helps.

- Naveen
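
For anyone cross-checking the two listings above, here is a minimal Python sketch that is not part of the original report. It reads the objdump bytes at c00000000000a818 as a little-endian word, splits that word into XX1-form fields, and checks that save_fpu+0xa8 is the faulting pc. The helper name decode_xx1 and its field labels are made up for illustration. The stxvd2x encoding (primary opcode 31, extended opcode 972) is taken from the Power ISA, not from this thread. It only shows that xmon and objdump are looking at the same instruction word; it says nothing about why the store takes an alignment exception.

```python
import struct

def decode_xx1(word):
    """Split a 32-bit PowerPC instruction word into XX1-form fields.

    Hypothetical helper for illustration; the field names are not from xmon.
    """
    return {
        "opcode": (word >> 26) & 0x3F,  # primary opcode (ISA bits 0-5)
        "s": (word >> 21) & 0x1F,       # low 5 bits of the VSX register number
        "ra": (word >> 16) & 0x1F,      # base GPR
        "rb": (word >> 11) & 0x1F,      # index GPR
        "xo": (word >> 1) & 0x3FF,      # extended opcode (ISA bits 21-30)
        "sx": word & 0x1,               # high bit of the VSX register number
    }

# Bytes exactly as objdump printed them for c00000000000a818.
objdump_bytes = bytes.fromhex("98 27 06 7c")

# The kernel image is little-endian, so read the bytes as a LE 32-bit word.
word = struct.unpack("<I", objdump_bytes)[0]

fields = decode_xx1(word)
vsr = (fields["sx"] << 5) | fields["s"]

# Symbol base from "ls save_fpu" and the pc from the exception header.
SAVE_FPU = 0xC00000000000A770
FAULT_PC = 0xC00000000000A818

print(hex(word))                      # same word xmon shows at ...a818
print(fields, "vs%d" % vsr)
print(hex(FAULT_PC - SAVE_FPU))       # offset of the faulting instruction
```

The decoded fields (opcode 31, extended opcode 972, vs0, r6, r4) agree with the objdump line "stxvd2x vs0,r6,r4", so the ".long 0x7c062798" in xmon is the same instruction that xmon's disassembler simply did not recognise.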
Attachments
- config-p7le-fpu [text/plain] 133328 bytes