Thread: 23 messages, 4 authors, 2015-11-23

Re: [PATCH 0/8] FP/VEC/VSX switching optimisations

From: Cyril Bur <hidden>
Date: 2015-11-18 23:01:25

On Wed, 18 Nov 2015 14:51:25 +0000
David Laight [off-list ref] wrote:
From: Cyril Bur
Sent: 18 November 2015 03:27  
...
The goal of these patches is to rework how the 'math' registers (FP, VEC
and VSX) are context switched. Currently the kernel adopts a lazy approach,
always switching userspace tasks with all three facilities disabled and
loading each set of registers upon receiving the corresponding unavailable
exception. The kernel does try to avoid disabling the features in the syscall
quick path, but during testing it appears that even what should be a simple
syscall still causes the kernel to use some facilities (vectorised memcpy,
for example) for itself and therefore disable them for the user task.
Hi David,
Perhaps the kernel should be avoiding using these registers?
I wonder if the gain from using vectorised memcpy is typically
enough to warrant the cost of the save and restore?
Yeah, on smaller copies that might be the way to go.
There may even be scope for kernel code doing a save/restore
of a small number of registers onto an in-stack save area.
This has been thrown up in the air; there are also the volatile/non-volatile
registers to consider, and the caveat that glibc doesn't quite respect the ABI
here.

As it turns out (and no one is more surprised than me), despite the other
attempts at optimising, this series really has boiled down to removing the need
for processes to take the facility unavailable interrupts.
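
To make the cost being discussed concrete, here is a toy userspace model (not
kernel code) that counts facility unavailable exceptions under two policies.
All names in it (task_model, switch_in_lazy, switch_in_eager, count_faults)
are hypothetical, and the "eager" policy is only an assumption about one way
the interrupts could be avoided, not a description of what the patches do.

```c
#include <stdbool.h>

/* Hypothetical facility bits for the toy model. */
enum { FAC_FP = 1u << 0, FAC_VEC = 1u << 1, FAC_VSX = 1u << 2 };

struct task_model {
    unsigned used;        /* facilities the task has ever touched */
    unsigned enabled;     /* facilities currently enabled for the task */
    unsigned long faults; /* facility unavailable exceptions taken */
};

/* The task executes an instruction that needs facility 'fac'. */
static void task_use(struct task_model *t, unsigned fac)
{
    if (!(t->enabled & fac)) {
        t->faults++;        /* exception: registers get loaded here */
        t->enabled |= fac;
    }
    t->used |= fac;
}

/* Lazy policy from the cover letter: switch in with everything disabled. */
static void switch_in_lazy(struct task_model *t)
{
    t->enabled = 0;
}

/* Assumed alternative: re-enable whatever the task has used before. */
static void switch_in_eager(struct task_model *t)
{
    t->enabled = t->used;
}

/* Switch the task in 'switches' times; it uses 'fac' once per timeslice. */
unsigned long count_faults(bool lazy, unsigned fac, int switches)
{
    struct task_model t = {0};

    for (int i = 0; i < switches; i++) {
        if (lazy)
            switch_in_lazy(&t);
        else
            switch_in_eager(&t);
        task_use(&t, fac);
    }
    return t.faults;
}
```

In this model a task that touches FP in every timeslice takes one exception
per switch under the lazy policy, and only the first one under the assumed
eager policy. The model ignores the cost of the extra register loads that the
eager policy performs for tasks that would not have used the facility again.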

I do plan to carry on with optimising in this area and will have a look to see
what I can do.

Cyril
It would need to be linked to the data of the thread
that owns the fpu registers so that a save request could
be honoured.
Pre-emption would probably need to be disabled, but nested
use, and use from ISR should be ok.

	David
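
David's in-stack save area idea can be sketched as a toy userspace model. This
is only an illustration under stated assumptions: live_regs stands in for the
hardware register file, preempt_depth for disabled pre-emption, and
borrow_begin/borrow_end/owner_save are hypothetical names, not kernel APIs.
The chain of active save areas is what lets a save request for the owning
thread be honoured while kernel code has borrowed some registers.

```c
#include <assert.h>
#include <string.h>

#define NREGS      32
#define MAX_BORROW 4   /* "a small number of registers" */

double live_regs[NREGS];  /* stand-in for the FP register file */
int preempt_depth;        /* stand-in for a pre-emption disable count */

/* Save area meant to live on the borrower's stack. */
struct borrow_area {
    struct borrow_area *outer;  /* enclosing borrow, if nested */
    int first;                  /* first borrowed register */
    int count;                  /* number of borrowed registers */
    double saved[MAX_BORROW];   /* values found when the borrow began */
};

struct borrow_area *active;     /* innermost active borrow */

/* Save 'count' registers starting at 'first' and link the area in. */
void borrow_begin(struct borrow_area *a, int first, int count)
{
    assert(count > 0 && count <= MAX_BORROW);
    assert(first >= 0 && first + count <= NREGS);
    preempt_depth++;
    a->first = first;
    a->count = count;
    memcpy(a->saved, &live_regs[first], (size_t)count * sizeof(double));
    a->outer = active;
    active = a;
}

/* Restore the borrowed registers; borrows must end in LIFO order. */
void borrow_end(struct borrow_area *a)
{
    assert(active == a);
    memcpy(&live_regs[a->first], a->saved, (size_t)a->count * sizeof(double));
    active = a->outer;
    preempt_depth--;
}

/*
 * Honour a save request for the owning thread: start from the live
 * registers, then overlay each save area from innermost to outermost so
 * that the outermost (the owner's real values) wins.
 */
void owner_save(double out[NREGS])
{
    memcpy(out, live_regs, sizeof(live_regs));
    for (struct borrow_area *a = active; a != NULL; a = a->outer)
        memcpy(&out[a->first], a->saved, (size_t)a->count * sizeof(double));
}
```

A caller would bracket its register use with borrow_begin() and borrow_end()
on a struct borrow_area declared as a local variable. The sketch says nothing
about the volatile/non-volatile split or the actual save and restore cost on
real hardware.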