On Friday 16 September 2005 12:40, luothing@sina.com wrote:
<pre>
> Hi, Blaisorblade:
> I have spend some time to test recent uml 2.6.13, and find that
> recent uml have improved at some aspect, but still have some potential
> problem at syscall, mainly because weakly check, the attchement
> include my test result, also some general reason, please check it.
Ok, I've verified the results you described (with the old 20041104 release, but
the tests you mention either had the same problems or didn't exist in that
release), and debugged them (yes, in TT mode, with GDB 6.3, which was quite a
surprise since GDB hasn't been working there - it seems that my fix for
SKAS mode, i.e. disabling the early execve(), fixes this too - I'll check
this now).
Patches are attached against 2.6.13 - apply uml-fault-micro-cleanups before
the rest. They'll all be included in 2.6.13-bs1. Btw, what's your name? I'd
like to credit you in the changelog.
I also fixed the modify_ldt01 problem (a trivial missing break;). modify_ldt02
doesn't create host problems here (but it's an older release) - it just
doesn't work in SKAS0 (which is expected right now, since we use the
unimplemented PTRACE_LDT, but it will be fixed). I had already fixed
mprotect02 previously (not yet committed).
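For the record, this is the class of bug behind modify_ldt01, as a minimal
standalone sketch. It is not the actual UML code: handle_ldt_buggy(),
handle_ldt_fixed() and the LDT_READ/LDT_WRITE codes are made up for
illustration. It only shows how a missing break; lets a successful case fall
through into the error path.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical function codes, for illustration only. */
enum { LDT_READ = 0, LDT_WRITE = 1 };

/* Buggy variant: LDT_WRITE succeeds, then falls into the default case. */
int handle_ldt_buggy(int func)
{
    int ret = -ENOSYS;

    switch (func) {
    case LDT_READ:
        ret = 0;
        break;
    case LDT_WRITE:
        ret = 0;
        /* BUG: the break; is missing here */
        /* falls through */
    default:
        ret = -ENOSYS;
    }
    return ret;
}

/* Fixed variant: every case ends with a break. */
int handle_ldt_fixed(int func)
{
    int ret = -ENOSYS;

    switch (func) {
    case LDT_READ:
        ret = 0;
        break;
    case LDT_WRITE:
        ret = 0;
        break;
    default:
        ret = -ENOSYS;
    }
    return ret;
}

/* Usage: the buggy variant reports an error for a valid request. */
void demo_ldt_checks(void)
{
    assert(handle_ldt_buggy(LDT_WRITE) == -ENOSYS);
    assert(handle_ldt_fixed(LDT_WRITE) == 0);
    assert(handle_ldt_fixed(42) == -ENOSYS);
}
```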
The problem I found is that on general protection faults (not page faults)
from kernel space, we forget to handle that particular case (we refuse to take
a GPF from userspace, but forget to do the same for kernelspace). And when you
pass -1, it gives a protection fault (either because it reads 0 or because it
exceeds the segment limits).
Instead, we use the CR2 from the host (which, for general protection faults,
is undefined) and try to fix a page fault on it.
Since that address is mapped (it was there from a previous page fault,
probably), we "succeed" and finish the handling; the instruction is then
retried, and we fail.
The same thing would happen in SKAS mode too, but in SKAS mode we walk the
page tables first, and access_ok disallows this from the very beginning.
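The TT-mode failure described above can be sketched as follows. This is a
simplified model, not the code from the attached patches: struct fault_info,
addr_is_mapped() and both handlers are hypothetical names, and the "mapped"
address is an arbitrary example. The only hard facts used are the x86 trap
numbers (13 for general protection, 14 for page fault) and that CR2 is only
meaningful for page faults.

```c
#include <assert.h>
#include <stdbool.h>

/* x86 trap numbers: 13 = general protection fault, 14 = page fault. */
enum { TRAP_GP = 13, TRAP_PF = 14 };

enum fault_result { FAULT_FIXED, FAULT_BAD_ACCESS };

/* Hypothetical summary of what the host reports for a fault. */
struct fault_info {
    int trap_no;        /* trap number from the host */
    unsigned long cr2;  /* only meaningful when trap_no == TRAP_PF */
    bool from_kernel;   /* fault raised while running kernel code */
};

/* Hypothetical stand-in for "is this address currently mapped?". */
static bool addr_is_mapped(unsigned long addr)
{
    return addr == 0x08048000UL;
}

/* Buggy: a GPF is refused for userspace only, so a kernel-space GPF
 * is handled as a page fault on whatever stale value cr2 holds. */
enum fault_result handle_fault_buggy(const struct fault_info *fi)
{
    if (!fi->from_kernel && fi->trap_no != TRAP_PF)
        return FAULT_BAD_ACCESS;
    return addr_is_mapped(fi->cr2) ? FAULT_FIXED : FAULT_BAD_ACCESS;
}

/* Fixed: anything that is not a page fault never looks at cr2. */
enum fault_result handle_fault_fixed(const struct fault_info *fi)
{
    if (fi->trap_no != TRAP_PF)
        return FAULT_BAD_ACCESS;
    return addr_is_mapped(fi->cr2) ? FAULT_FIXED : FAULT_BAD_ACCESS;
}

/* Usage: a kernel-space GPF with a stale but mapped cr2. */
void demo_fault_checks(void)
{
    struct fault_info gpf = {
        .trap_no = TRAP_GP, .cr2 = 0x08048000UL, .from_kernel = true
    };

    /* The buggy handler "succeeds", so the instruction gets retried. */
    assert(handle_fault_buggy(&gpf) == FAULT_FIXED);
    /* The fixed handler reports the bad access right away. */
    assert(handle_fault_fixed(&gpf) == FAULT_BAD_ACCESS);
}
```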
In fact, beyond this problem, we also fail to check whether the faulting
address is under TASK_SIZE in TT mode on read accesses:
#define access_ok_tt(type, addr, size) \
	((type == VERIFY_READ) || (segment_eq(get_fs(), KERNEL_DS)) || \
	 (((unsigned long) (addr) <= ((unsigned long) (addr) + (size))) && \
	  (under_task_size(addr, size) || is_stack(addr, size))))
See "(type == VERIFY_READ) || (do some real testing)"? That's totally bogus.
Jeff, what's that for? Not only can the user read kernel memory on its own,
we turn that into a feature and allow it as a syscall parameter too?
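To make the hole concrete, here is a standalone model of the check. It is an
assumption-laden sketch, not the kernel macro: DEMO_TASK_SIZE is an arbitrary
value, range_ok() is a made-up stand-in for under_task_size(), and the
KERNEL_DS and is_stack() parts are left out. access_ok_strict() is only one
possible stricter check for comparison, not a committed fix.

```c
#include <assert.h>
#include <stdbool.h>

enum { VERIFY_READ = 0, VERIFY_WRITE = 1 };

/* Assumed value for illustration; the real TASK_SIZE depends on the
 * configuration. */
#define DEMO_TASK_SIZE 0xa0000000UL

/* Made-up stand-in for under_task_size(): the range must not wrap
 * around and must end at or below DEMO_TASK_SIZE. */
static bool range_ok(unsigned long addr, unsigned long size)
{
    unsigned long end = addr + size;

    return addr <= end && end <= DEMO_TASK_SIZE;
}

/* Same shape as the macro: reads are accepted before any range test. */
bool access_ok_lax(int type, unsigned long addr, unsigned long size)
{
    return type == VERIFY_READ || range_ok(addr, size);
}

/* Possible stricter variant: reads get the same range test as writes. */
bool access_ok_strict(int type, unsigned long addr, unsigned long size)
{
    (void) type;
    return range_ok(addr, size);
}

/* Usage: a read from above DEMO_TASK_SIZE, and a read from -1. */
void demo_access_checks(void)
{
    assert(access_ok_lax(VERIFY_READ, 0xc0000000UL, 4));
    assert(!access_ok_lax(VERIFY_WRITE, 0xc0000000UL, 4));
    assert(!access_ok_strict(VERIFY_READ, 0xc0000000UL, 4));
    assert(!access_ok_strict(VERIFY_READ, ~0UL, 4));
}
```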
Waiting for an answer before fixing.
--
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade