Thread (140 messages) 140 messages, 21 authors, 2018-12-04

Re: [PATCH 10/17] prmem: documentation

From: Andy Lutomirski <luto@amacapital.net>
Date: 2018-11-13 18:36:09
Also in: linux-doc, linux-integrity, lkml

On Tue, Nov 13, 2018 at 10:26 AM Igor Stoppa [off-list ref] wrote:
On 13/11/2018 19:16, Andy Lutomirski wrote:
quoted
should be
entirely abstracted away by an appropriate API, so neither SELinux nor
IMA need to be aware that there's an mm_struct involved.
Yes, that is fine. In my proposal I was thinking about tying it to the
core/thread that performs the actual write.

The high level API could be something like:

wr_memcpy(void *src, void *dst, uint_t size)
quoted
 It's also
entirely possible that some architectures won't even use an mm_struct
behind the scenes -- x86, for example, could have avoided it if there
were a kernel equivalent of PKRU.  Sadly, there isn't.
The mm_struct - or whatever is the means to do the write on that
architecture - can be kept hidden from the API.

But the reason why I was proposing to have one mm_struct per writer is
that, iiuc, the secondary mapping is created in the secondary mm_struct
for each writer using it.

So the updating of IMA measurements would have, theoretically, also
write access to the SELinux AVC. Which I was trying to avoid.
And similarly any other write rare updater. Is this correct?
If you call a wr_memcpy() function with the signature you suggested,
then you can overwrite any memory of this type.  Having a different
mm_struct under the hood makes no difference.  As far as I'm
concerned, for *dynamically allocated* rare-writable memory, you might
as well allocate all the paging structures at the same time, so the
mm_struct will always contain the mappings.  If there are serious bugs
in wr_memcpy() that cause it to write to the wrong place, we have
bigger problems.

I can imagine that we'd want a *typed* wr_memcpy()-like API some day,
but that can wait for v2.  And it still doesn't obviously need
multiple mm_structs.
quoted
quoted
2) Iiuc, the purpose of the 2 pages being remapped is that the target of
the patch might spill across the page boundary, however if I deal with
the modification of generic data, I shouldn't (shouldn't I?) assume that
the data will not span across multiple pages.
The reason for the particular architecture of text_poke() is to avoid
memory allocation to get it working.  i think that prmem/rare_write
should have each rare-writable kernel address map to a unique user
address, possibly just by offsetting everything by a constant.  For
rare_write, you don't actually need it to work as such until fairly
late in boot, since the rare_writable data will just be writable early
on.
Yes, that is true. I think it's safe to assume, from an attack pattern,
that as long as user space is not started, the system can be considered
ok. Even user-space code run from initrd should be ok, since it can be
bundled (and signed) as a single binary with the kernel.

Modules loaded from a regular filesystem are a bit more risky, because
an attack might inject a rogue key in the key-ring and use it to load
malicious modules.
If a malicious module is loaded, the game is over.
quoted
quoted
If the data spans across multiple pages, in unknown amount, I suppose
that I should not keep interrupts disabled for an unknown time, as it
would hurt preemption.

What I thought, in my initial patch-set, was to iterate over each page
that must be written to, in a loop, re-enabling interrupts in-between
iterations, to give pending interrupts a chance to be served.

This would mean that the data being written to would not be consistent,
but it's a problem that would have to be addressed anyways, since it can
be still read by other cores, while the write is ongoing.
This probably makes sense, except that enabling and disabling
interrupts means you also need to restore the original mm_struct (most
likely), which is slow.  I don't think there's a generic way to check
whether in interrupt is pending without turning interrupts on.
The only "excuse" I have is that write_rare is opt-in and is "rare".
Maybe the enabling/disabling of interrupts - and the consequent switch
of mm_struct - could be somehow tied to the latency configuration?

If preemption is disabled, the expectations on the system latency are
anyway more relaxed.

But I'm not sure how it would work against I/O.
I think it's entirely reasonable for the API to internally break up
very large memcpys.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help