Re: [RFC PATCH v2 16/27] mm: Modify can_follow_write_pte/pmd for shadow stack
From: Dave Hansen <dave.hansen@linux.intel.com>
Date: 2018-07-19 00:06:39
Also in:
linux-api, linux-arch, linux-mm, lkml
>>> -static inline bool can_follow_write_pte(pte_t pte, unsigned int flags)
>>> +static inline bool can_follow_write(pte_t pte, unsigned int flags,
>>> +				    struct vm_area_struct *vma)
>>>  {
>>> -	return pte_write(pte) ||
>>> -		((flags & FOLL_FORCE) && (flags & FOLL_COW) && pte_dirty(pte));
>>> +	if (!is_shstk_mapping(vma->vm_flags)) {
>>> +		if (pte_write(pte))
>>> +			return true;
>>
>> Let me see if I can say this another way.
>>
>> The bigger issue is that these patches change the semantics of
>> pte_write(). Before these patches, it meant that you *MUST* have this
>> bit set to write to the page controlled by the PTE. Now, it means: you
>> can write if this bit is set *OR* the shadowstack bit combination is set.
>
> Here, we only figure out (1) if the page is pointed to by a writable PTE;
> or (2) if the page is pointed to by a RO PTE (data or SHSTK), it has
> been copied, and the copy still exists. We are not trying to determine
> if the SHSTK PTE is writable (we know it is not).

Please think about the big picture. I'm not just talking about this patch, but about every use of pte_write() in the kernel.

>> That's the fundamental problem. We need some code in the kernel that
>> logically represents the concept of "is this PTE a shadowstack PTE or a
>> PTE with the write bit set", and we will call that pte_write(), or maybe
>> pte_writable(). You *have* to somehow rectify this situation. We can
>> absolutely not leave pte_write() in its current, ambiguous state where
>> it has no real meaning or where it is used to mean _both_ things
>> depending on context.
>
> True, the processor can always write to a page through a shadow stack
> PTE, but it must do that with a CALL instruction. Can we define a
> write operation as: MOV r1, *(r2). Then we don't have any doubt on
> pte_write() any more.

No, we can't just move the target. :) You can define it this way, but then you also need to go to every spot in the kernel that calls pte_write() (and _PAGE_RW in fact) and audit it to ensure it means "mov ..." and not push.
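
To make the two meanings discussed above concrete, here is a minimal user-space sketch, not kernel code and not part of the patch set. It assumes a shadow stack PTE is encoded as "write bit clear, dirty bit set"; the thread only refers to "the shadowstack bit combination", so that encoding, along with every toy_* name and the bit positions, is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a PTE; the real pte_t lives in the kernel's arch headers. */
typedef struct { uint64_t val; } toy_pte_t;

/* Hypothetical bit positions, chosen to mirror x86 (RW = bit 1, Dirty = bit 6). */
#define TOY_PAGE_RW    (UINT64_C(1) << 1)
#define TOY_PAGE_DIRTY (UINT64_C(1) << 6)

/* Narrow meaning: an ordinary store ("mov") may write through this PTE. */
static inline bool toy_pte_write(toy_pte_t pte)
{
	return (pte.val & TOY_PAGE_RW) != 0;
}

/* Assumed shadow stack encoding: RW clear and Dirty set. */
static inline bool toy_pte_shstk(toy_pte_t pte)
{
	return (pte.val & (TOY_PAGE_RW | TOY_PAGE_DIRTY)) == TOY_PAGE_DIRTY;
}

/*
 * Broad meaning: the CPU can modify the page by some means, either an
 * ordinary store or a shadow stack access such as CALL.
 */
static inline bool toy_pte_writable(toy_pte_t pte)
{
	return toy_pte_write(pte) || toy_pte_shstk(pte);
}
```

With two separately named predicates, each existing caller would have to pick one explicitly, which is the audit described above: a site that means "mov can write here" uses the narrow check, and a site that means "the contents of this page can change" uses the broad one.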