Re: Any reason to use put_page in slub.c?
From: Glauber Costa <hidden>
Date: 2012-08-01 12:45:58
On 07/31/2012 06:52 PM, James Bottomley wrote:
On Tue, 2012-07-31 at 09:31 -0500, Christoph Lameter wrote:quoted
On Tue, 31 Jul 2012, Glauber Costa wrote:quoted
On 07/31/2012 06:17 PM, Christoph Lameter wrote:quoted
On Tue, 31 Jul 2012, Glauber Costa wrote:quoted
On 07/31/2012 06:09 PM, Christoph Lameter wrote:quoted
That is understood. Typically these object where page sized though and various assumptions (pretty dangerous ones as you are finding out) are made regarding object reuse. The fallback of SLUB for higher order allocs to the page allocator avoids these problems for higher order pages.omg...I would be very thankful if you would go through the tree and check for any remaining use cases like that. Would take care of your problem.I would be happy to do it. Do you have any example of any user that behaved like this in the past, so I can search for something similar? This can potentially take many forms, and auditing every kfree out there is not humanly possible. The best I can do is to search for known patterns here...The basic problem is that someone will take the address of an object that is allocated via slab and then access the page struct to increase the page count. So you would see page = virt_to_page(<slab_object>); get_page(page); The main cuprit in the past has been the DMA code in the SCSI layer. I think it was the first 512 byte control block for the device that was the main issue. There was a discussion betwen Hugh Dickins and me when SLUB was first released about this issue and it resulted in some changes so that certain fields in the page struct were not touched by SLUB since they were needed for I/O.Hey, don't try to pin this on me. We don't use get_page() at all on the ordinary DMA route. There are four get_page() calls in the whole of drivers/scsi. One is in the sg.c fault path, which looks genuine. The other three are in fcoe and iSCSI ... what they're trying to do is to ensure that the page hangs around until the device sees the data in a network tx path. I can't see why any of these pages would come from kmalloc() or any other slab object since they should all be user pages. James -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
On 07/31/2012 06:52 PM, James Bottomley wrote:
On Tue, 2012-07-31 at 09:31 -0500, Christoph Lameter wrote:quoted
On Tue, 31 Jul 2012, Glauber Costa wrote:quoted
On 07/31/2012 06:17 PM, Christoph Lameter wrote:quoted
On Tue, 31 Jul 2012, Glauber Costa wrote:quoted
On 07/31/2012 06:09 PM, Christoph Lameter wrote:quoted
That is understood. Typically these object where page sized though and various assumptions (pretty dangerous ones as you are finding out) are made regarding object reuse. The fallback of SLUB for higher order allocs to the page allocator avoids these problems for higher order pages.omg...I would be very thankful if you would go through the tree and check for any remaining use cases like that. Would take care of your problem.I would be happy to do it. Do you have any example of any user that behaved like this in the past, so I can search for something similar? This can potentially take many forms, and auditing every kfree out there is not humanly possible. The best I can do is to search for known patterns here...The basic problem is that someone will take the address of an object that is allocated via slab and then access the page struct to increase the page count. So you would see page = virt_to_page(<slab_object>); get_page(page); The main cuprit in the past has been the DMA code in the SCSI layer. I think it was the first 512 byte control block for the device that was the main issue. There was a discussion betwen Hugh Dickins and me when SLUB was first released about this issue and it resulted in some changes so that certain fields in the page struct were not touched by SLUB since they were needed for I/O.Hey, don't try to pin this on me. We don't use get_page() at all on the ordinary DMA route. There are four get_page() calls in the whole of drivers/scsi. One is in the sg.c fault path, which looks genuine. The other three are in fcoe and iSCSI ... what they're trying to do is to ensure that the page hangs around until the device sees the data in a network tx path. I can't see why any of these pages would come from kmalloc() or any other slab object since they should all be user pages.
I've audited all users of get_page() in the drivers/ directory for patterns like this. In general, they kmalloc something like a table of entries, and then get_page() the entries. The entries are either user pages, pages allocated by the page allocator, or physical addresses through their pfn (in 2 cases from the vga ones...) I took a look about some other instances where virt_to_page occurs together with kmalloc as well, and they all seem to fall in the same category. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>