Re: [PATCH] Avoiding fragmentation through different allocator

From: Marcelo Tosatti <hidden>
Date: 2005-01-24 19:32:27
Also in: lkml

On Mon, Jan 24, 2005 at 10:44:12AM -0600, James Bottomley wrote:

On Mon, 2005-01-24 at 10:29 -0200, Marcelo Tosatti wrote:

Since the pages which compose IO operations are most likely sparse (not physically contiguous),
the driver+device has to perform scatter-gather IO on the pages. 

The idea is that if we can have larger memory blocks, scatter-gather IO can use fewer SG list
elements (decreased CPU overhead, decreased device overhead, faster).

The best case is where only one SG element is required (i.e. one huge physically contiguous block).

Old devices and unprepared drivers which are not able to perform SG IO
suffer from sequential small-sized operations.

I'm far from being a SCSI/ATA expert; the storage people can
help with expertise here.

Grant Grundler and James Bottomley have been working in this area; they might want to
add some comments to this discussion.

It seems HP (Grant et al.) has pursued using big pages on IA64 (64K) for this purpose.

Well, the basic advice would be not to worry too much about
fragmentation from the point of view of I/O devices.  They mostly all do
scatter gather (SG) onboard as an intelligent processing operation and
they're very good at it.

So is it valid to affirm that on average an operation with one SG element pointing to a 1MB 
region is similar in speed to an operation with 16 SG elements each pointing to a 64K 
region due to the efficient onboard SG processing?

No one has ever really measured an effect of which we can say "This is
due to the card's SG engine".  So, the rule we tend to follow is that if SG element
reduction comes for free, we take it.  The issue that actually causes
problems isn't the reduction in processing overhead, it's that the
device's SG list is usually finite in size and so it's worth conserving
if we can; however it's mostly not worth conserving at the expense of
processor cycles.

The bottom line is that the I/O (block) subsystem is very efficient at
coalescing (both in block space and in physical memory space) and we've
got it to the point where it's about as efficient as it can be.  If
you're going to give us better physical contiguity properties, we'll
take them, but if you spend extra cycles doing it, the chances are
you'll slow down the I/O throughput path.
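
To make the coalescing point concrete, here is a minimal sketch of merging
physically adjacent pages into SG elements. It is not the kernel block layer
code: the names sg_elem, sg_coalesce and SKETCH_PAGE_SIZE are hypothetical,
and the 4K page size and per-element size cap are assumptions chosen only
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size, illustration only */

/* Hypothetical SG element: one physically contiguous run of bytes. */
struct sg_elem {
    uint64_t phys_addr;
    uint32_t length;
};

/* Store one finished element if there is room; always count it. */
static void sg_emit(struct sg_elem *out, size_t out_cap, size_t *n,
                    uint64_t addr, uint32_t len)
{
    if (*n < out_cap) {
        out[*n].phys_addr = addr;
        out[*n].length = len;
    }
    (*n)++;
}

/*
 * Walk the page frame numbers of a request in I/O order and merge
 * neighbours that are physically adjacent, never letting one element
 * grow beyond max_seg_bytes. Returns the number of SG elements the
 * request needs; at most out_cap of them are written to out.
 */
size_t sg_coalesce(const uint64_t *pfns, size_t npages,
                   uint32_t max_seg_bytes,
                   struct sg_elem *out, size_t out_cap)
{
    size_t n = 0;
    uint64_t cur_addr = 0;
    uint32_t cur_len = 0;

    assert(max_seg_bytes >= SKETCH_PAGE_SIZE);

    for (size_t i = 0; i < npages; i++) {
        uint64_t addr = pfns[i] * SKETCH_PAGE_SIZE;

        /* Extend the current run if this page follows it directly. */
        if (cur_len > 0 && addr == cur_addr + cur_len &&
            max_seg_bytes - cur_len >= SKETCH_PAGE_SIZE) {
            cur_len += SKETCH_PAGE_SIZE;
            continue;
        }

        /* Otherwise close the current run and start a new one. */
        if (cur_len > 0)
            sg_emit(out, out_cap, &n, cur_addr, cur_len);
        cur_addr = addr;
        cur_len = SKETCH_PAGE_SIZE;
    }

    if (cur_len > 0)
        sg_emit(out, out_cap, &n, cur_addr, cur_len);
    return n;
}
```

With four adjacent pages the request collapses to a single element, while
four scattered pages need four elements. That is the sense in which better
physical contiguity conserves entries in a device's finite SG list.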

OK! thanks.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .