Module vs Kernel main performance

From: Abu Rasheda <hidden>
Date: 2012-05-31 13:35:08

On Wed, May 30, 2012 at 10:35 PM, Mulyadi Santosa [off-list ref] wrote:

Hi...

On Thu, May 31, 2012 at 4:44 AM, Abu Rasheda [off-list ref] wrote:

As I increase the size of the buffer, insns per cycle keeps decreasing. Here is the data:

  1k  0.90  insns per cycle
  8k  0.43  insns per cycle
 43k  0.18  insns per cycle
100k  0.08  insns per cycle

This suggests that copy_from_user is more efficient when the copied data
is small. Why is that?

You mean, the bigger the buffer, the fewer the instructions, right?

yes

Not sure why, but I am sure it will reach some peak point.

Anyway, you did kmalloc and then kfree()? I think that's why...bigger
buffer will grab large chunk from slab...and again likely it's
physically contiguous. Also, it will be placed in the same cache line.

Whereas the smaller one....will hit allocate/free cycle more...thus
flushing the L1/L2 cache even more.

It seems to be doing the opposite: the bigger the allocation / copy, the longer the stall.
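
For anyone who wants to reproduce the trend outside the kernel, here is a rough userspace sketch. It is only an analogue: it times plain memcpy() between two malloc'd buffers, not copy_from_user() on kmalloc'd memory, and it reports elapsed time rather than insns per cycle. The helper names (time_copies, ns_per_byte) are made up for illustration, and reading "1k" as 1024 bytes is an assumption. To get an insns-per-cycle figure, the resulting program would have to be run under a hardware counter tool such as perf stat.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime() under strict -std= modes */
#endif

#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Buffer sizes from the data above; "1k" is assumed to mean 1024 bytes. */
static const size_t copy_sizes[] = { 1024, 8 * 1024, 43 * 1024, 100 * 1024 };

/* Monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper: copy `size` bytes `iters` times with memcpy() and
 * store the total elapsed time in *elapsed_ns.
 * Returns 0 on success, -1 if a buffer could not be allocated.
 */
static int time_copies(size_t size, unsigned iters, uint64_t *elapsed_ns)
{
    assert(size > 0 && iters > 0 && elapsed_ns != NULL);

    unsigned char *src = malloc(size);
    unsigned char *dst = malloc(size);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1;
    }
    memset(src, 0xA5, size);

    volatile unsigned char sink = 0;  /* discourages eliding the copies */
    uint64_t start = now_ns();
    for (unsigned i = 0; i < iters; i++) {
        src[0] = (unsigned char)i;    /* make each copy's input differ */
        memcpy(dst, src, size);
        sink ^= dst[i % size];
    }
    *elapsed_ns = now_ns() - start;
    (void)sink;

    free(src);
    free(dst);
    return 0;
}

/* Average cost per copied byte, so different buffer sizes can be compared. */
static double ns_per_byte(uint64_t elapsed_ns, size_t size, unsigned iters)
{
    return (double)elapsed_ns / ((double)size * (double)iters);
}
```

A small main() can loop over copy_sizes, call time_copies() for each size with the same iteration count, and print ns_per_byte() for each. If the per-byte cost climbs once the buffer no longer fits in L1/L2, that would be consistent with the longer stalls described above, though a userspace memcpy() result does not by itself say anything about copy_from_user().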
