How to measure performance inside Kernel?
From: Mulyadi Santosa <hidden>
Date: 2012-02-12 11:46:42
Hi Peter...

On Sat, Feb 11, 2012 at 20:57, Peter Senna Tschudin [off-list ref] wrote:
> Graeme, I found a problem in my code. I was calling kmalloc() only once for both portions of code. As a result, the first loop that accessed the memory was paying a penalty. Now I'm calling a separate kmalloc() for each test.
Sorry for jumping into the middle of the discussion :)

I read your code and I think the kmalloc() usage can be streamlined. I recommend a single kmalloc() call that allocates all the memory needed for the whole q->buf[] array, something like (CMIIW):

    q->buf = kmalloc(sizeof(struct vb_buffer) * q->num_buffers, GFP_KERNEL);

and then access q->buf[1], q->buf[2], etc. This way, AFAIK, the buffers will be not only virtually contiguous but also physically contiguous, since kmalloc() returns physically contiguous memory. That should make prefetching into the L1/L2 cache easier.

--
regards,
Mulyadi Santosa
Freelance Linux trainer and consultant
blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com
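
The single-allocation suggestion can be illustrated with a small user-space sketch. This is only an approximation under stated assumptions: calloc() stands in for kmalloc()/kcalloc(), the fields of struct vb_buffer are made up (the real struct is not shown in this thread), q->buf is assumed to be an array of structs rather than an array of pointers, and the vb_queue_* helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's buffer struct. */
struct vb_buffer {
    unsigned int index;
    size_t       size;
};

struct vb_queue {
    struct vb_buffer *buf;          /* one contiguous array of structs */
    unsigned int      num_buffers;
};

/*
 * Allocate the whole array with one call. In kernel code the equivalent
 * would be kmalloc(sizeof(struct vb_buffer) * n, GFP_KERNEL), or
 * kcalloc(n, sizeof(struct vb_buffer), GFP_KERNEL), which also zeroes
 * the memory and checks the multiplication for overflow.
 */
static int vb_queue_alloc(struct vb_queue *q, unsigned int n)
{
    assert(q != NULL);
    q->buf = calloc(n, sizeof(*q->buf));
    if (!q->buf)
        return -1;
    q->num_buffers = n;
    for (unsigned int i = 0; i < n; i++)
        q->buf[i].index = i;
    return 0;
}

/* Release the array and reset the queue to an empty state. */
static void vb_queue_free(struct vb_queue *q)
{
    free(q->buf);
    q->buf = NULL;
    q->num_buffers = 0;
}

/* Returns 1 if neighbouring elements sit exactly one struct apart. */
static int vb_queue_is_contiguous(const struct vb_queue *q)
{
    for (unsigned int i = 1; i < q->num_buffers; i++) {
        uintptr_t prev = (uintptr_t)&q->buf[i - 1];
        uintptr_t cur  = (uintptr_t)&q->buf[i];
        if (cur - prev != sizeof(struct vb_buffer))
            return 0;
    }
    return 1;
}
```

With one allocation, element i always lives at a fixed offset from q->buf, so a loop over the array walks memory sequentially. Per-buffer allocations give no such guarantee about where each buffer lands.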