what is __get_cpu_var() ?
From: Dave Hylands <hidden>
Date: 2011-02-23 18:30:15
Hi Murali,

On Wed, Feb 23, 2011 at 11:26 AM, Murali N [off-list ref] wrote:
Hi Dave,

On Wed, Feb 23, 2011 at 11:15 AM, Dave Hylands [off-list ref] wrote:
Hi Murali,

On Wed, Feb 23, 2011 at 10:34 AM, Murali N [off-list ref] wrote:
Hi Dave, thanks for your reply.

...snip...
get_cpu_var gives you this CPU's copy of a per-cpu variable (it disables preemption first, so pair it with put_cpu_var). __get_cpu_var contains the actual machine-dependent implementation. It looks like all of the architectures use the one in asm-generic/percpu.h

In general, all of the per-cpu data is gathered together into a section. Multiple sections are allocated (one per CPU). I think that the address of the variable is really the offset within the section, and each allocated section is cache-line aligned. This offset is then added to the "offset for my cpu" to come up with the final address of the variable, which is then dereferenced like an ordinary pointer.

There are lots of extra doo-dads to get around warnings, and to prevent the linker from producing relocation references for the variable access (since it looks like an access of a global variable, but it's really just playing a game with the offset of the variable within the section). So you could think of it as a very fancy offsetof macro.

There are several other macros involved; perhaps you could be a bit more specific about your request?

Dave Hylands

I have one more basic question. Why would we need to maintain structures like this? Is there any advantage we get here?

Primarily for performance reasons. For example, the kernel maintains lots of stats on threads and processes (I haven't looked to see if these are actually maintained on a per-cpu basis, but the concept applies). These stats are updated frequently, but only accessed occasionally. If you have a global "database" of stats, then each CPU needs to lock the data, which creates lots of contention. By keeping stuff per-cpu, the cpus don't need to acquire any locks (or at the very least won't cause as much contention when acquiring per-cpu locks). This becomes especially important when there are lots of cpus. The query functions can then amalgamate the information and present it as if it were maintained in a global database.
So if you have data which is updated frequently and only accessed occasionally, or updated infrequently and accessed frequently, then you might have a case for using per-cpu data. Of course you'd still need to profile it and see if it makes sense. Also keep in mind that some things might not seem to matter much on, say, a dual-core, but could make a considerable difference with, say, 32 cores.

Dave Hylands

So it makes sense to use it if I am running on more cores (> 4)?
It really depends on the access patterns of the data. Whether it makes sense or not is something you'll probably need to profile (i.e. with and without using per-cpu variables).

Dave Hylands