Re: [dm-devel] [PATCH 1/1] block: Convert hd_struct in_flight from atomic to percpu
From: Jens Axboe <axboe@kernel.dk>
Date: 2017-06-28 22:19:11
On 06/28/2017 04:07 PM, Brian King wrote:
> On 06/28/2017 04:59 PM, Jens Axboe wrote:
>> On 06/28/2017 03:54 PM, Jens Axboe wrote:
>>> On 06/28/2017 03:12 PM, Brian King wrote:
>>>> -static inline int part_in_flight(struct hd_struct *part)
>>>> +static inline unsigned long part_in_flight(struct hd_struct *part)
>>>>  {
>>>> -        return atomic_read(&part->in_flight[0]) + atomic_read(&part->in_flight[1]);
>>>> +        return part_stat_read(part, in_flight[0]) + part_stat_read(part, in_flight[1]);
>>>
>>> One obvious improvement would be to not do this twice, but only have to
>>> loop once. Instead of making this an array, make it a structure with a
>>> read and write count.
>>>
>>> It still doesn't really fix the issue of someone running on a kernel
>>> with a ton of possible CPUs configured. But it does reduce the overhead
>>> by 50%.
>>
>> Or something as simple as this:
>>
>> #define part_stat_read_double(part, field1, field2)                    \
>> ({                                                                     \
>>         typeof((part)->dkstats->field1) res = 0;                       \
>>         unsigned int _cpu;                                             \
>>         for_each_possible_cpu(_cpu) {                                  \
>>                 res += per_cpu_ptr((part)->dkstats, _cpu)->field1;     \
>>                 res += per_cpu_ptr((part)->dkstats, _cpu)->field2;     \
>>         }                                                              \
>>         res;                                                           \
>> })
>>
>> static inline unsigned long part_in_flight(struct hd_struct *part)
>> {
>>         return part_stat_read_double(part, in_flight[0], in_flight[1]);
>> }
>
> I'll give this a try and also see about running some more exhaustive
> runs to see if there are any cases where we go backwards in performance.
>
> I'll also run with partitions and see how that impacts this.

And do something nuts, like setting NR_CPUS to 512 or whatever. What do
distros ship with?

-- 
Jens Axboe
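
For readers following the thread, the single-pass summation quoted above can be sketched in plain userspace C. This is only an illustration, not kernel code: `NR_CPUS_SKETCH`, `struct disk_stats_sketch`, `stats_sketch`, and the helper functions are hypothetical names, an ordinary array stands in for real per-CPU storage, and treating index 0 as reads and index 1 as writes is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the number of possible CPUs. */
#define NR_CPUS_SKETCH 4

/* Hypothetical per-CPU counters; index 0 = reads, 1 = writes (assumed). */
struct disk_stats_sketch {
    long in_flight[2];
};

/* Plain array standing in for per-CPU storage (not the kernel API). */
static struct disk_stats_sketch stats_sketch[NR_CPUS_SKETCH];

/* The caller names the CPU slot explicitly in this single-threaded sketch. */
static void in_flight_inc(size_t cpu, int rw)
{
    assert(cpu < NR_CPUS_SKETCH);
    stats_sketch[cpu].in_flight[rw]++;
}

/*
 * A request may complete on a different CPU than it started on, so an
 * individual slot can go negative; only the sum across slots is meaningful.
 */
static void in_flight_dec(size_t cpu, int rw)
{
    assert(cpu < NR_CPUS_SKETCH);
    stats_sketch[cpu].in_flight[rw]--;
}

/* Sum both counters in one walk over the slots instead of two. */
static unsigned long in_flight_sum(void)
{
    long res = 0;

    for (size_t cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
        res += stats_sketch[cpu].in_flight[0];
        res += stats_sketch[cpu].in_flight[1];
    }
    return (unsigned long)res;
}
```

The cost of `in_flight_sum()` still grows with the number of slots, which is the concern about large NR_CPUS values raised in the thread; the single pass only halves the number of walks.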