Re: [PATCH for-next 4/4] block/rnbd: Remove all likely and unlikely
From: Gioh Kim <hidden>
Date: 2021-05-04 13:05:11
On Thu, Apr 29, 2021 at 9:14 AM Gioh Kim [off-list ref] wrote:
> On Wed, Apr 28, 2021 at 8:33 PM Chaitanya Kulkarni [off-list ref] wrote:
>> On 4/27/21 23:14, Gioh Kim wrote:
>>> The IO performance test with fio after removing the likely and
>>> unlikely macros in all if-statement shows no performance drop.
>>> They do not help for the performance of rnbd.
>>>
>>> The fio test did random read on 32 rnbd devices and 64 processes.
>>> Test environment:
>>> - AMD Opteron(tm) Processor 6386 SE
>>> - 125G memory
>>> - kernel version: 5.4.86
>>
>> why 5.4 and not linux-block/for-next ?
>
> We have done porting only 5.4 for the server machine yet.
>
>>> - gcc version: gcc (Debian 8.3.0-6) 8.3.0
>>> - Infiniband controller: InfiniBand: Mellanox Technologies MT26428
>>>   [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
>>>
>>> before
>>> read: IOPS=549k, BW=2146MiB/s
>>> read: IOPS=544k, BW=2125MiB/s
>>> read: IOPS=553k, BW=2158MiB/s
>>> read: IOPS=535k, BW=2089MiB/s
>>> read: IOPS=543k, BW=2122MiB/s
>>> read: IOPS=552k, BW=2154MiB/s
>>> average: IOPS=546k, BW=2132MiB/s
>>>
>>> after
>>> read: IOPS=556k, BW=2172MiB/s
>>> read: IOPS=561k, BW=2191MiB/s
>>> read: IOPS=552k, BW=2156MiB/s
>>> read: IOPS=551k, BW=2154MiB/s
>>> read: IOPS=562k, BW=2194MiB/s
>>> -----------
>>> average: IOPS=556k, BW=2173MiB/s
>>>
>>> The IOPS and bandwidth got better slightly after
>>> removing likely/unlikely. (IOPS= +1.8% BW= +1.9%)
>>> But we cannot make sure that removing the likely/unlikely
>>> help the performance because it depends on various situations.
>>> We only make sure that removing the likely/unlikely
>>> does not drop the performance.
>>
>> Did you get a chance to collect perf numbers to see which functions are
>> getting faster ?
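The averages and percentages quoted above can be re-derived from the individual fio runs. The following is only a minimal sketch for checking the arithmetic; the helper names (`averages`, `percent_change`) are made up for illustration and are not part of fio or the patch.

```python
# (IOPS in thousands, bandwidth in MiB/s), copied from the fio runs quoted above.
before = [(549, 2146), (544, 2125), (553, 2158),
          (535, 2089), (543, 2122), (552, 2154)]
after = [(556, 2172), (561, 2191), (552, 2156),
         (551, 2154), (562, 2194)]


def averages(runs):
    """Return (mean IOPS, mean bandwidth) for a list of runs."""
    iops = sum(r[0] for r in runs) / len(runs)
    bw = sum(r[1] for r in runs) / len(runs)
    return iops, bw


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


b_iops, b_bw = averages(before)
a_iops, a_bw = averages(after)

print(f"before: IOPS={b_iops:.1f}k, BW={b_bw:.1f}MiB/s")
print(f"after:  IOPS={a_iops:.1f}k, BW={a_bw:.1f}MiB/s")
print(f"change: IOPS={percent_change(b_iops, a_iops):+.1f}% "
      f"BW={percent_change(b_bw, a_bw):+.1f}%")
```

With the unrounded averages both changes come out at about +1.9%. The quoted +1.8% for IOPS matches what one gets from the rounded averages (546k and 556k). Either way the difference is small, and whether it exceeds run-to-run noise is not something this arithmetic can tell.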
Hi Chaitanya,
I ran the perf tool to find out which functions are getting faster,
but I was not able to identify any.
Could you please suggest a tool or a method to check this?
For your information, below is what I got with 'perf record fio
<options:8-device, 64-job, 60-second>'
The result before/after removing likely/unlikely looks the same.
4.15% fio [kernel.kallsyms] [k] _raw_spin_lock_irqsave
3.19% fio [kernel.kallsyms] [k] x86_pmu_disable_all
2.98% fio [rnbd_client] [k] rnbd_put_permit
2.77% fio [kernel.kallsyms] [k] find_first_zero_bit
2.49% fio [kernel.kallsyms] [k] __x86_indirect_thunk_rax
2.21% fio [kernel.kallsyms] [k] psi_task_change
2.00% fio [kernel.kallsyms] [k] gup_pgd_range
1.83% fio fio [.] 0x0000000000029048
1.78% fio [rnbd_client] [k] rnbd_get_permit
1.78% fio fio [.] axmap_isset
1.63% fio [kernel.kallsyms] [k] _raw_spin_lock
1.58% fio fio [.] fio_gettime
1.53% fio [rtrs_client] [k] __rtrs_get_permit
1.51% fio [rnbd_client] [k] rnbd_queue_rq
1.51% fio [rtrs_client] [k] rtrs_clt_put_permit
1.47% fio [kernel.kallsyms] [k] try_to_wake_up
1.31% fio [kernel.kallsyms] [k] kmem_cache_alloc
1.22% fio libc-2.28.so [.] 0x00000000000a2547
1.17% fio [mlx4_ib] [k] _mlx4_ib_post_send
1.14% fio [kernel.kallsyms] [k] blkdev_direct_IO
1.14% fio [kernel.kallsyms] [k] read_tsc
1.02% fio [rtrs_client] [k] rtrs_clt_read_req
0.92% fio [rtrs_client] [k] get_next_path_min_inflight
0.92% fio [kernel.kallsyms] [k] sched_clock
0.91% fio [kernel.kallsyms] [k] blk_mq_get_request
0.87% fio [kernel.kallsyms] [k] x86_pmu_enable_all
0.87% fio [kernel.kallsyms] [k] __sched_text_start
0.84% fio [kernel.kallsyms] [k] insert_work
0.82% fio [kernel.kallsyms] [k] copy_user_generic_string
0.80% fio [kernel.kallsyms] [k] blk_attempt_plug_merge
0.73% fio [rtrs_client] [k] rtrs_clt_update_all_stats
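One possible way to look for per-function differences is to compare the two profiles symbol by symbol. perf has a `perf diff` subcommand for comparing perf.data files directly; the sketch below instead works on text lines in the format shown above, assuming each line is "overhead% command shared-object [type] symbol". The names `parse_report` and `compare` are hypothetical helpers, not part of perf.

```python
import re

# Matches lines such as:
#   2.98% fio [rnbd_client] [k] rnbd_put_permit
LINE_RE = re.compile(
    r"^\s*(\d+(?:\.\d+)?)%\s+(\S+)\s+(\S+)\s+\[(.)\]\s+(.+?)\s*$"
)


def parse_report(text):
    """Return {(shared_object, symbol): overhead_percent} for one report."""
    profile = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip headers, comments, and blank lines
        overhead, _command, dso, _kind, symbol = m.groups()
        key = (dso, symbol)
        profile[key] = profile.get(key, 0.0) + float(overhead)
    return profile


def compare(before, after):
    """Return (delta, shared_object, symbol) rows, largest change first."""
    rows = []
    for key in set(before) | set(after):
        delta = after.get(key, 0.0) - before.get(key, 0.0)
        rows.append((round(delta, 2), key[0], key[1]))
    rows.sort(key=lambda r: (-abs(r[0]), r[1], r[2]))
    return rows


# Three lines taken from the report above.
sample = """\
2.98% fio [rnbd_client] [k] rnbd_put_permit
1.78% fio [rnbd_client] [k] rnbd_get_permit
1.51% fio [rnbd_client] [k] rnbd_queue_rq
"""

profile = parse_report(sample)
rnbd_total = sum(v for (dso, _sym), v in profile.items()
                 if dso == "[rnbd_client]")
print(f"rnbd_client total in sample: {rnbd_total:.2f}%")

# Comparing a profile with itself yields a zero delta for every symbol.
for delta, dso, symbol in compare(profile, profile):
    print(f"{delta:+.2f}% {dso} {symbol}")
```

To use it on real data, one would pass the saved text of the "before" and "after" reports to `parse_report` and feed both results to `compare`. Note that overhead percentages are relative shares of samples, so a small shift in one symbol does not by itself show that the function got faster.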
> I knew somebody would ask for it ;-)
> No, I didn't because I have been occupied with another task.
> But I will check it soon in a few weeks.
> Thank you for the review.