Thread: 10 messages, 5 authors, 2003-03-24

Re: 2.5.65-mm4

From: Martin J. Bligh <hidden>
Date: 2003-03-24 02:53:24
Also in: lkml

. Several ext3 speedups here.  They reduce the overhead of a write() to
  ext3 by about 45%.

. Large locking changes to ext3.  lock_kernel() has been completely
  removed from ext3 and pushed down into the JBD layer, around those bits
  which actually need it.

  Lock contention is greatly reduced, but this change means that the
  front-line locking for ext3 is now two semaphores.  The context switch
  rate under load has gone through the roof.  So there is more work to be
  done here yet.

Well, it shook things up a bit, but doesn't seem to have much effect for
the workload I was looking at, at least:

DISCLAIMER: SPEC(tm) and the benchmark name SDET(tm) are registered
trademarks of the Standard Performance Evaluation Corporation. This 
benchmarking was performed for research purposes only, and the run results
are non-compliant and not-comparable with any published results.

Results are shown as percentages of the first set displayed

SDET 1  (see disclaimer)
                           Throughput    Std. Dev
               2.5.65-mm3       100.0%         2.0%
               2.5.65-mm4        98.9%         1.8%
          2.5.65-mm4-ext3        90.4%         4.1%

SDET 2  (see disclaimer)
                           Throughput    Std. Dev
               2.5.65-mm3       100.0%         2.3%
               2.5.65-mm4        98.3%         3.6%
          2.5.65-mm4-ext3        93.4%         3.1%

SDET 4  (see disclaimer)
                           Throughput    Std. Dev
               2.5.65-mm3       100.0%         1.3%
               2.5.65-mm4        97.8%         0.3%
          2.5.65-mm4-ext3        45.5%         7.1%

SDET 8  (see disclaimer)
                           Throughput    Std. Dev
               2.5.65-mm3       100.0%         1.0%
               2.5.65-mm4        98.7%         1.5%
          2.5.65-mm4-ext3        13.7%         3.3%

SDET 16  (see disclaimer)
                           Throughput    Std. Dev
               2.5.65-mm3       100.0%         1.0%
               2.5.65-mm4        55.2%        57.2%
          2.5.65-mm4-ext3         8.4%         2.5%

SDET 32  (see disclaimer)
                           Throughput    Std. Dev
               2.5.65-mm3       100.0%         0.5%
               2.5.65-mm4        98.2%         0.6%
          2.5.65-mm4-ext3         8.5%         3.7%

SDET 64  (see disclaimer)
                           Throughput    Std. Dev
               2.5.65-mm3       100.0%         0.4%
               2.5.65-mm4        97.7%         0.4%
          2.5.65-mm4-ext3         7.9%         2.2%

SDET 128  (see disclaimer)
                           Throughput    Std. Dev
               2.5.65-mm3       100.0%         0.6%
               2.5.65-mm4        98.0%         0.4%
          2.5.65-mm4-ext3         8.3%         1.2%
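
For anyone reproducing the normalization used in the tables above (each
result shown as a percentage of the first set), here is a minimal Python
sketch. The kernel names and raw throughput values are hypothetical and
chosen only for illustration; the post does not include the raw numbers,
and it does not say how the Std. Dev column is scaled, so the sketch
covers throughput only.

```python
def normalize(results, baseline):
    """Express each mean throughput as a percentage of the baseline's."""
    base = results[baseline]
    return {name: 100.0 * value / base for name, value in results.items()}


# Hypothetical raw throughput numbers, NOT taken from the benchmark runs.
raw = {"kernel-a": 2000.0, "kernel-b": 1978.0, "kernel-c": 1808.0}

for name, pct in normalize(raw, "kernel-a").items():
    print(f"{name:>20} {pct:>10.1f}%")
```

The first entry always comes out as 100.0%, which is why every table
starts with that value.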

profile from SDET 64:

82303 __down
42835 schedule
31323 __wake_up
26435 .text.lock.sched
15924 .text.lock.transaction
6470 do_get_write_access
5106 zap_pte_range
4693 copy_page_range
4522 journal_add_journal_head
4491 __blk_queue_bounce
4179 page_remove_rmap
3859 find_get_page
3687 journal_get_write_access
2949 journal_dirty_metadata
2769 cpu_idle
2691 d_lookup
2495 start_this_handle
2220 __copy_to_user_ll
2199 do_anonymous_page
2168 __find_get_block
2069 page_add_rmap
2063 __find_get_block_slow
1842 .text.lock.attr
1650 do_wp_page
1603 ext3_get_inode_loc
1600 release_pages
1405 journal_stop
1237 find_next_usable_block
1233 do_no_page
1224 current_kernel_time
1203 __brelse
1143 ext3_do_update_inode
1083 kmem_cache_free
1030 kmap_atomic

diffprofile with a spinlined version 
(still on ext3, should just show who took the locks).

     20618    48.1% schedule
      6211    96.0% do_get_write_access
      4485   609.4% journal_start
      3559   253.3% journal_stop
      1554   582.0% inode_change_ok
       605  1680.6% sem_exit
       589    13.0% journal_add_journal_head
       421   825.5% proc_pid_readlink
       301    13.9% __find_get_block
       246  1366.7% sys_ioctl
       186    15.0% find_next_usable_block
       139   195.8% inode_setattr
       122    13.2% atomic_dec_and_lock
...
      -101    -9.3% kmem_cache_free
      -103   -34.0% journal_forget
      -106   -20.7% __make_request
      -117  -100.0% .text.lock.root
      -118    -5.7% page_add_rmap
      -123   -15.2% free_hot_cold_page
      -126    -7.6% do_wp_page
      -127    -3.0% page_remove_rmap
      -130   -17.2% buffered_rmqueue
      -133    -5.3% start_this_handle
      -149   -24.8% scsi_queue_next_request
      -181  -100.0% .text.lock.dec_and_lock
      -209   -10.1% __find_get_block_slow
      -210    -5.7% journal_get_write_access
      -269   -12.1% __copy_to_user_ll
      -318  -100.0% .text.lock.ioctl
      -383    -8.5% __blk_queue_bounce
      -438  -100.0% .text.lock.base
      -736  -100.0% .text.lock.sem
      -903  -100.0% .text.lock.journal
     -1008    -3.2% __wake_up
     -1842  -100.0% .text.lock.attr
     -3472    -4.2% __down
    -14325    -0.6% default_idle
    -15908   -99.9% .text.lock.transaction
    -26435  -100.0% .text.lock.sched
    -30643    -1.2% total
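
The diffprofile columns above are consistent with "delta in ticks" and
"delta as a percentage of the baseline count" (for example, schedule:
20618 / 42835 = 48.1%). A minimal Python sketch of that calculation
follows. It is not the actual diffprofile script; the function names are
made up, and the "spinlined" counts are reconstructed by adding the
posted deltas to the SDET 64 profile.

```python
def parse_profile(text):
    """Parse 'count symbol' lines into a {symbol: count} dict."""
    counts = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        count, symbol = parts
        counts[symbol] = int(count)
    return counts


def diff_profile(base, new):
    """Return (delta, percent, symbol) rows, largest increase first."""
    rows = []
    for symbol in set(base) | set(new):
        before = base.get(symbol, 0)
        after = new.get(symbol, 0)
        delta = after - before
        if delta == 0:
            continue
        # Percent change relative to the baseline count. Assumption: it is
        # left undefined (None) for symbols absent from the baseline.
        percent = 100.0 * delta / before if before else None
        rows.append((delta, percent, symbol))
    rows.sort(key=lambda row: (-row[0], row[2]))
    return rows


# Three entries from the SDET 64 profile in this post.
base = parse_profile("""
82303 __down
42835 schedule
26435 .text.lock.sched
""")

# Reconstructed: baseline count plus the delta shown in the diffprofile.
spinlined = parse_profile("""
78831 __down
63453 schedule
""")

for delta, percent, symbol in diff_profile(base, spinlined):
    pct = "n/a" if percent is None else f"{percent:.1f}%"
    print(f"{delta:>10} {pct:>8} {symbol}")
```

A symbol that disappears entirely, such as .text.lock.sched once the
spinlocks are inlined, shows up as -100.0%, matching the listing above.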

I'll need to put something else together for the semaphores, unless
you already know who's taking them ...

Just for reference, this is the profile from the -mjb1 run with ext3 I did:

22660 .text.lock.inode
2888 .text.lock.namei
2570 .text.lock.sched
2424 .text.lock.attr
860 ext3_prepare_write
498 unmap_all_pages
468 .text.lock.dir
464 ext3_commit_write
448 page_remove_rmap
427 copy_page_range
411 .text.lock.sem
356 schedule
349 __down
302 page_add_rmap
252 find_get_page
252 .text.lock.base
246 d_lookup
225 .text.lock.ioctl
209 __copy_to_user_ll
199 inode_change_ok
196 journal_add_journal_head
187 __wake_up
176 start_this_handle
176 __blk_queue_bounce
175 ext3_setattr
170 do_anonymous_page
156 do_wp_page
145 ext3_dirty_inode
127 __find_get_block
122 ext3_get_block_handle
118 find_next_usable_block
118 do_get_write_access
117 do_no_page
111 ext3_get_inode_loc
107 kmap_atomic
106 do_page_fault
104 __block_prepare_write
101 pte_alloc_one

M.

