Thread (13 messages) 13 messages, 4 authors, 2017-08-14

RE: quota: dqio_mutex design

From: Wang Shilong <hidden>
Date: 2017-08-14 03:29:31
Also in: linux-fsdevel

Hello Jan,

   We have tested your patches, in generally, it helped in our case. Noticed,
our test case is only one user with many process create/remove file.


	4.13.0-rc3 without any patches					
	no Quota		-O quota'		-O quota, project'	
	File Creation	File Unlink	File Creation	File Unlink	File Creation	File Unlink
0	93,068 	         296,028 	            86,860 	285,131 	           85,199 	        189,653 
1	79,501 	         280,921 	            91,079 	277,349 	          186,279 	170,982 
2	79,932 	         299,750 	            90,246 	274,457 	           133,922 	191,677 
3	80,146 	         297,525 	            86,416 	272,160 	           192,354 	198,869 

	4.13.0-rc3/w Jan Kara patch					
	no Quota		-O quota'		-O quota, project'	
	File Creation	File Unlink	File Creation	File Unlink	File Creation	File Unlink  
0	73,057 	        311,217 	         74,898 	          286,120 	 81,217 	        288,138  ops/per second
1	78,872             312,471 	         76,470 	          277,033 	 77,014 	        288,057 
2	79,170             291,440 	         76,174              283,525 	 73,686            283,526 
3	79,941             309,168            78,493              277,331          78,751            281,377 

	4.13.0-rc3/with https://patchwork.ozlabs.org/patch/799014/				
	no Quota		-O quota'		-O quota, project'	
	File Creation	File Unlink	File Creation	File Unlink	File Creation	File Unlink
0	100,319 	        322,746 	         87,480 	         302,579 	         84,569 	        218,969 
1	728,424 	        299,808            312,766           293,471           219,198          199,389 
2	729,410           300,930            315,590           289,664           218,283          197,871 
3	727,555           298,797 	         316,837 	         289,108           213,095          213,458 
						
	4.13.0-rc3/w https://patchwork.ozlabs.org/patch/799014/ + Jan Kara patch					
	no Quota		-O quota'		-O quota, project'	
	File Creation	File Unlink	File Creation	File Unlink	File Creation	File Unlink
0	100,312 	         324,871 	         87,076 	         267,303 	          86,258 	        288,137 
1	707,524 	         298,892           361,963           252,493            421,919 	282,492 
2	707,792            298,162           363,450           264,923            397,723 	283,675 
3	707,420            302,552 	         354,013 	         266,638 	          421,537 	281,763 


In conclusion, your patches helped a lot for our testing, noticed, please ignored test0 running
for creation, the first time testing will loaded inode cache in memory, we used test1-3 to compare.

With extra patch applied, your patches improved File creation(quota+project) 2X, File unlink
1.5X.

Thanks,
Shilong

________________________________________
From: Jan Kara [jack@suse.cz]
Sent: Wednesday, August 09, 2017 0:06
To: Wang Shilong
Cc: Jan Kara; Andrew Perepechko; Shuichi Ihara; Wang Shilong; Li Xi; Ext4 Developers List; linux-fsdevel@vger.kernel.org
Subject: Re: quota: dqio_mutex design

Hi,

On Thu 03-08-17 22:39:51, Wang Shilong wrote:
Please send me patches, we could test and response you!
So I finally have something which isn't obviously wrong (it survives basic
testing and gives me improvements for some workloads). I have pushed out
the patches to:

git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git quota_scaling

I'd be happy if you can share your results with my patches. I have not yet
figured out a safe way to reduce the contention on dq_lock during update of
on-disk structure when lot of processes bang single dquot. I have
experimental patch but it didn't bring any benefit in my testing - I'll
rebase it on top of other patches I have send it to you for some testing.

                                                                Honza
On Thu, Aug 3, 2017 at 10:36 PM, Jan Kara [off-list ref] wrote:
quoted
Hello!

On Thu 03-08-17 19:31:04, Wang Shilong wrote:
quoted
We DDN is investigating the same issue!

Some comments comes:

On Thu, Aug 3, 2017 at 1:52 AM, Andrew Perepechko [off-list ref] wrote:
quoted
quoted
On Tue 01-08-17 15:02:42, Jan Kara wrote:
quoted
Hi Andrew,
I've been experimenting with this today but this idea didn't bring any
benefit in my testing. Was your setup with multiple users or a single user?
Could you give some testing to my patches to see whether they bring some
benefit to you?

                                                              Honza
Hi Jan!

My setup was with a single user. Unfortunately, it may take some time before
I can try a patched kernel other than RHEL6 or RHEL7 with the same test,
we have a lot of dependencies on these kernels.

The actual test we ran was mdtest.

By the way, we had 15+% performance improvement in creates from the
change that was discussed earlier in this thread:

           EXT4_SB(dquot->dq_sb)->s_qf_names[GRPQUOTA]) {
+              if (test_bit(DQ_MOD_B, &dquot->dq_flags))
+                       return 0;
I don't think this is right, as far as i understand, journal quota need go
together with quota space change update inside same transaction, this will
break consistency if power off or RO happen.

Here is some ideas that i have thought:

1) switch dqio_mutex to a read/write lock, especially, i think most of
time journal quota updates is in-place update, that means we don't need
change quota tree in memory, firstly try read lock, retry with write lock if
there is real tree change.

2)another is similar idea of Andrew's walkaround, but we need make correct
fix, maintain dirty list for per transaction, and gurantee quota updates are
flushed when commit transaction, this might be complex, i am not very
familiar with JBD2 codes.

It will be really nice if we could fix this regression, as we see 20% performace
regression.
So I have couple of patches:

1) I convert dqio_mutex do rw semaphore and use it in exclusive mode only
when quota tree is going to change. We also use dq_lock to serialize writes
of dquot - you cannot have two writes happening in parallel as that could
result in stale data being on disk. This patch brings benefit when there
are multiple users - now they don't contend on common lock. It shows
advantage in my testing so I plan to merge these patches. When the
contention is on a structure for single user this change however doesn't
bring much (the performance change is in statistical noise in my testing).

2) I have patches to remove some contention on dq_list_lock by not using
dirty list for tracking dquots in ext4 (and thus avoid dq_list_lock
completely in quota modification path). This does not bring measurable
benefit in my testing even on ramdisk but lockstat data for dq_list_lock
looks much better after this - it seems lock contention just shifted to
dq_data_lock - I'll try to address that as well and see whether I'll be
able to measure some advantage.

3) I have patches to convert dquot dirty bit to sequence counter so that
in commit_dqblk() we can check whether dquot state we wanted to write is
already on disk. Note that this is different from Andrew's approach in that
we do wait for dquot to be actually written before returning. We just don't
repeat the write unnecessarily. However this didn't bring any measurable
benefit in my testing so unless I'll be able to confirm it benefits some
workloads I won't merge this change.

If you can experiment with your workloads, I can send you patches. I'd be
keen on having some performance data from real setups...

                                                                Honza
quoted
Thanks,
Shilong
quoted
               dquot_mark_dquot_dirty(dquot);
               return ext4_write_dquot(dquot);

The idea was that if we know that some thread is somewhere between
mark_dirty and clear_dirty, then we can avoid blocking on dqio_mutex,
since that thread will update the ondisk dquot for us.
--
Jan Kara [off-list ref]
SUSE Labs, CR
--
Jan Kara [off-list ref]
SUSE Labs, CR
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help