Re: [md PATCH 2/5] md: Enable reshape for external metadata
From: Neil Brown <hidden>
Date: 2010-06-17 10:35:19
On Thu, 17 Jun 2010 10:40:36 +0100 "Trela, Maciej" [off-list ref] wrote:
> > > Another thing is waiting during reshape for metadata update on
> > > MD_CHANGE_DEVS flag.
> > > To roll reshape I've added the following code (instead of calling
> > > md_update_sb()):
> >
> > Yes, there is a real issue there...
> > I don't think we ever need the kernel to wait for an external metadata
> > handler to respond to device changes (apart from failure which is
> > handled separately).
> > So maybe the best thing is to guard all settings of MD_CHANGE_DEVS with
> >     if (mddev->persistent)
> >
> > I think that would be best, but I've made a note to review that later.
>
> Neil, from what I see in raid5.c/md.c, the "native" code uses
> MD_CHANGE_DEVS during the reshape when it reaches special points where a
> metadata write is really needed to update the reshape checkpoint.
>
> In reshape_request():
>     /* Cannot proceed until we've updated the superblock */
>     ..
>     set_bit(MD_CHANGE_DEVS, mddev->flags)
>
> In md_check_recovery() we have:
>     if (mddev->flags)
>         md_update_sb()
>
> Couldn't we follow this logic with MD_CHANGE_DEVS for external metadata?
> If not, how to detect the need for migration checkpoint update?

Good question.

The first question to ask is: how does mdmon know when a metadata update is
required, and how does it tell md that the metadata update is complete?
OK, 2 first questions...

For the first I suspect it should watch 'md/reshape_position' (for which we
need to use sysfs_notify).
For the second .... I don't know.

 - Maybe sync_action could change to 'paused' and mdmon writes
   'continue'.... but that is possibly overloading that file too much.
 - We could have a new sysfs file which just shows paused/active ??
 - We could require that mdmon sets 'sync_max' appropriately so that
   reshape will stop at the right place, and then when mdmon has updated
   the metadata, it sets a new sync_max value.
 - As above, but if sync_max is set too high, it is automatically reduced
   to the place where raid5 finds that it has to stop.

I think the last one is probably best.
Before updating ->reshape_position, raid5 checks ->resync_max and if it is
too high for safety it sets it lower to a safer value. Then it changes
->reshape_position and calls sysfs_notify.
mdmon watches for 'reshape_position' to change. When it does, it updates
the metadata and then writes a larger value to ->resync_max.

Things can get a little confusing when reshaping to fewer devices as
reshape_position decreases, but sync_completed always increases and
sync_max is still an 'upper' limit. But it should work OK.

Does that seem reasonable?

NeilBrown
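
The handshake proposed above (the last option in the list) can be sketched as
a small userspace simulation. This is only an illustrative model, not kernel
or mdmon code: struct sim_array, sim_kernel_checkpoint() and
sim_mdmon_update() are hypothetical names, the fields merely mirror the
sysfs attributes discussed, and only the growing case (reshape_position
increasing) is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the state shared between md and mdmon. */
struct sim_array {
    uint64_t total;               /* sectors to reshape */
    uint64_t sync_completed;      /* progress so far, only ever increases */
    uint64_t reshape_position;    /* last checkpoint published by "raid5" */
    uint64_t resync_max;          /* upper limit ("sync_max" in sysfs) */
    uint64_t recorded_checkpoint; /* what "mdmon" has written to metadata */
};

/*
 * "raid5" side: reshape up to `chunk` more sectors, never past resync_max.
 * Returns true when reshape_position changed, which is the point where the
 * real code would call sysfs_notify. Returns false when the reshape is
 * paused at resync_max or has finished.
 */
bool sim_kernel_checkpoint(struct sim_array *a, uint64_t chunk)
{
    assert(chunk > 0);
    if (a->sync_completed >= a->resync_max || a->sync_completed >= a->total)
        return false;

    uint64_t room = a->resync_max - a->sync_completed;
    uint64_t left = a->total - a->sync_completed;
    uint64_t step = chunk;
    if (step > room)
        step = room;
    if (step > left)
        step = left;
    a->sync_completed += step;

    /* Before publishing the checkpoint, lower resync_max if it is too
     * high, so the reshape stops here until the metadata is updated. */
    if (a->resync_max > a->sync_completed)
        a->resync_max = a->sync_completed;

    a->reshape_position = a->sync_completed;
    return true;
}

/*
 * "mdmon" side: react to a reshape_position change by recording the
 * checkpoint, then write a larger limit so the reshape can continue.
 */
void sim_mdmon_update(struct sim_array *a, uint64_t new_max)
{
    a->recorded_checkpoint = a->reshape_position;
    if (new_max > a->resync_max)
        a->resync_max = new_max;
}
```

In this model the reshape cannot run ahead of the recorded checkpoint by more
than one step, because "raid5" lowers resync_max itself and only "mdmon"
raises it again. A real watcher would presumably wait on the sysfs attribute
with poll() for POLLPRI rather than being called directly.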