Thread (9 messages) 9 messages, 2 authors, 2017-05-31

Re: [mdadm PATCH] Create: move STOP_ARRAY to abort_locked

From: Zhilong Liu <hidden>
Date: 2017-05-31 10:25:54
Subsystem: the rest · Maintainer: Linus Torvalds


On 05/09/2017 01:54 AM, Jes Sorensen wrote:
On 05/07/2017 09:50 PM, Zhilong Liu wrote:
quoted

On 05/05/2017 11:31 AM, Liu Zhilong wrote:
quoted

On 05/04/2017 10:54 PM, Jes Sorensen wrote:
quoted
On 05/04/2017 08:20 AM, Zhilong Liu wrote:
quoted
Hi Jes,

apply for review, this is a bug I ever encountered.
Zhilong,

Under what circumstances do you see this?
Issued the command:
linux-g0sr:/home/test # ./mdadm -CR /dev/md0 -l1 -n2 -b internal
/dev/loop[0-1] --size 63
... ... ...
mdadm: Given bitmap chunk size not supported.
linux-g0sr:/home/test # ls /dev/md0
/dev/md0
linux-g0sr:/home/test # ls /sys/block/md0/md/
array_size   bitmap      component_size  level metadata_version
raid_disks         reshape_position safe_mode_delay
array_state  chunk_size  layout          max_read_errors
new_dev           reshape_direction  resync_start

create_mddev() writes the devnm to
/sys/module/md_mod/parameter/new_array,
then in md.c, module_param_call() called the 'set' function of
add_named_array(),
md_alloc() init_and_add the kobject for devm, finally the devnm device
has created
and sysfs has registered after create_mddev executed successfully.
Thus it's better
to STOP_ARRAY in any case after create_mddev() invoked.
this patch depends on the kernel commit:
039b7225e6e9 ("md: allow creation of mdNNN arrays via
md_mod/parameters/new_array")
Neil's patch has set "mddev->hold_active = UNTIL_STOP", thus the
STOP_ARRAY can work
well on this situation.
OK now I am confused - are you saying this change will only work after 
Neil's kernel patch has been applied? That would be no good, mdadm 
needs to work on older kernels too.
Sorry for the late reply for this.

Currently, the creating array method for later kernel(newer than v4.11), 
it can avoid the race problem
via to set 'create_on_open' as N and write the mddev -> 
/sys/module/md_mod/parameters/new_array,
then mdadm can send the ioctl to stop_array directly if creating array 
fail, it won't happen the race.
refer to commit: 78b6350dcaad ("md: support disabling of create-on-open 
semantics.")

And for older kernel, it's difficult to avoid the problem of the race 
when use 'change', 'add' and 'remove'
udev rules in a short period of time via to ioctl commands.
For example, this case tested on my latest Tumbleweed(20170524) with 
latest mdadm source.

Steps:
First part:
-  open one terminal to monitor the udev:
Terminal 1: # udevadm monitor
monitor will print the received events for:
UDEV - the event which udev sends out after rule processing
KERNEL - the kernel uevent

-  type the command in another terminal:
Terminal 2: # ./mdadm -CR /dev/md1 --bitmap=internal -l1 -n2 /dev/loop1 
/dev/loop2 --size 63

    add_internal_bitmap received the abnormal chunk_size( < 64k), and 
return a failure.
    but it left the partially created array and didn't cleanup. For this 
test, the udev monitor prints:

Terminal 1: # udevadm monitor
... ...
KERNEL[146.077168] add      /devices/virtual/bdi/9:1 (bdi)
KERNEL[146.077211] add      /devices/virtual/block/md1 (block)
UDEV  [146.078112] add      /devices/virtual/bdi/9:1 (bdi)
UDEV  [146.084163] add      /devices/virtual/block/md1 (block)

Terminal 2: # ./mdadm -S /dev/md1

Terminal 1: # udevadm monitor
KERNEL[153.276209] remove   /devices/virtual/bdi/9:1 (bdi)
KERNEL[153.276317] remove   /devices/virtual/block/md1 (block)
UDEV  [153.277034] remove   /devices/virtual/bdi/9:1 (bdi)
UDEV  [153.280801] remove   /devices/virtual/block/md1 (block)

Second part:
add the ioctl(stop_array) and compiled to monitor the udevs.
# git diff
diff --git a/Create.c b/Create.c
index 239545f..21568ca 100644
--- a/Create.c
+++ b/Create.c
@@ -1065,7 +1065,9 @@ int Create(struct supertype *st, char *mddev,
         map_remove(&map, fd2devnm(mdfd));
         map_unlock(&map);

-       if (mdfd >= 0)
+       if (mdfd >= 0) {
+               ioctl(mdfd, STOP_ARRAY, NULL);
                 close(mdfd);
+       }
         return 1;
  }
Terminal 2: # ./mdadm -CR /dev/md1 --bitmap=internal -l1 -n2 /dev/loop1 
/dev/loop2 --size 63

Terminal 1: # udevadm monitor
... ...
KERNEL[171.964597] add      /devices/virtual/bdi/9:1 (bdi)
KERNEL[171.965346] add      /devices/virtual/block/md1 (block)
UDEV  [171.965565] add      /devices/virtual/bdi/9:1 (bdi)
KERNEL[171.984195] remove   /devices/virtual/bdi/9:1 (bdi)
KERNEL[171.984555] remove   /devices/virtual/block/md1 (block)
UDEV  [171.984590] remove   /devices/virtual/bdi/9:1 (bdi)
KERNEL[172.004504] add      /devices/virtual/bdi/9:1 (bdi)
KERNEL[172.004890] add      /devices/virtual/block/md1 (block)
UDEV  [172.004923] add      /devices/virtual/bdi/9:1 (bdi)
UDEV  [172.005787] add      /devices/virtual/block/md1 (block)
UDEV  [172.009648] remove   /devices/virtual/block/md1 (block)
UDEV  [172.013232] add      /devices/virtual/block/md1 (block)

So shall give udev a moment to process 'add' events?

mdadm can work as expect when adding usleep(100 * 1000) before perform 
ioctl(STOP_ARRAY).
this is the udev monitor result tested by the following sample.

Terminal 1: # udevadm monitor
KERNEL[5476.780692] add      /devices/virtual/bdi/9:1 (bdi)
KERNEL[5476.780976] add      /devices/virtual/block/md1 (block)
UDEV  [5476.782355] add      /devices/virtual/bdi/9:1 (bdi)
UDEV  [5476.786056] add      /devices/virtual/block/md1 (block)
KERNEL[5476.896255] remove   /devices/virtual/bdi/9:1 (bdi)
KERNEL[5476.896367] remove   /devices/virtual/block/md1 (block)
UDEV  [5476.896895] remove   /devices/virtual/bdi/9:1 (bdi)
UDEV  [5476.900752] remove   /devices/virtual/block/md1 (block)


Such as like this:
diff --git a/Create.c b/Create.c
index 239545f..a07ace8 100644
--- a/Create.c
+++ b/Create.c
@@ -902,10 +902,8 @@ int Create(struct supertype *st, char *mddev,
                                         remove_partitions(fd);
                                 if (st->ss->add_to_super(st, &inf->disk,
                                                          fd, dv->devname,
- dv->data_offset)) {
-                                       ioctl(mdfd, STOP_ARRAY, NULL);
+ dv->data_offset))
                                         goto abort_locked;
-                               }
                                 st->ss->getinfo_super(st, inf, NULL);
                                 safe_mode_delay = inf->safe_mode_delay;
@@ -1006,7 +1004,6 @@ int Create(struct supertype *st, char *mddev,
                         sysfs_set_safemode(&info, safe_mode_delay);
                         if (err) {
                                 pr_err("failed to activate array.\n");
-                               ioctl(mdfd, STOP_ARRAY, NULL);
                                 goto abort;
                         }
                 } else if (c->readonly &&
@@ -1016,7 +1013,6 @@ int Create(struct supertype *st, char *mddev,
                                           "array_state", "readonly") < 0) {
                                 pr_err("Failed to start array: %s\n",
                                        strerror(errno));
-                               ioctl(mdfd, STOP_ARRAY, NULL);
                                 goto abort;
                         }
                 } else {
@@ -1028,7 +1024,6 @@ int Create(struct supertype *st, char *mddev,
                                 if (info.array.chunk_size & 
(info.array.chunk_size-1)) {
                                         cont_err("Problem may be that 
chunk size is not a power of 2\n");
                                 }
-                               ioctl(mdfd, STOP_ARRAY, NULL);
                                 goto abort;
                         }
                         /* if start_ro module parameter is set, array is
@@ -1065,7 +1060,11 @@ int Create(struct supertype *st, char *mddev,
         map_remove(&map, fd2devnm(mdfd));
         map_unlock(&map);

-       if (mdfd >= 0)
+       if (mdfd >= 0) {
+               /* Give udev a moment to finish 'add' events. */
+               usleep(100*1000);
+               ioctl(mdfd, STOP_ARRAY, NULL);
                 close(mdfd);
+       }
         return 1;
  }

Thanks,
-Zhilong
Jes

  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help