On Wed, Feb 26, 2014 at 11:33:13AM +0100, Ales Novak wrote:
In many rtc modules, the chardevice file in rtc module probe is
being created prematurely. If the probe fails after the chardevice
file has been created (e.g. after rtc_device_register), it's possible
for a program to open() it, which subsequently can cause memory
corruption.
The race looks like that (thanks Jiri):
CPU0: CPU1:
sys_load_module()
do_init_module()
do_one_initcall()
cmos_do_probe()
rtc_device_register()
__register_chrdev()
cdev->owner = struct module*
open("/dev/rtc0")
rtc_device_unregister()
module_put()
free_module()
module_free(mod->module_core)
/* struct module *module is now
freed */
chrdev_open()
spin_lock(cdev_lock)
cdev_get()
try_module_get()
module_is_live()
/* dereferences already
freed struct module* */
[Context: For a patch to rtc-pcf2127.c Alexandre Belloni asked not to
fail after rtc_device_register successfully finished and pointed to this
reasoning as explaination.]
If there is really such a race then (I hope) there is
something in the cdev code that needs fixing. According to my
understanding, when rtc_device_unregister returned, the cdev is gone and
so chrdev_open is supposed to fail.
Otherwise the same problem can be triggered when the device is unbound
and the module unloaded while another cpu opens the char dev.
I added a msleep(5000) to chrdev_open like this:
if (current->pid != 1) {
pr_info("%s:%d: sleeping to open race window\n", __func__, __LINE__);
msleep(5000);
pr_info("%s:%d: done sleeping\n", __func__, __LINE__);
}
before the spinlock is taken. If I trigger that (using
a = open("/dev/rtc0")
in a Python shell) and then do rmmod rtc_pcf2127 (which is the driver
backing my rtc0), I get:
OSError: [Errno 6] No such device or address: '/dev/rtc0'
This patch is proposing a solution, splitting the function
{devm_,}rtc_device_register into {devm_,}rtc_device_register{_fs,}.
The {devm_}rtc_device_register_fs which is creating the files, should
be called after it is clear that the probe will pass. It will set the
RTC_DEV_FILES_EXIST into rtc_device->flags.
So this split is not a complete fix but only a work around for some
cases at best.
Maybe there isn't even a problem?
Best regards
Uwe