Re: [RFC PATCH v3 4/6] media: tegra: Add Tegra210 Video input driver
From: Sowjanya Komatineni <skomatineni@nvidia.com>
Date: 2020-03-18 17:16:08
Also in:
linux-clk, linux-media, linux-tegra, lkml
On 3/18/20 9:25 AM, Sowjanya Komatineni wrote:
On 3/18/20 9:14 AM, Sowjanya Komatineni wrote:quoted
On 3/18/20 4:48 AM, Hans Verkuil wrote:quoted
External email: Use caution opening links or attachments On 2/24/20 5:45 AM, Sowjanya Komatineni wrote:quoted
On 2/20/20 11:11 AM, Sowjanya Komatineni wrote:quoted
On 2/20/20 5:33 AM, Hans Verkuil wrote:quoted
External email: Use caution opening links or attachments (Replying to myself so I can explain this a bit more) On 2/20/20 1:44 PM, Hans Verkuil wrote:quoted
quoted
+ +static int tegra_csi_tpg_channels_alloc(struct tegra_csi *csi) +{ + struct device_node *node = csi->dev->of_node; + unsigned int port_num; + int ret; + struct tegra_csi_channel *item; + unsigned int tpg_channels = csi->soc->csi_max_channels; + + /* allocate CSI channel for each CSI x2 ports */ + for (port_num = 0; port_num < tpg_channels; port_num++) { + item = devm_kzalloc(csi->dev, sizeof(*item), GFP_KERNEL);Using devm_*alloc can be dangerous. If someone unbinds the driver, then all memory allocated with devm_ is immediately freed. But if an application still has a filehandle open, then when it closes it it might still reference this already-freed memory. I recommend that you avoid using devm_*alloc for media drivers.A good test is to unbind & bind the driver: cd /sys/devices/platform/50000000.host1x/54080000.vi/driver echo -n 54080000.vi >unbind echo -n 54080000.vi >bind First just do this without the driver being used. That already gives me 'list_del corruption' kernel messages (list debugging is turned on in my kernel).Will fix in v4 to use kzalloc and also proper release v4l2 to make sure unbind/bind works properly. BTW, tegra vi and csi are registered as clients to host1x video driver. So, unbind and bind should be done with host1x video driver "tegra-video" cd /sys/devices/platform/50000000.host1x/tegra-video/driver echo -n tegra-video > unbind echo -n tegra-video > bindThis still crashes with v4, at least if I am streaming with v4l2-ctl --stream-mmap. Is that known? It's not a big deal at this moment, just want to know if this will be looked at later. Regards, HansWeird, I tested streaming after unbind and bind as well and don't see crash. Did below steps and tried several times unbind/bind as well. ./v4l2-ctl --stream-mmap --stream-count=1 -d /dev/video3 cd /sys/devices/platform/50000000.host1x/tegra-video/driver echo -n tegra-video > unbind sleep 1 echo -n tegra-video > bind cd /home/ubuntu ./v4l2-ctl --stream-mmap --stream-count=1 -d /dev/video3 Can you post call trace when you saw crash?Tried unbind when node is open with v4l2-ctl --sleep 10 as well and bind back. I don't see crash. Will confirm on doing unbind/bind with stream-mmap...
Able to repro when unbind/bind happens while stream-mmap. Will look and have fix in v5. Thanks Hans.
quoted
quoted
quoted
quoted
quoted
Note that this first test is basically identical to a rmmod/modprobe of the driver. But when I compiled the driver as a module it didn't create any video device nodes! Nor did I see any errors in the kernel log. I didn't pursue this, and perhaps I did something wrong, but it's worth taking a look at. The next step would be to have a video node open with: v4l2-ctl --sleep 10 then while it is sleeping unbind the driver and see what happens when v4l2-ctl exits. Worst case is when you are streaming: v4l2-ctl --stream-mmap and then unbind. In general, the best way to get this to work correctly is: 1) don't use devm_*alloc 2) set the release callback of struct v4l2_device and do all freeing there. 3) in the platform remove() callback you call media_device_unregister() and video_unregister_device().Reg 3, in current patch, media_device_unregister is called in host1x_video_remove video_unregister_device happens during host1x_video_remove -> host1x_device_exit -> tegra_vi_exit -> tegra_vi_channels_cleanupquoted
It's worth getting this right in this early stage, rather than fixing it in the future. Regards, Hans