Re: [PATCH 03/13] dmaengine: up-level reference counting to the module level
From: Dan Williams <hidden>
Date: 2008-12-15 22:12:34
Also in:
lkml
On Fri, Dec 12, 2008 at 7:28 AM, Sosnowski, Maciej [off-list ref] wrote:
Dan, General concern I see about this change is that it makes impossible to rmmod ioatdma in case of NET_DMA enabled. This is a specific case of situation when client is permanently registered in dmaengine, making it impossible to rmmod any device with "public" channels. Ioatdma is just an example here. However in ioatdma case it would be problematic now to switch between CPU and DMA copy modes. It seems that the only way to disable DMA after loading ioatdma would be raising tcp_dma_copybreak to some high enough value to direct all buffers to CPU copy path. This way is yet rather more like hacking than "normal" usage way (like "rmmod ioatdma" used so far).
Hi Maciej, I have been waiting for you to point this out because I believe it shows where more work is needed in net_dma. The problem with net_dma is that it never says "I am done with dma channels". The result was the old/complicated per-operation reference counting in dmaengine that lead to the, now deleted, comment for ioat_remove(): /* * It is unsafe to remove this module: if removed while a requested * dma is outstanding, esp. from tcp, it is possible to hang while * waiting for something that will never finish. However, if you're * feeling lucky, this usually works just fine. */ This also required the complexity of each client needing to handle asynchronous channel removal events. In all other cases in the kernel a module is pinned as long as it has users, so dmaengine was backwards in this regard. Putting the end-node principal to work here means that the complexity/effort of determining when net_dma is done with channels should reside in net_dma, not dmaengine. Since we cannot hook into a "rmmod net" event to drop the dmaengine reference we will need some out-of-band signal to enable "rmmod ioatdma" like detecting network-idle, or having an explicit "dma disable" sysctl.
Another issue I find problematic in this solution is that _any_ client declaring its presence in dmaengine causes holding reference for _all_ channels (and blocking them), does not matter if they are used or not. I agree with Guennadi's doubts here. Why not at least hold a reference only for channels with capabilities matching client's ones?
All channels share the same parent module, so taking a reference on one will always involve blocking all channels. Private channels have this granularity, and this is a primary difference between public and private channels. I eventually want dmaengine to hook into cpu hotplug notifications to guarantee that resources are only allocated for channels with a non-zero ->table_count, but this is not there today.
quoted
@@ -105,19 +106,8 @@ static ssize_t show_bytes_transferred(struct device *dev, struct device_attribut static ssize_t show_in_use(struct device *dev, struct device_attribute *attr, char *buf) { struct dma_chan *chan = to_dma_chan(dev); - int in_use = 0; - - if (unlikely(chan->slow_ref) && - atomic_read(&chan->refcount.refcount) > 1) - in_use = 1; - else { - if (local_read(&(per_cpu_ptr(chan->local, - get_cpu())->refcount)) > 0) - in_use = 1; - put_cpu(); - } - return sprintf(buf, "%d\n", in_use); + return sprintf(buf, "%d\n", chan->client_count); }In this case show_in_use will not show if the channel is really in use but rather how many clients are present, including these with different capabilities. Thus this number does not even show number of "potential" users of the channel... Again, maybe it would be better to limit chan->client_count to number of at least potential users ( = matching capabilities)?
To be clear show_in_use was broken before because it only looked at a per-cpu variable[1], the situation is better now because it indeed shows "potential" users in the public case and actual users in the private case. I.e. if a public channel client has the potential to get this channel via dma_find_channel() then that will be reflected in ->client_count.
quoted
/* Find a channel */@@ -178,23 +228,16 @@ static void dma_client_chan_alloc(struct dma_client *client) list_for_each_entry(chan, &device->channels, device_node) { if (!dma_chan_satisfies_mask(chan, client->cap_mask)) continue; + if (!chan->client_count) + continue;As cap_mask is per device not per channel, checking capabilites matching is not necessary to be repeated for every channel. It would be more efficient to do it once yet before list_for_each_entry(chan, &device->channels, device_node).
This is changed in later patches [2]. We repeat for each channel in the dma_request_channel() case under the assumption that the client may be smart enough to disambiguate the channels on other criteria.
quoted
@@ -420,6 +443,19 @@ int dma_async_device_register(struct dma_device *device) } mutex_lock(&dma_list_mutex); + list_for_each_entry(chan, &device->channels, device_node) { + /* if clients are already waiting for channels we need to + * take references on their behalf + */ + if (dmaengine_ref_count && dma_chan_get(chan) == -ENODEV) { + /* note we can only get here for the first + * channel as the remaining channels are + * guaranteed to get a reference + */ + rc = -ENODEV; + goto err_out; + } + }Going through this list_for_each_entry() loop makes sense only if there are any clients, so maybe it would be more efficient to move "if (dmaengine_ref_count)" check before list_for_each_entry(chan, &device->channels, device_node)?
Good point... will fix. Thanks, Dan [1] http://marc.info/?l=linux-kernel&m=121448921721509&w=2 [2] http://marc.info/?l=linux-kernel&m=122835195303216&w=2