Thread (52 messages) 52 messages, 6 authors, 2009-02-06

Re: [PATCH 03/13] dmaengine: up-level reference counting to the module level

From: Dan Williams <hidden>
Date: 2008-12-15 22:12:34
Also in: lkml

On Fri, Dec 12, 2008 at 7:28 AM, Sosnowski, Maciej
[off-list ref] wrote:
Dan,

General concern I see about this change is that it makes impossible
to rmmod ioatdma in case of NET_DMA enabled.
This is a specific case of situation when client is permanently registered in dmaengine,
making it impossible to rmmod any device with "public" channels.
Ioatdma is just an example here.
However in ioatdma case it would be problematic now to switch between CPU and DMA copy modes.
It seems that the only way to disable DMA after loading ioatdma would be raising tcp_dma_copybreak
to some high enough value to direct all buffers to CPU copy path.
This way is yet rather more like hacking than "normal" usage way (like "rmmod ioatdma" used so far).
Hi Maciej,

I have been waiting for you to point this out because I believe it
shows where more work is needed in net_dma.  The problem with net_dma
is that it never says "I am done with dma channels".  The result was
the old/complicated per-operation reference counting in dmaengine that
lead to the, now deleted, comment for ioat_remove():

/*
 * It is unsafe to remove this module: if removed while a requested
 * dma is outstanding, esp. from tcp, it is possible to hang while
 * waiting for something that will never finish.  However, if you're
 * feeling lucky, this usually works just fine.
 */

This also required the complexity of each client needing to handle
asynchronous channel removal events.  In all other cases in the kernel
a module is pinned as long as it has users, so dmaengine was backwards
in this regard.

Putting the end-node principal to work here means that the
complexity/effort of determining when net_dma is done with channels
should reside in net_dma, not dmaengine.  Since we cannot hook into a
"rmmod net" event to drop the dmaengine reference we will need some
out-of-band signal to enable "rmmod ioatdma" like detecting
network-idle, or having an explicit "dma disable" sysctl.
Another issue I find problematic in this solution is that _any_ client
declaring its presence in dmaengine causes holding reference for _all_ channels (and blocking them),
does not matter if they are used or not.
I agree with Guennadi's doubts here.
Why not at least hold a reference only for channels with capabilities matching client's ones?
All channels share the same parent module, so taking a reference on
one will always involve blocking all channels.  Private channels have
this granularity, and this is a primary difference between public and
private channels.  I eventually want dmaengine to hook into cpu
hotplug notifications to guarantee that resources are only allocated
for channels with a non-zero ->table_count, but this is not there
today.
quoted
@@ -105,19 +106,8 @@ static ssize_t show_bytes_transferred(struct
 device *dev, struct device_attribut static ssize_t
 show_in_use(struct device *dev, struct device_attribute *attr, char
      *buf) { struct dma_chan *chan = to_dma_chan(dev);
-     int in_use = 0;
-
-     if (unlikely(chan->slow_ref) &&
-             atomic_read(&chan->refcount.refcount) > 1)
-             in_use = 1;
-     else {
-             if (local_read(&(per_cpu_ptr(chan->local,
-                     get_cpu())->refcount)) > 0)
-                     in_use = 1;
-             put_cpu();
-     }

-     return sprintf(buf, "%d\n", in_use);
+     return sprintf(buf, "%d\n", chan->client_count);
 }
In this case show_in_use will not show if the channel is really in use
but rather how many clients are present, including these with different capabilities.
Thus this number does not even show number of "potential" users of the channel...
Again, maybe it would be better to limit chan->client_count
to number of at least potential users ( = matching capabilities)?
To be clear show_in_use was broken before because it only looked at a
per-cpu variable[1], the situation is better now because it indeed
shows "potential" users in the public case and actual users in the
private case.  I.e. if a public channel client has the potential to
get this channel via dma_find_channel() then that will be reflected in
->client_count.
quoted
      /* Find a channel */
@@ -178,23 +228,16 @@ static void dma_client_chan_alloc(struct
              dma_client *client)
              list_for_each_entry(chan, &device->channels, device_node) {
                      if (!dma_chan_satisfies_mask(chan, client->cap_mask))
                              continue;
+                     if (!chan->client_count)
+                             continue;
As cap_mask is per device not per channel, checking capabilites matching
is not necessary to be repeated for every channel.
It would be more efficient to do it once yet before
list_for_each_entry(chan, &device->channels, device_node).
This is changed in later patches [2].  We repeat for each channel in
the dma_request_channel() case under the assumption that the client
may be smart enough to disambiguate the channels on other criteria.
quoted
@@ -420,6 +443,19 @@ int dma_async_device_register(struct dma_device
      *device) }

      mutex_lock(&dma_list_mutex);
+     list_for_each_entry(chan, &device->channels, device_node) {
+             /* if clients are already waiting for channels we need to
+              * take references on their behalf
+              */
+             if (dmaengine_ref_count && dma_chan_get(chan) == -ENODEV) {
+                     /* note we can only get here for the first
+                      * channel as the remaining channels are
+                      * guaranteed to get a reference
+                      */
+                     rc = -ENODEV;
+                     goto err_out;
+             }
+     }
Going through this list_for_each_entry() loop makes sense only if there are any clients,
so maybe it would be more efficient to move "if (dmaengine_ref_count)" check before
list_for_each_entry(chan, &device->channels, device_node)?
Good point... will fix.

Thanks,
Dan

[1] http://marc.info/?l=linux-kernel&m=121448921721509&w=2
[2] http://marc.info/?l=linux-kernel&m=122835195303216&w=2
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help