Thread: 24 messages, 5 authors, 2020-11-20

Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

From: Syed Nayyar Waris <hidden>
Date: 2020-11-09 14:49:04
Also in: linux-gpio, lkml

On Mon, Nov 9, 2020 at 8:09 PM William Breathitt Gray
[off-list ref] wrote:
On Mon, Nov 09, 2020 at 08:41:28AM -0500, William Breathitt Gray wrote:
quoted
On Mon, Nov 09, 2020 at 06:04:11PM +0530, Syed Nayyar Waris wrote:
quoted
On Sun, Nov 01, 2020 at 09:08:29PM +0100, Arnd Bergmann wrote:
quoted
On Sun, Nov 1, 2020 at 4:00 PM William Breathitt Gray
[off-list ref] wrote:
quoted
On Thu, Oct 29, 2020 at 11:44:47PM +0100, Arnd Bergmann wrote:
quoted
On Sun, Oct 18, 2020 at 11:44 PM Syed Nayyar Waris [off-list ref] wrote:
quoted
This patch reimplements the xgpio_set_multiple() function in
drivers/gpio/gpio-xilinx.c to use the new generic functions:
bitmap_get_value() and bitmap_set_value(). The code is now simpler
to read and understand. Moreover, instead of looping over each bit
in xgpio_set_multiple(), we can now handle one channel at a time
and save cycles.
This now causes -Wtype-limits warnings in linux-next with gcc-10:
Hi Arnd,

What version of gcc-10 are you running? I'm having trouble generating
these warnings so I suspect I'm using a different version than you.
I originally saw it with the binaries from
https://mirrors.edge.kernel.org/pub/tools/crosstool/, but I have
also been able to reproduce it with a minimal test case on the
binaries from godbolt.org, see https://godbolt.org/z/Wq8q4n
quoted
Let me first verify that I understand the problem correctly. The issue
is the possibility of a stack smash in bitmap_set_value() when the value
of start + nbits is larger than the length of the map bitmap memory
region. This is because index (or index + 1) could be outside the range
of the bitmap memory region passed in as map. Is my understanding
correct here?
Yes, that seems to be the case here.
quoted
In xgpio_set_multiple(), the variables width[0] and width[1] serve as
possible start and nbits values for the bitmap_set_value() calls.
Because width[0] and width[1] are unsigned int variables, GCC considers
the possibility that the value of width[0]/width[1] might exceed the
length of the bitmap memory region named old and thus result in a stack
smash.

I don't know if invalid width values are actually possible for the
Xilinx gpio device, but let's err on the side of safety and assume this
is actually a possibility. We should verify that the combined value of
gpio_width[0] + gpio_width[1] does not exceed 64 bits; we can add a
check for this in xgpio_probe() when we grab the gpio_width values.
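
As a rough illustration of the probe-time check described above, here is a
minimal user-space sketch. The helper name xgpio_widths_valid() and the
XGPIO_MAX_TOTAL_WIDTH constant are hypothetical and are not taken from the
driver; only the 64-bit limit comes from the message above.

```c
#include <stdbool.h>

/* Hypothetical limit: both channels together must fit in 64 bits. */
#define XGPIO_MAX_TOTAL_WIDTH 64u

/* Returns true when the two channel widths fit in a 64-bit bitmap. */
static bool xgpio_widths_valid(unsigned int width0, unsigned int width1)
{
	/* Check each width on its own first so the sum cannot wrap around. */
	if (width0 > XGPIO_MAX_TOTAL_WIDTH || width1 > XGPIO_MAX_TOTAL_WIDTH)
		return false;

	return width0 + width1 <= XGPIO_MAX_TOTAL_WIDTH;
}
```

A probe routine could reject the device (for example with -EINVAL) when such a
check fails. As noted in the following paragraph of the message, this would
not by itself tell the compiler anything about the bounds.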

However, we're still left with the GCC warnings because GCC is not smart
enough to know that we've already checked the boundary and width[0] and
width[1] are valid values. I suspect we can avoid this warning if we
refactor bitmap_set_value() to increment map separately and then set it:
As I understand it, part of the problem is that gcc sees the possible
range as being constrained by the operations on 'start' and 'nbits',
in particular the shift in BIT_WORD() that puts an upper bound on
the index, but then it sees that the upper bound is higher than the
upper bound of the array, i.e. element zero.

I added a check

      if (start >= 64 || start + size >= 64) return;

in the godbolt.org testcase, which does help limit the start
index appropriately, but it is not sufficient to let the compiler
see that the 'if (space >= nbits)' condition is guaranteed to
be true for all values here.
quoted
static inline void bitmap_set_value(unsigned long *map,
                                    unsigned long value,
                                    unsigned long start, unsigned long nbits)
{
        const unsigned long offset = start % BITS_PER_LONG;
        const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
        const unsigned long space = ceiling - start;

        map += BIT_WORD(start);
        value &= GENMASK(nbits - 1, 0);

        if (space >= nbits) {
                *map &= ~(GENMASK(nbits - 1, 0) << offset);
                *map |= value << offset;
        } else {
                *map &= ~BITMAP_FIRST_WORD_MASK(start);
                *map |= value << offset;
                map++;
                *map &= ~BITMAP_LAST_WORD_MASK(start + nbits);
                *map |= value >> space;
        }
}

This avoids adding a costly conditional check inside bitmap_set_value()
when almost all bitmap_set_value() calls will have static arguments with
well-defined and obvious boundaries.

Do you think this would be an acceptable solution to resolve your GCC
warnings?
Unfortunately, it does not seem to make a difference, as gcc still
knows that this compiles to the same result, and it produces the same
warning as before (see https://godbolt.org/z/rjx34r)

         Arnd
Hi Arnd,

I am sharing a different version of the bitmap_set_value() function. See below.

Let me know if the below solution looks good to you and if it resolves
the above compiler warning.

@@ -1,5 +1,5 @@
 static inline void bitmap_set_value(unsigned long *map,
-                                    unsigned long value,
+                                    unsigned long value, const size_t length,
                                     unsigned long start, unsigned long nbits)
 {
         const size_t index = BIT_WORD(start);
@@ -7,6 +7,9 @@ static inline void bitmap_set_value(unsigned long *map,
         const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
         const unsigned long space = ceiling - start;

+       if (index >= length)
+               return;
+
         value &= GENMASK(nbits - 1, 0);

         if (space >= nbits) {
@@ -15,6 +18,10 @@ static inline void bitmap_set_value(unsigned long *map,
         } else {
                 map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
                 map[index + 0] |= value << offset;
+
+               if (index + 1 >= length)
+                       return;
+
                 map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
                 map[index + 1] |= value >> space;
         }
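
For readers who want to try the bounds-checked variant outside the kernel,
the following is a user-space sketch of the function as it would look with
the diff above applied. The macros are simplified stand-ins written for this
example, not the kernel's own definitions, and the function assumes
1 <= nbits <= BITS_PER_LONG.

```c
#include <limits.h>
#include <stddef.h>

/* User-space stand-ins for the kernel macros (assumptions for this sketch). */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BIT_WORD(nr) ((nr) / BITS_PER_LONG)
#define GENMASK(h, l) ((~0UL << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))
#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
#define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))
#define round_up(x, y) ((((x) - 1) | ((y) - 1)) + 1)

/* length is the number of unsigned long words available in map. */
static void bitmap_set_value(unsigned long *map,
			     unsigned long value, const size_t length,
			     unsigned long start, unsigned long nbits)
{
	const size_t index = BIT_WORD(start);
	const unsigned long offset = start % BITS_PER_LONG;
	const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
	const unsigned long space = ceiling - start;

	/* First word is already outside the bitmap: nothing to write. */
	if (index >= length)
		return;

	value &= GENMASK(nbits - 1, 0);

	if (space >= nbits) {
		/* The value fits entirely in one word. */
		map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
		map[index] |= value << offset;
	} else {
		/* Low part goes into the first word. */
		map[index] &= ~BITMAP_FIRST_WORD_MASK(start);
		map[index] |= value << offset;

		/* Second word would be outside the bitmap: drop the high part. */
		if (index + 1 >= length)
			return;

		map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
		map[index + 1] |= value >> space;
	}
}
```

Writing an 8-bit value that starts 4 bits before a word boundary exercises
the second branch: the low 4 bits land at the top of one word and the high
4 bits at the bottom of the next.
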
One of my concerns is that we're incurring the latency of two additional
conditional checks just to suppress a compiler warning about a case that
wouldn't occur in the actual use of bitmap_set_value(). I'm hoping
there's a way for us to suppress these warnings without adding onto the
latency of this function; given that bitmap_set_value() is intended to
be used in loops, conditionals here could significantly increase latency
in drivers.

I wonder if array_index_nospec() might have the side effect of
suppressing these warnings for us. For example, would this work:

static inline void bitmap_set_value(unsigned long *map,
                                  unsigned long value,
                                  unsigned long start, unsigned long nbits)
{
      const unsigned long offset = start % BITS_PER_LONG;
      const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
      const unsigned long space = ceiling - start;
      size_t index = BIT_WORD(start);

      value &= GENMASK(nbits - 1, 0);

      if (space >= nbits) {
              index = array_index_nospec(index, index + 1);

              map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
              map[index] |= value << offset;
      } else {
              index = array_index_nospec(index, index + 2);

              map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
              map[index + 0] |= value << offset;
              map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
              map[index + 1] |= value >> space;
      }
}

Or is this going to produce the same warning because we're not using an
explicit check against the map array size?

William Breathitt Gray
After testing my suggestion, it looks like the warnings are still
present. :-(

Something else I've also considered is perhaps using the GCC built-in
function __builtin_unreachable() instead of returning. So in Syed's code
we would have the following instead:

if (index + 1 >= length)
        __builtin_unreachable();

This might allow GCC to optimize better and avoid the conditional check
altogether, thus avoiding latency while also hinting enough context to
the compiler to suppress the warnings.

William Breathitt Gray
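
To make the __builtin_unreachable() idea concrete, here is a small
stand-alone sketch for GCC or Clang. The helper read_word() is hypothetical
and exists only for this illustration. Whether the same hint silences the
warning discussed in this thread is not established here.

```c
#include <stddef.h>

/*
 * Assumed caller contract: index < length.
 * If the contract is violated, behaviour is undefined, because the
 * compiler is told that the out-of-range branch can never be taken.
 */
static unsigned long read_word(const unsigned long *map, size_t length,
			       size_t index)
{
	if (index >= length)
		__builtin_unreachable();

	return map[index];
}
```

Unlike an early return, this form gives no runtime protection: the compiler
is free to drop the comparison, so an out-of-range index still reads past
the array.
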
I also thought of another optimization. Arnd, William, let me know
what you think about it.

Since exceeding the array limit is a rather rare event, we can use the
kernel's unlikely() macro (built on the GCC extension
__builtin_expect()) for the boundary checks. We can use it at the two
places where 'index' and 'index + 1' are checked against the boundary
limit.

It might help optimize the code. Wouldn't it?

Syed Nayyar Waris
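
A minimal sketch of the unlikely() suggestion, for GCC or Clang in user
space. The macro definition here mirrors the usual __builtin_expect() idiom,
and the helper set_word_checked() is hypothetical, written only for this
example.

```c
#include <stddef.h>

/* Stand-in for the kernel macro: tells the compiler the condition is rare. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Stores value at map[index]; returns -1 without writing if out of range. */
static int set_word_checked(unsigned long *map, size_t length,
			    size_t index, unsigned long value)
{
	if (unlikely(index >= length))
		return -1;

	map[index] = value;
	return 0;
}
```

The hint only influences how the compiler lays out the branch. The
comparison is still evaluated on every call, so the cost concern raised
earlier in the thread is reduced rather than removed.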

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel