Thread: 130 messages, 8 authors, 2016-05-18

[RFC6 PATCH v6 00/21] ILP32 for ARM64 - LTP results

From: Andrew Pinski <hidden>
Date: 2016-04-27 07:30:17
Also in: linux-arch, linux-s390, lkml

On Fri, Apr 22, 2016 at 8:37 PM, Zhangjian (Bamvor) wrote:
Hi, Yury


On 2016/4/6 6:44, Yury Norov wrote:
There are about 20 failing tests out of 782 in the lite scenario.
float_bessel
float_exp_log
float_iperb
float_power
float_trigo
pipeio_1
pipeio_3
pipeio_5
pipeio_8
abort01
clone02
kill11
mmap16
open12
pause01
rename11
rmdir02
umount2_01
umount2_02
umount2_03
utime06
mtest06

The list is approximate because some tests do not fail on every run.

Tests abort01 and kill11 fail for lp64 too, so maybe there's
a reason unrelated to ilp32 itself.

float_xxx tests fail because they call unwind() from signal context,
and GCC for ilp32 has a problem with it, as Andrew said.
Is there any progress on this issue? When we talk about unwind
functions, do you mean the functions in libgcc?

We encountered another issue (an abort, not a segfault) in code that
also calls pthread_cancel(). The test code is in the attachment. Here
is the backtrace:
Yes, this is a known issue.  I have a GCC patch to fix it.
Basically, REG_VALUE_IN_UNWIND_CONTEXT needs to be defined while
building libgcc to support the correct unwind information.
I will be posting a GCC patch to fix this tomorrow.  This was a bug
even in the original set of ilp32 patches; I was only able to sit
down and fix it today.


Thanks,
Andrew
Program received signal SIGABRT, Aborted.
[Switching to Thread 0xf77ee330 (LWP 2958)]
0x000000000040f5bc in raise (sig=sig@entry=6)
    at ../sysdeps/unix/sysv/linux/raise.c:55
55      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x000000000040f5bc in raise (sig=sig@entry=6)
    at ../sysdeps/unix/sysv/linux/raise.c:55
#1  0x000000000040f884 in abort () at abort.c:89

#2  0x00000000004073b4 in uw_update_context_1 (
    context=context@entry=0xf77ec820, fs=fs@entry=0xf77ebec8)
at /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind-dw2.c:1430

#3  0x00000000004078c0 in uw_update_context
(context=context@entry=0xf77ec820,
    fs=fs@entry=0xf77ebec8)
   at
/home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind-dw2.c:1506
#4  0x0000000000407a9c in uw_advance_context (fs=0xf77ebec8,
    context=0xf77ec820)
    at
/home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind-dw2.c:1529
#5  _Unwind_ForcedUnwind_Phase2 (exc=exc@entry=0xf77ee580,
    context=context@entry=0xf77ec820)
    at /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind.inc:185
#6  0x0000000000408228 in _Unwind_ForcedUnwind (exc=0xf77ee580,
    stop=stop@entry=0x405440 <unwind_stop>, stop_argument=0xf77eddd8)
    at /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind.inc:207
#7  0x00000000004055c4 in __pthread_unwind (buf=<optimized out>)
    at unwind.c:126
#8  0x00000000004050b4 in __do_cancel () at ./pthreadP.h:283
#9  sigcancel_handler (sig=<optimized out>, si=<optimized out>,
    ctx=<optimized out>) at nptl-init.c:225
---Type <return> to continue, or q <return> to quit---
#10 <signal handler called>

#11 0x0000000000000000 in ?? ()

#12 0x0000000000423084 in __select (nfds=-66661, readfds=<optimized out>,
    writefds=<optimized out>, exceptfds=<optimized out>, timeout=0x0)
    at ../sysdeps/unix/sysv/linux/generic/select.c:45
#13 0x0000000000400604 in TEST_TaskDelay (
    uiMillSecs=<error reading variable: can't compute CFA for this frame>)
    at test-cancel.c:18
#14 0x0000000000400680 in printids (
    s=<error reading variable: can't compute CFA for this frame>)
    at test-cancel.c:38
#15 0x00000000004006d0 in thr_fn (
    arg=<error reading variable: can't compute CFA for this frame>)
    at test-cancel.c:49
#16 0x0000000000401b28 in start_thread (arg=0x4a3000) at
pthread_create.c:335
#17 0x0000000000401b28 in start_thread (arg=0x4a3000) at
pthread_create.c:335
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
This abort is raised by the following code:
static void
uw_update_context_1 (struct _Unwind_Context *context, _Unwind_FrameState
*fs)
{
//...
  /* Compute this frame's CFA.  */
  switch (fs->regs.cfa_how)
    {
    case CFA_REG_OFFSET:
      cfa = _Unwind_GetPtr (&orig_context, fs->regs.cfa_reg);
      cfa += fs->regs.cfa_offset;
      break;

    case CFA_EXP:
      {
        const unsigned char *exp = fs->regs.cfa_exp;
        _uleb128_t len;

        exp = read_uleb128 (exp, &len);
        cfa = (void *) (_Unwind_Ptr)
          execute_stack_op (exp, exp + len, &orig_context, 0);
        break;
      }

    default:
      gcc_unreachable ();
    }
  context->cfa = cfa;
//...
}
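
The backtrace above reaches the unwinder through pthread_cancel() on a thread blocked in select(). The attached test-cancel.c is not reproduced in this message, so the following is only a minimal sketch of that kind of test, assuming a POSIX threads implementation; the names `blocked_in_select` and `cancel_blocked_thread` are hypothetical and do not come from the attachment.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/select.h>

/* Worker: sleeps in select(), which is a cancellation point. */
static void *blocked_in_select(void *arg)
{
    (void)arg;
    for (;;) {
        struct timeval tv;
        tv.tv_sec = 1;
        tv.tv_usec = 0;
        /* No descriptors: select() is used purely as a delay. */
        select(0, NULL, NULL, NULL, &tv);
    }
    return NULL; /* not reached; the thread exits only by cancellation */
}

/*
 * Creates the worker, cancels it, and joins it.
 * Returns 0 if the thread ended with PTHREAD_CANCELED, -1 otherwise.
 * With a libgcc affected by the unwinder problem discussed here, the
 * process would be expected to abort inside the cancellation instead.
 */
int cancel_blocked_thread(void)
{
    pthread_t tid;
    void *res = NULL;

    if (pthread_create(&tid, NULL, blocked_in_select, NULL) != 0)
        return -1;
    if (pthread_cancel(tid) != 0)
        return -1;
    if (pthread_join(tid, &res) != 0)
        return -1;
    return res == PTHREAD_CANCELED ? 0 : -1;
}
```

Calling cancel_blocked_thread() from main() and checking for a 0 return is enough to exercise the forced-unwind path on a working toolchain.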

Any suggestion is appreciated.

CCing the gcc mailing list. Sorry if this is off topic.

Regards

Bamvor




> pipeio_x tests are very unstable and may fail randomly. I strongly
> suspect race conditions, as they all work like a charm if pinned to a
> single CPU with taskset. A race is probably the reason for the clone02
> failure too, though I'm not sure whether the race is in the kernel,
> glibc or the test itself.
>
> But I know for sure that pause01 fails due to test design:
>         if (setitimer(ITIMER_REAL, &it, NULL)) // For 1000us
>                 tst_brkm(TBROK | TERRNO, NULL, "setitimer() failed");
>
>         TEST(pause());
>
> As the setitimer() and pause() calls are not atomic, the alarm may
> arrive before pause() is called and be silently dropped by the handler.
> The next pause() call then hangs the test forever. I already reported
> this to the LTP list.
>
> open12, rename11, rmdir02, mmap16, mtest06 - all call the mkfs tool,
> and it returns an error code. I haven't investigated it much yet.
>
> umount2_0x, utime06 - cannot reproduce outside the scenario; even when
> run in an infinite loop, they work fine.
>
> Full test log is attached.
>
> Yury
>
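
The pause01 race described in the quoted text (the alarm firing between setitimer() and pause()) is commonly avoided by blocking SIGALRM before arming the timer and then waiting with sigsuspend(), which unblocks the signal and sleeps in one atomic step. What follows is a minimal sketch under that assumption, not the fix adopted by LTP; `on_alarm` and `wait_for_alarm_usec` are hypothetical names.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t got_alarm;

static void on_alarm(int sig)
{
    (void)sig;
    got_alarm = 1;
}

/*
 * Arms a one-shot ITIMER_REAL for usec microseconds and waits for it.
 * Returns 0 once SIGALRM has been handled, -1 on error.
 * usec must be positive: a zero value would disarm the timer.
 */
int wait_for_alarm_usec(long usec)
{
    struct sigaction sa;
    struct itimerval it;
    sigset_t block, old, wait_mask;

    if (usec <= 0)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    /* Block SIGALRM first, so an early expiry stays pending. */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
        return -1;

    got_alarm = 0;
    memset(&it, 0, sizeof(it));
    it.it_value.tv_sec = usec / 1000000;
    it.it_value.tv_usec = usec % 1000000;
    if (setitimer(ITIMER_REAL, &it, NULL) != 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    /* sigsuspend() swaps in wait_mask and sleeps atomically. */
    wait_mask = old;
    sigdelset(&wait_mask, SIGALRM);
    while (!got_alarm)
        sigsuspend(&wait_mask);

    /* Restore the caller's signal mask. */
    sigprocmask(SIG_SETMASK, &old, NULL);
    return 0;
}
```

Because the signal can only be delivered inside sigsuspend(), the handler cannot run in the window between arming the timer and starting to wait, which is the window the pause()-based test leaves open.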