Thread: 13 messages, 5 authors, 2008-03-13

Re: PPC upstream kernel ignored DABR bug

From: Jan Kratochvil <hidden>
Date: 2007-11-28 12:45:36

On Wed, 28 Nov 2007 13:28:48 +0100, Arnd Bergmann wrote:
> On Wednesday 28 November 2007, Jan Kratochvil wrote:
> > Please be aware DABR works fine if the same code runs just 1 (always) or
> > 2 (sometimes) threads.  It starts failing with too many threads running:
> >
> > $ ./dabr-lost
> > TID 32725: DABR 0x1001279f NIP 0xfecf41c
> > TID 32726: DABR 0x1001279f NIP 0xfecf41c
> > TID 32725: hitting the variable
> > variable found = -1, caught TID = 32725
> > TID 32726: hitting the variable
> > variable found = -1, caught TID = 32726
> > The kernel bug did not get reproduced - increase THREADS.
> >
> > As I did not find any code in that kernel touching DABRX its value should not
> > be dependent on the number of threads running.
>
> Right, this is a different problem from the one reported by Uli.
> From what I can tell, your problem is that you set the DABR only
> in one thread, so the other threads don't see it. DABR is saved
> in the thread_struct, so setting it in one thread doesn't have
> an impact on any other thread.

It even prints out above:
	TID 32725: DABR 0x1001279f NIP 0xfecf41c
	TID 32726: DABR 0x1001279f NIP 0xfecf41c

that it wrote DABR in both threads and that it also successfully read it
back from each thread individually (by its thread-specific TID).

for (threadi = 0; threadi < THREADS; threadi++)
    {
      pid_t tid = thread[threadi];

      setup (tid);
...
    }

static void setup (pid_t tid)
{
...
  l = ptrace (PTRACE_SET_DEBUGREG, tid, NULL, (void *) dabr);
...
}
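
As an aside on the value passed to PTRACE_SET_DEBUGREG above: the low three
bits of a DABR value are flag bits and the rest selects the watched
doubleword.  The following is only a minimal sketch under that assumed layout
(bit 0 = read, bit 1 = write, bit 2 = translation enabled); the helper names
are hypothetical and are not taken from the testcase.

```c
#include <assert.h>

/* Assumed flag layout in the low three bits of a DABR value. */
#define DABR_DATA_READ   0x1UL   /* trap on loads */
#define DABR_DATA_WRITE  0x2UL   /* trap on stores */
#define DABR_TRANSLATION 0x4UL   /* match translated (virtual) addresses */
#define DABR_FLAGS_MASK  0x7UL

/* Hypothetical helper: combine a watched address with flag bits. */
static unsigned long dabr_encode(unsigned long addr, unsigned long flags)
{
    assert((flags & ~DABR_FLAGS_MASK) == 0);   /* only the three flag bits */
    return (addr & ~DABR_FLAGS_MASK) | flags;
}

/* Hypothetical helper: start of the watched 8-byte doubleword. */
static unsigned long dabr_address(unsigned long dabr)
{
    return dabr & ~DABR_FLAGS_MASK;
}

/* Hypothetical helper: the flag bits alone. */
static unsigned long dabr_flags(unsigned long dabr)
{
    return dabr & DABR_FLAGS_MASK;
}
```

Under this reading, 0x1001279f from the output above is the doubleword at
0x10012798 with all three flags set, and 0x100127a7 is the doubleword at
0x100127a0 with the same flags.
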

Also, if I were not setting DABR specifically for each thread, it would not
work in 90% of cases for `THREADS == 2', and it would not work for
`THREADS == 4' when the threads are busylooping (and therefore not in a syscall).
	TID 596: DABR 0x100127a7 NIP 0x10000dbc
	TID 597: DABR 0x100127a7 NIP 0x10000db0
	TID 598: DABR 0x100127a7 NIP 0x10000dac
	TID 599: DABR 0x100127a7 NIP 0x10000dbc
	TID 596: hitting the variable
	variable found = -1, caught TID = 596
	TID 599: hitting the variable
	variable found = -1, caught TID = 599
	TID 597: hitting the variable
	variable found = -1, caught TID = 597
	TID 598: hitting the variable
	variable found = -1, caught TID = 598
	The kernel bug got workarounded by WORKAROUND_SET_DABR_IN_SYSCALL.

(I have just found out that WORKAROUND_SET_DABR_IN_SYSCALL only reduces the
probability of the failure; it is not a 100% workaround for the problem in the
testcase.)


There is some tricky kernel code around it, but I did not try to debug it:

struct task_struct *__switch_to(struct task_struct *prev,
	struct task_struct *new)
{
...
	if (unlikely(__get_cpu_var(current_dabr) != new->thread.dabr)) {
		set_dabr(new->thread.dabr);
		__get_cpu_var(current_dabr) = new->thread.dabr;
	}
...
}
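
To make the caching logic above easier to reason about, here is a small
user-space model of it.  This is only a sketch and not a diagnosis of the bug:
struct thread_model, struct cpu_model and the model_* functions are
hypothetical stand-ins, and only the DABR handling is modeled.

```c
#include <assert.h>

/* Hypothetical stand-in for thread_struct; only the DABR field is modeled. */
struct thread_model {
    unsigned long dabr;          /* per-thread DABR value */
};

/* Hypothetical stand-in for one CPU. */
struct cpu_model {
    unsigned long current_dabr;  /* per-CPU cache of the last value written */
    unsigned long hw_dabr;       /* models the hardware register itself */
    unsigned int  hw_writes;     /* number of writes to the register */
};

/* Models set_dabr(): the only place that touches the "hardware". */
static void model_set_dabr(struct cpu_model *cpu, unsigned long dabr)
{
    cpu->hw_dabr = dabr;
    cpu->hw_writes++;
}

/* Models the DABR part of __switch_to(): write only when the cache differs. */
static void model_switch_to(struct cpu_model *cpu,
                            const struct thread_model *new_thread)
{
    if (cpu->current_dabr != new_thread->dabr) {
        model_set_dabr(cpu, new_thread->dabr);
        cpu->current_dabr = new_thread->dabr;
    }
    /* Holds only while nothing changes hw_dabr behind the cache's back. */
    assert(cpu->current_dabr == cpu->hw_dabr);
}
```

In this model, switching between two threads that carry the same DABR value
(as all the traced threads in the testcase do) never rewrites the register;
a write happens only when the incoming thread's value differs from the cached
one.  The scheme is therefore correct only as long as the per-CPU cache really
matches what the hardware holds.
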



Regards,
Jan