Re[2]: [PATCH][v2] fork_init: fix division by zero
From: Yuri Tikhonov <hidden>
Date: 2008-12-11 22:22:48
Also in:
lkml
=0D=0AHello Andrew, On Thursday, December 11, 2008 you wrote: [snip]
The expression you've chosen here can be quite inacccurate, because ((PAGE_SIZE / (8 * THREAD_SIZE)) is a small number.=20
But why is it bad? We do multiplication to 'mempages', not division.=20 All the numbers in the multiplier are the power of 2, so both=20 expressions: mempages * (PAGE_SIZE / (8 * THREAD_SIZE)) and max_threads =3D (mempages * PAGE_SIZE) / (8 * THREAD_SIZE) are finally equal.=20
The way to preserve accuracy is
max_threads =3D (mempages * PAGE_SIZE) / (8 * THREAD_SIZE);
so how about avoiding the nasty ifdefs and doing
I'm OK with the approach below, but, leading resulting to the same,=20 this involves some overhead to the code where there was no this=20 overhead before this patch: e.g. your implementation is finally boils=20 down to ~5 times more processor instructions than there were before, plus operations with stack for the 'm' variable. On the other hand, my approach with nasty (I agree) ifdefs doesn't=20 lead to overheads to the code which does not need this: i.e. the most=20 common situation of small PAGE_SIZEs. Big PAGE_SIZE is the exception,=20 so I believe that the more common cases should not suffer because of=20 this.
quoted hunk ↗ jump to hunk
--- a/kernel/fork.c~fork_init-fix-division-by-zero +++ a/kernel/fork.c@@ -69,6 +69,7 @@ #include <asm/mmu_context.h> #include <asm/cacheflush.h> #include <asm/tlbflush.h> +#include <asm/div64.h>=20 /* * Protected counters by write_lock_irq(&tasklist_lock)@@ -185,10 +186,15 @@ void __init fork_init(unsigned long memp=20 /* * The default maximum number of threads is set to a safe - * value: the thread structures can take up at most half - * of memory. + * value: the thread structures can take up at most + * (1/8) part of memory. */ - max_threads =3D mempages / (8 * THREAD_SIZE / PAGE_SIZE); + { + /* max_threads =3D (mempages * PAGE_SIZE) / THREAD_SIZE /=
8; */
+ u64 m =3D mempages * PAGE_SIZE;
+ do_div(m, THREAD_SIZE * 8);
+ max_threads =3D m;
+ }
=20
/*
* we need to allow at least 20 threads to boot a system
_?
The code is also inaccurate because it assumes that <whatever allocator
is used for threads>> will pack the thread_structs into pages with best
possible density, which isn't necessarily the case. Let's not worry about that.
OT:
max_threads is widly wrong anyway.
- the caller passes in num_physpages, which includes highmem. And we can't allocate thread structs from highmem.
- num_physpages includes kernel pages and other stuff which can never be allocated via the page allocator.
A suitable fix would be to switch the caller to the strangely-named nr_free_buffer_pages().
If you grep the tree for `num_physpages', you will find a splendid number of similar bugs. num_physpages should be unexported, burnt, deleted, etc. It's just an invitation to write buggy code.
Regards, Yuri -- Yuri Tikhonov, Senior Software Engineer Emcraft Systems, www.emcraft.com