Re: [PATCH] glibc: Remove CPU set size checking from affinity functions [BZ #19143]
From: Florian Weimer <hidden>
Date: 2015-10-23 14:00:34
Also in:
cgroups
On 10/19/2015 07:23 PM, Carlos O'Donell wrote:
quoted
The current situation, briefly stated, is this: glibc tries to guess the kernel CPU set size and rejects attempts to specify an affinity mask which is larger than that, but it does not work, and glibc and the kernel still silently accept CPU affinity masks with invalid bits, without returning an error. The glibc check does not provide any value to applications, it just adds pointless complexity to the library. Therefore, I want to remove it from glibc.The point of the code you want to remove was to detect the case where the user set CPU bits outside of the maximum possible number of supported CPUs. AFAIK today that value is CONFIG_NR_CPUS and the various variables that derive from that value.
CONFIG_NR_CPUS is the absolute maximum for a specific kernel (available at run time in /sys/devices/system/cpu/kernel_max). The kernel lowers the observable value if it detects the system cannot support more than a specific number of CPUs. This is the kernel-internal nr_cpu_ids variable. I don't think its value is directly exported, but it can currently be derived from /sys/devices/system/cpu/possible. (The difficulty of obtaining this value, and the tendency of the kernel to replace hard compile-time limits with run-time configuration options, makes me think it is unwise to expose these values through sysconf.)
Can you elaborate some more on any of the false negative cases you think might impact user applications? For example what happens if you use the stock cpu_set_t size? It would seem to me that such a use keeps working and there is no change.
Yes, applications which worked before continue to work. But you can now specify an all-ones mask you have allocated, and glibc will not reject it because it has set bits beyond the value it guessed for nr_cpu_ids.
quoted
Remove CPU set size checking from affinity functions [BZ #19143] With current kernel versions, the check does not reliably detect that unavailable CPUs are requested, for these reasons: (1) The kernel will silently ignore non-allowed CPUs.You mean to say that if sched_setaffinity is called with a CPU mask bit set to enabled, but that cpu is not allowed for the process, then it will ignore the setting?
Yes, that is what the kernel does.
This requires you run sched_getaffinity to verify what CPUs you're actually set to run on?
Yes, if you care about this detail.
Is this because the cpuset mechanism is merged with sched_setaffinity and overrides it?
As far as I can tell, yes. I do not know where the kernel gets the other mask from, but it seems cgroups-related (hence the Cc:), and it does and AND on those two masks.
quoted
(3) The existing probing code assumes that the CPU mask size is a power of two and at least 1024. Neither it has to be a power of two, nor is the minimum possible value 1024, so the value determined is often too large, resulting in incorrect false negatives.Could you explain those "false negative" cases again?
I tried to make this clearer in the revised commit message of the attached patch.
The goal is to keep nptl/ free from linux-isms, and AFAICT your test is indeed free of any linux-specific features since your code for finding the size of the cpu mask is generic. Is there anything I might have missed that would make your test linux-specific? Keep in mind that we share nptl/ with the nacl port.
Good point. sched_getcpu is not universally available, so I had to move the tests to sysdeps/unix/sysv/linux. I noticed that tst-getcpu was failing (bug 19164), so I added another sched_setaffinity test variant that supersedes it.
quoted
+* sched_setaffinity, pthread_setaffinity_np no longer attempt to guess the + kernel-internal CPU set size. This means that requests that change the + CPU affinity which failed before will now succeed. Applications that needPlease provide at least one example of a failure which now succeeds.
I've updated the NEWS entry.
quoted
+/* We wave two loops running for two seconds each. */ +#define TIMEOUT 8Why two seconds? Why two threads? If the value of 2 seconds is arbitrary please state so in a comment such that future reviewers can adjust it as they see fit without having to review the history of the value.
Should be clearer in the new version.
quoted
+++ b/posix/tst-affinity.c@@ -0,0 +1,254 @@Needs a one line test description with BZ#.
I did not include the bug number because it is a generic, non-regression test.
quoted
+static int +find_set_size (void) +{ + /* We need to use multiples of 64 because otherwise, CPU_ALLOC + over-allocates, and and we do not see all bits returned by the + kernel. */How does CPU_ALLOC over-allocating result in not seeing bits from the kernel? Is this because you get over-allocation in CPU_ALLOC, but your own external count of num_cpus would be lower and it's num_cpus you return? Can't you rely on the result of CPU_ALLOC_SIZE and return that?
See find_last_cpu. It is tricky to determine the actual size of a CPU set just based on the macros. I think I got it right. It turns out the comment was incorrect, I changed the code.
Why do we do this instead of calling sysconf to get the number of CPUs?
sysconf currently does not give us the proper number, and I really don't think applications should rely on it. This is a separate conversation, IMHO. I added a comment.
quoted
+ +static bool +test_size (const struct conf *conf, size_t size) +{Should print PASS:/FAIL: prefix to make grepping easier.
I added info/warning/error prefixes instead.
Should do what other tests do e.g. err |= and run each test on a distinct line. Makes it easier to add tests and disable tests while debugging.
Ah, right. I have added tests which cover additional scenarios (cross-process sched_* calls and cross-thread pthread_sched* calls). Florian
Attachments
- 0001-Remove-CPU-set-size-checking-from-affinity-functions.patch [text/x-patch] 44653 bytes · preview