[PATCH v2] PR_*ET_THP_DISABLE.2const: document addition of PR_THP_DISABLE_EXCEPT_ADVISED
From: Usama Arif <hidden>
Date: 2025-11-05 13:48:16
PR_THP_DISABLE_EXCEPT_ADVISED extended PR_SET_THP_DISABLE to only provide
THPs when advised. IOW, it allows individual processes to opt-out of
THP = "always" into THP = "madvise", without affecting other workloads on
the system. The series has been merged in [1].

Before [1], the following 2 calls were allowed with PR_SET_THP_DISABLE:

prctl(PR_SET_THP_DISABLE, 0, 0, 0, 0); // to reset THP setting.
prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0); // to disable THPs completely.

Now in addition to the 2 calls above, you can do:

prctl(PR_SET_THP_DISABLE, 1, PR_THP_DISABLE_EXCEPT_ADVISED, 0, 0); // to disable THPs except madvise.

This patch documents the changes introduced due to the addition of
PR_THP_DISABLE_EXCEPT_ADVISED flag:
- PR_GET_THP_DISABLE returns a value whose bits indicate how THP-disable
  is configured for the calling thread (with or without
  PR_THP_DISABLE_EXCEPT_ADVISED).
- PR_SET_THP_DISABLE now uses arg3 to specify whether to disable THP
  completely for the process, or disable except madvise
  (PR_THP_DISABLE_EXCEPT_ADVISED).

[1] https://github.com/torvalds/linux/commit/9dc21bbd62edeae6f63e6f25e1edb7167452457b

Signed-off-by: Usama Arif <redacted>
---
v1 -> v2 (Alejandro Colomar):
- Fixed double negation on when MADV_HUGEPAGE will succeed
- Turn return values of PR_GET_THP_DISABLE into a table
- Turn madvise calls into full italics
- Use semantic newlines
---
 man/man2/madvise.2                      |  6 ++-
 man/man2const/PR_GET_THP_DISABLE.2const | 20 +++++++---
 man/man2const/PR_SET_THP_DISABLE.2const | 52 +++++++++++++++++++++----
 3 files changed, 64 insertions(+), 14 deletions(-)
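
For anyone who wants to exercise the three calls listed in the commit message, the following is a minimal C sketch and not part of the patch. The names decode_thp_disable(), thp_mode_name(), set_and_query() and enum thp_mode are hypothetical helpers invented for illustration. The fallback #defines assume the kernel UAPI values (41, 42, and bit 1 for PR_THP_DISABLE_EXCEPT_ADVISED, which matches the return value 3 documented for PR_GET_THP_DISABLE below); on kernels without [1] the third call is expected to fail, so the helper reports errors instead of assuming success.

```c
#include <stdio.h>
#include <sys/prctl.h>

/* Fallbacks for older headers; values are assumed to match the kernel UAPI. */
#ifndef PR_SET_THP_DISABLE
#define PR_SET_THP_DISABLE 41
#endif
#ifndef PR_GET_THP_DISABLE
#define PR_GET_THP_DISABLE 42
#endif
#ifndef PR_THP_DISABLE_EXCEPT_ADVISED
#define PR_THP_DISABLE_EXCEPT_ADVISED (1 << 1)
#endif

/* Hypothetical names for the three documented states, plus a catch-all. */
enum thp_mode {
	THP_NOT_DISABLED,
	THP_DISABLED,
	THP_EXCEPT_ADVISED,
	THP_UNKNOWN
};

/* Decode a PR_GET_THP_DISABLE return value per the table in this patch. */
enum thp_mode decode_thp_disable(int value)
{
	switch (value) {
	case 0:
		return THP_NOT_DISABLED;   /* bit 1 = 0, bit 0 = 0 */
	case 1:
		return THP_DISABLED;       /* bit 1 = 0, bit 0 = 1 */
	case 3:
		return THP_EXCEPT_ADVISED; /* bit 1 = 1, bit 0 = 1 */
	default:
		return THP_UNKNOWN;        /* includes -1 (error) */
	}
}

/* Human-readable label for a decoded mode. */
const char *thp_mode_name(enum thp_mode mode)
{
	switch (mode) {
	case THP_NOT_DISABLED:
		return "no THP-disable behaviour specified";
	case THP_DISABLED:
		return "THP entirely disabled";
	case THP_EXCEPT_ADVISED:
		return "THP disabled except when advised";
	default:
		return "unknown";
	}
}

/*
 * Apply a setting with PR_SET_THP_DISABLE, then read it back with
 * PR_GET_THP_DISABLE. Returns THP_UNKNOWN if either call fails.
 */
enum thp_mode set_and_query(unsigned long thp_disable, unsigned long flags)
{
	int value;

	if (prctl(PR_SET_THP_DISABLE, thp_disable, flags, 0L, 0L) == -1) {
		perror("prctl(PR_SET_THP_DISABLE)");
		return THP_UNKNOWN;
	}

	value = prctl(PR_GET_THP_DISABLE, 0L, 0L, 0L, 0L);
	if (value == -1) {
		perror("prctl(PR_GET_THP_DISABLE)");
		return THP_UNKNOWN;
	}

	return decode_thp_disable(value);
}
```

A caller could, for instance, print thp_mode_name(set_and_query(1, PR_THP_DISABLE_EXCEPT_ADVISED)) and then restore the default with set_and_query(0, 0). Because the setting is inherited across fork(2) and preserved across execve(2), a small wrapper program could apply it before exec'ing an unmodified binary.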
diff --git a/man/man2/madvise.2 b/man/man2/madvise.2
index 7a4310c40..55c6f4a6c 100644
--- a/man/man2/madvise.2
+++ b/man/man2/madvise.2
@@ -372,9 +372,11 @@ or
 .BR VM_PFNMAP ,
 nor can it be stack memory or backed by a DAX-enabled device
 (unless the DAX device is hot-plugged as System RAM).
-The process must also not have
+The process can have
 .B PR_SET_THP_DISABLE
-set (see
+set only if
+.B PR_THP_DISABLE_EXCEPT_ADVISED
+flag is set (see
 .BR prctl (2)).
 .IP
 The
diff --git a/man/man2const/PR_GET_THP_DISABLE.2const b/man/man2const/PR_GET_THP_DISABLE.2const
index 38ff3b370..d63cff21c 100644
--- a/man/man2const/PR_GET_THP_DISABLE.2const
+++ b/man/man2const/PR_GET_THP_DISABLE.2const
@@ -6,7 +6,7 @@
 .SH NAME
 PR_GET_THP_DISABLE
 \-
-get the state of the "THP disable" flag for the calling thread
+get the state of the "THP disable" flags for the calling thread
 .SH LIBRARY
 Standard C library
 .RI ( libc ,\~ \-lc )
@@ -18,13 +18,23 @@ Standard C library
 .B int prctl(PR_GET_THP_DISABLE, 0L, 0L, 0L, 0L);
 .fi
 .SH DESCRIPTION
-Return the current setting of
-the "THP disable" flag for the calling thread:
-either 1, if the flag is set, or 0, if it is not.
+Return a value whose bits indicate how THP-disable is configured
+for the calling thread.
+The returned value is interpreted as follows:
+.P
+.TS
+allbox;
+cb cb cb l
+c c c l.
+Bit 1	Bit 0	Value	Description
+0	0	0	No THP-disable behaviour specified.
+0	1	1	THP is entirely disabled for this process.
+1	1	3	THP-except-advised mode is set for this process.
+.TE
 .SH RETURN VALUE
 On success,
 .BR PR_GET_THP_DISABLE ,
-returns the boolean value described above.
+returns the value described above.
 On error, \-1 is returned, and
 .I errno
 is set to indicate the error.
diff --git a/man/man2const/PR_SET_THP_DISABLE.2const b/man/man2const/PR_SET_THP_DISABLE.2const
index 532beac66..75e17fa6a 100644
--- a/man/man2const/PR_SET_THP_DISABLE.2const
+++ b/man/man2const/PR_SET_THP_DISABLE.2const
@@ -6,7 +6,7 @@
 .SH NAME
 PR_SET_THP_DISABLE
 \-
-set the state of the "THP disable" flag for the calling thread
+set the state of the "THP disable" flags for the calling thread
 .SH LIBRARY
 Standard C library
 .RI ( libc ,\~ \-lc )
@@ -15,15 +15,20 @@ Standard C library
 .BR "#include <linux/prctl.h>" "  /* Definition of " PR_* " constants */"
 .B #include <sys/prctl.h>
 .P
-.BI "int prctl(PR_SET_THP_DISABLE, long " flag ", 0L, 0L, 0L);"
+.BI "int prctl(PR_SET_THP_DISABLE, long " thp_disable ", unsigned long " flags ", 0L, 0L);"
 .fi
 .SH DESCRIPTION
-Set the state of the "THP disable" flag for the calling thread.
+Set the state of the "THP disable" flags for the calling thread.
 If
-.I flag
-has a nonzero value, the flag is set, otherwise it is cleared.
+.I thp_disable
+has a nonzero value,
+the THP disable flag is set according to the value of
+.I flags,
+otherwise it is cleared.
 .P
-Setting this flag provides a method
+This
+.BR prctl (2)
+provides a method
 for disabling transparent huge pages
 for jobs where the code cannot be modified,
 and using a
@@ -31,10 +36,43 @@ and using a
 hook with
 .BR madvise (2)
 is not an option (i.e., statically allocated data).
-The setting of the "THP disable" flag is inherited by a child created via
+The setting of the "THP disable" flags is inherited by a child created via
 .BR fork (2)
 and is preserved across
 .BR execve (2).
+.P
+The behavior depends on the value of
+.IR flags:
+.TP
+.B 0
+The
+.BR prctl (2)
+call will disable THPs completely for the process,
+irrespective of global THP controls or
+.BR MADV_COLLAPSE .
+.TP
+.B PR_THP_DISABLE_EXCEPT_ADVISED
+The
+.BR prctl (2)
+call will disable THPs for the process
+except when the usage of THPs is
+advised.
+Consequently, THPs will only be used when:
+.RS
+.IP \[bu] 3
+Global THP controls are set to "always" or "madvise" and
+.I \%madvise(...,\~MADV_HUGEPAGE)
+or
+.I \%madvise(...,\~MADV_COLLAPSE)
+is used.
+.IP \[bu]
+Global THP controls are set to "never" and
+.I \%madvise(...,\~MADV_COLLAPSE)
+is used.
+This is the same behavior
+as if THPs would not be disabled on
+a process level.
+.RE
 .SH RETURN VALUE
 On success,
 0 is returned.
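
The rules added to the DESCRIPTION in the hunk above can be summarised as a small decision function. This is only a model of the documented text, written for illustration; it is not kernel code and it is not part of the patch. All names (thp_may_be_used() and the three enums) are hypothetical. The PROC_DEFAULT branch, which covers a process with no THP-disable setting, is not spelled out in the patch and reflects my understanding of the usual meaning of the global "always", "madvise" and "never" controls.

```c
#include <stdbool.h>

/* Global THP control ("always", "madvise" or "never"). */
enum global_thp { GLOBAL_ALWAYS, GLOBAL_MADVISE, GLOBAL_NEVER };

/* Per-process state as configured through PR_SET_THP_DISABLE. */
enum proc_thp {
	PROC_DEFAULT,        /* thp_disable == 0 */
	PROC_DISABLED,       /* thp_disable != 0, flags == 0 */
	PROC_EXCEPT_ADVISED  /* flags == PR_THP_DISABLE_EXCEPT_ADVISED */
};

/* Advice given for the memory range via madvise(2), if any. */
enum advice { ADVICE_NONE, ADVICE_HUGEPAGE, ADVICE_COLLAPSE };

/*
 * Model of when THPs may be used for a range, following the wording of
 * the man page text in this patch. "May be used" means eligible, not
 * guaranteed: alignment, mapping type and memory pressure still apply.
 */
bool thp_may_be_used(enum global_thp global, enum proc_thp proc,
		     enum advice advice)
{
	/* flags == 0: disabled irrespective of global controls or MADV_COLLAPSE. */
	if (proc == PROC_DISABLED)
		return false;

	/* MADV_COLLAPSE is honoured under every global setting. */
	if (advice == ADVICE_COLLAPSE)
		return true;

	/* MADV_HUGEPAGE needs the global control at "always" or "madvise". */
	if (advice == ADVICE_HUGEPAGE)
		return global != GLOBAL_NEVER;

	/* No advice: only an unrestricted process under "always" gets THPs. */
	return proc == PROC_DEFAULT && global == GLOBAL_ALWAYS;
}
```

Written this way, the effect of PR_THP_DISABLE_EXCEPT_ADVISED is visible in the last line only: it removes the unadvised "always" case and leaves both advised cases untouched, which is the opt-out from "always" into "madvise" that the commit message describes.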
--
2.47.3