Thread (34 messages) 34 messages, 4 authors, 2025-05-29

Re: [PATCH v10 2/5] rust: support formatting of foreign types

From: "Benno Lossin" <lossin@kernel.org>
Date: 2025-05-26 23:01:27
Also in: dri-devel, linux-block, linux-devicetree, linux-kselftest, linux-pci, lkml, llvm, nouveau, rust-for-linux

On Tue May 27, 2025 at 12:17 AM CEST, Tamir Duberstein wrote:
On Mon, May 26, 2025 at 10:48 AM Benno Lossin [off-list ref] wrote:
quoted
On Sat May 24, 2025 at 10:33 PM CEST, Tamir Duberstein wrote:
quoted
Introduce a `fmt!` macro which wraps all arguments in
`kernel::fmt::Adapter` This enables formatting of foreign types (like
`core::ffi::CStr`) that do not implement `fmt::Display` due to concerns
around lossy conversions which do not apply in the kernel.

Replace all direct calls to `format_args!` with `fmt!`.

In preparation for replacing our `CStr` with `core::ffi::CStr`, move its
`fmt::Display` implementation to `kernel::fmt::Adapter<&CStr>`.

Suggested-by: Alice Ryhl <aliceryhl@google.com>
Link: https://rust-for-linux.zulipchat.com/#narrow/channel/288089-General/topic/Custom.20formatting/with/516476467
Signed-off-by: Tamir Duberstein <redacted>
---
 drivers/block/rnull.rs      |   2 +-
 rust/kernel/block/mq.rs     |   2 +-
 rust/kernel/device.rs       |   2 +-
 rust/kernel/fmt.rs          |  77 +++++++++++++++++++++++++++++
 rust/kernel/kunit.rs        |   6 +--
 rust/kernel/lib.rs          |   1 +
 rust/kernel/prelude.rs      |   3 +-
 rust/kernel/print.rs        |   4 +-
 rust/kernel/seq_file.rs     |   2 +-
 rust/kernel/str.rs          |  23 ++++-----
 rust/macros/fmt.rs          | 118 ++++++++++++++++++++++++++++++++++++++++++++
 rust/macros/lib.rs          |  19 +++++++
 scripts/rustdoc_test_gen.rs |   2 +-
 13 files changed, 235 insertions(+), 26 deletions(-)
Can you split this into creating the proc-macro, forwarding the display
impls and replacing all the uses with the proc macro?
Can you help me understand why that's better?
It makes reviewing significantly easier.
quoted
quoted
+macro_rules! impl_display_forward {
+    ($(
+        $( { $($generics:tt)* } )? $ty:ty $( { where $($where:tt)* } )?
You don't need `{}` around the `where` clause, as a `where` keyword can
follow a `ty` fragment.
This doesn't work:
error: local ambiguity when calling macro `impl_display_forward`:
multiple parsing options: built-in NTs tt ('r#where') or 2 other
options.
  --> rust/kernel/fmt.rs:75:78
   |
75 |     {<T: ?Sized>} crate::sync::Arc<T> where crate::sync::Arc<T>:
fmt::Display,
   |
           ^
Ah right that's a shame, forgot about the `tt`s at the end...
quoted
quoted
+impl_display_forward!(
+    bool,
+    char,
+    core::panic::PanicInfo<'_>,
+    crate::str::BStr,
+    fmt::Arguments<'_>,
+    i128,
+    i16,
+    i32,
+    i64,
+    i8,
+    isize,
+    str,
+    u128,
+    u16,
+    u32,
+    u64,
+    u8,
+    usize,
+    {<T: ?Sized>} crate::sync::Arc<T> {where crate::sync::Arc<T>: fmt::Display},
+    {<T: ?Sized>} crate::sync::UniqueArc<T> {where crate::sync::UniqueArc<T>: fmt::Display},
+);
If we use `{}` instead of `()`, then we can format the contents
differently:

    impl_display_forward! {
        i8, i16, i32, i64, i128, isize,
        u8, u16, u32, u64, u128, usize,
        bool, char, str,
        crate::str::BStr,
        fmt::Arguments<'_>,
        core::panic::PanicInfo<'_>,
        {<T: ?Sized>} crate::sync::Arc<T> {where Self: fmt::Display},
        {<T: ?Sized>} crate::sync::UniqueArc<T> {where Self: fmt::Display},
    }
Is that formatting better? rustfmt refuses to touch it either way.
Yeah rustfmt doesn't touch macro parameters enclosed in `{}`. I think
it's better.
quoted
quoted
+/// Please see [`crate::fmt`] for documentation.
+pub(crate) fn fmt(input: TokenStream) -> TokenStream {
+    let mut input = input.into_iter();
+
+    let first_opt = input.next();
+    let first_owned_str;
+    let mut names = BTreeSet::new();
+    let first_lit = {
+        let Some((mut first_str, first_lit)) = (match first_opt.as_ref() {
+            Some(TokenTree::Literal(first_lit)) => {
+                first_owned_str = first_lit.to_string();
+                Some(first_owned_str.as_str()).and_then(|first| {
+                    let first = first.strip_prefix('"')?;
+                    let first = first.strip_suffix('"')?;
+                    Some((first, first_lit))
+                })
+            }
+            _ => None,
+        }) else {
+            return first_opt.into_iter().chain(input).collect();
+        };
This usage of let-else + match is pretty confusing and could just be a
single match statement.
I don't think so. Can you try rewriting it into the form you like?
    let (mut first_str, first_lit) match first_opt.as_ref() {
        Some(TokenTree::Literal(lit)) if lit.to_string().starts_with('"') => {
            let contents = lit.to_string();
            let contents = contents.strip_prefix('"').unwrap().strip_suffix('"').unwrap();
            ((contents, lit))
        }
        _ => return first_opt.into_iter().chain(input).collect(),
    };
quoted
quoted
+        while let Some((_, rest)) = first_str.split_once('{') {
+            first_str = rest;
+            if let Some(rest) = first_str.strip_prefix('{') {
+                first_str = rest;
+                continue;
+            }
+            while let Some((name, rest)) = first_str.split_once('}') {
+                first_str = rest;
+                if let Some(rest) = first_str.strip_prefix('}') {
This doesn't make sense, we've matched a `{`, some text and a `}`. You
can't escape a `}` that is associated to a `{`.
Sure, but such input would be malformed, so I don't think it's
necessary to handle it "perfectly". We'll get a nice error from
format_args anyhow.
My suggestion in this case would be to just remove this if-let. The
search for `{` above would skip the `}` if it's correct.
https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=5f529d93da7cf46b3c99ba7772623e33
Yes it will error like that, but if we do the replacement only when the
syntax is correct, there also will be compile errors because of a
missing `Display` impl, or is that not the case?

I'm a bit concerned about the ergonomics that this change will
introduce, but I guess there really isn't anything that we can do about
except not do it.
quoted
quoted
+                    first_str = rest;
+                    continue;
+                }
+                let name = name.split_once(':').map_or(name, |(name, _)| name);
+                if !name.is_empty() && !name.chars().all(|c| c.is_ascii_digit()) {
+                    names.insert(name);
+                }
+                break;
+            }
+        }
+        first_lit
`first_lit` is not modified, so could we just the code above it into a
block instead of keeping it in the expr for `first_lit`?
As above, can you suggest the alternate form you like better? The
gymnastics here are all in service of being able to let malformed
input fall through to core::format_args which will do the hard work of
producing good diagnostics.
I don't see how this is hard, just do:

    let (first_str, first_lit) = ...;

    while ...
quoted
quoted
+    };
+
+    let first_span = first_lit.span();
+    let adapt = |expr| {
+        let mut borrow =
+            TokenStream::from_iter([TokenTree::Punct(Punct::new('&', Spacing::Alone))]);
+        borrow.extend(expr);
+        make_ident(first_span, ["kernel", "fmt", "Adapter"])
+            .chain([TokenTree::Group(Group::new(Delimiter::Parenthesis, borrow))])
This should be fine with using `quote!`:

    quote!(::kernel::fmt::Adapter(&#expr))
Yeah, I have a local commit that uses quote_spanned to remove all the
manual constructions.
I don't think that you need `quote_spanned` here at all. If you do, then
let me know, something weird with spans is going on then.
quoted
quoted
+    };
+
+    let flush = |args: &mut TokenStream, current: &mut TokenStream| {
+        let current = std::mem::take(current);
+        if !current.is_empty() {
+            args.extend(adapt(current));
+        }
+    };
+
+    let mut args = TokenStream::from_iter(first_opt);
+    {
+        let mut current = TokenStream::new();
+        for tt in input {
+            match &tt {
+                TokenTree::Punct(p) => match p.as_char() {
+                    ',' => {
+                        flush(&mut args, &mut current);
+                        &mut args
+                    }
+                    '=' => {
+                        names.remove(current.to_string().as_str());
+                        args.extend(std::mem::take(&mut current));
+                        &mut args
+                    }
+                    _ => &mut current,
+                },
+                _ => &mut current,
+            }
+            .extend([tt]);
+        }
This doesn't handle the following code correctly ):

    let mut a = 0;
    pr_info!("{a:?}", a = a = a);

Looks like we'll have to remember what "kind" of an equals we parsed...
Hmm, good point. Maybe we can just avoid dealing with `=` at all until
we hit the `,` and just split on the leftmost `=`. WDYT? I'll have
that in v11.
Sounds good, if there is no `=`, then ignore it.
quoted
quoted
+/// Like [`core::format_args!`], but automatically wraps arguments in [`kernel::fmt::Adapter`].
+///
+/// This macro allows generating `core::fmt::Arguments` while ensuring that each argument is wrapped
+/// with `::kernel::fmt::Adapter`, which customizes formatting behavior for kernel logging.
+///
+/// Named arguments used in the format string (e.g. `{foo}`) are detected and resolved from local
+/// bindings. All positional and named arguments are automatically wrapped.
+///
+/// This macro is an implementation detail of other kernel logging macros like [`pr_info!`] and
+/// should not typically be used directly.
+///
+/// [`kernel::fmt::Adapter`]: ../kernel/fmt/struct.Adapter.html
+/// [`pr_info!`]: ../kernel/macro.pr_info.html
+#[proc_macro]
+pub fn fmt(input: TokenStream) -> TokenStream {
I'm wondering if we should name this `format_args` instead in order to
better communicate that it's a replacement for `core::format_args!`.
Unfortunately that introduces ambiguity in cases where
kernel::prelude::* is imported because core::format_args is in core's
prelude.
Ahh that's unfortunate.

---
Cheers,
Benno
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help