Re: [PATCH net-next v3 1/3] rust: core abstractions for network PHY drivers
From: FUJITA Tomonori <fujita.tomonori@gmail.com>
Date: 2023-10-13 09:53:51
Also in:
rust-for-linux
On Fri, 13 Oct 2023 07:56:07 +0000 Benno Lossin [off-list ref] wrote:
quoted
btw, what's the purpose of using Rust in linux kernel? Creating sound Rust abstractions? Making linux kernel more reliable, or something else? For me, making linux kernel more reliable is the whole point. Thus I still can't understand the slogan that Rust abstractions can't trust subsystems.For me it is making the Linux kernel more reliable. The Rust abstractions are just a tool for that goal: we offload the difficult task of handling the C <-> Rust interactions and other `unsafe` features into those abstractions. Then driver authors do not need to concern themselves with that and can freely write drivers in safe Rust. Since there will be a lot more drivers than abstractions, this will pay off in the end, since we will have a lot less `unsafe` code than safe code. Concentrating the difficult/`unsafe` code in the abstractions make it easier to review (compared to `unsafe` code in every driver) and easier to maintain, if we find a soundness issue, we only have to fix it in the abstractions.
Agreed.
quoted
Rust abstractions always must check the validity of values that subsysmtes give because subsysmtes might give an invalid value. Like the enum state issue, if PHYLIB has a bug then give a random value, so the abstraction have to prevent the invalid value in Rust with validity checking. But with such critical bug, likely the system cannot continue to run anyway. Preventing the invalid state in Rust aren't useful much for system reliability.It's not that we do not trust the subsystems, for example when we register a callback `foo` and the C side documents that it is ok to sleep within `foo`, then we will assume so. If we would not trust the C side, then we would have to disallow sleeping there, since sleeping while holding a spinlock is UB (and the C side could accidentally be holding a spinlock). But there are certain things where we do not trust the subsystems, these are mainly things where we can afford it from a performance and usability perspective (in the example above we could not afford it from a usability perspective).
You need maintenance cost too here. That's exactly the discussion point during reviewing the enum code, the kinda cut-and-paste from C code and match code that Andrew and Grek want to avoid.
In the enum case it would also be incredibly simple for the C side to just make a slight mistake and set the integer to a value outside of the specified range. This strengthens the case for checking validity here. When an invalid value is given to Rust we have immediate UB. In Rust UB always means that anything can happen so we must avoid it at all costs.
I'm not sure the general rules in Rust can be applied to linux kernel. If the C side (PHYLIB) to set in an invalid value to the state, probably the network doesn't work; already anything can happen in the system at this point. Then the Rust abstractions get the invalid value from the C side and detect an error with a check. The abstractions return an error to a Rust PHY driver. Next what can the Rust PHY driver do? Stop working? Calling dev_err() to print something and then selects the state randomly and continue? What's the practical benefit from the check?
In this case having a check would not really hurt performance and in terms of usability it also seems reasonable. If it would be bad for performance, let us know.
Bad for maintenance cost. Please read the discussion in the review on rfc v1.