Re: [PATCH v10] lib: checksum: Use aligned accesses for ip_fast_csum and csum_ipv6_magic tests
From: Christophe Leroy <hidden>
Date: 2024-02-27 18:35:07
On 27/02/2024 at 19:21, Charlie Jenkins wrote:
On Tue, Feb 27, 2024 at 06:11:24PM +0000, Christophe Leroy wrote:
On 27/02/2024 at 18:54, Charlie Jenkins wrote:
On Tue, Feb 27, 2024 at 11:32:19AM +0000, Christophe Leroy wrote:
On 27/02/2024 at 11:28, Russell King (Oracle) wrote:
On Tue, Feb 27, 2024 at 06:47:38AM +0000, Christophe Leroy wrote:
On 27/02/2024 at 00:48, Guenter Roeck wrote:
On 2/26/24 15:17, Charlie Jenkins wrote:
On Mon, Feb 26, 2024 at 10:33:56PM +0000, David Laight wrote:
...
I think you misunderstand. "NET_IP_ALIGN offset is what the kernel defines to be supported" is a gross misinterpretation. It is not "defined to be supported" at all. It is the _preferred_ alignment, nothing more, nothing less.

This distinction is arbitrary in practice, but I am open to being proven wrong if you have data to back up this statement. If the driver chooses to not follow this, then the driver might not work. ARM defines the NET_IP_ALIGN to be 2 to pad out the header to be on the supported alignment. If the driver chooses to pad with one byte instead of 2 bytes, the driver may fail to work as the CPU may stall after the misaligned access.
I'm sure I've seen code that would realign IP headers to a 4 byte boundary before processing them - but that might not have been in Linux. I'm also sure there are CPUs which will fault double length misaligned memory transfers - which might be used to marginally speed up code. Assuming more than 4 byte alignment for the IP header is likely 'wishful thinking'. There is plenty of ethernet hardware that can only write frames to even boundaries and plenty of CPUs that fault misaligned accesses. There are even cases of both on the same silicon die. You also pretty much never want a fault handler to fix up misaligned ethernet frames (or really anything else for that matter). It is always going to be better to check in the code itself. x86 has just made people 'sloppy' :-)

David

If somebody has a solution they deem to be better, I am happy to change this test case. Otherwise, I would appreciate a maintainer resolving this discussion and applying this fix.

Agreed. I do have a couple of patches which add explicit unaligned tests as well as corner case tests (which are intended to trigger as many carry overflows as possible). Once I get those working reliably, I'll be happy to submit them as additional tests.

The functions definitely have to work at least with and without VLAN, which means the alignment cannot be greater than 4 bytes. That's also the outcome of the discussion.

Thanks for completely ignoring what I've said. No. The alignment ends up being commonly 2 bytes. As I've said several times, network drivers do _not_ have to respect NET_IP_ALIGN. There are 32-bit ARM drivers which have a DMA engine in them which can only DMA to a 32-bit aligned address. This means that the start of the ethernet header is placed at a 32-bit aligned address, making the IP header misaligned to 32-bit. I don't see what is so difficult to understand about this...
but it seems that my comments on this are being ignored time and time again, and I can only think that those who are ignoring my comments have some ulterior motive here.

I'm sorry for this misunderstanding. I'm not ignoring what you said at all. I understood that ARM is able to handle unaligned accesses, with some exception handlers in the worst case, and that DMA constraints may lead to the IP header being on a 2-byte alignment only. However, I also understood from others that some architectures can't handle such a 2-byte-only alignment. It's been suggested during the discussion that alignment tests should be added later in a follow-up patch. So for the time being I'm trying to find a compromise and get the existing tests working on all platforms, but with a smaller alignment than the 16-byte alignment brought by Charlie's v10 patch. And a 4-byte alignment seemed to me to be a good compromise for this fix. The idea is also to make the fix as minimal as possible, unlike Charlie's patch that is churning up the tests quite heavily.

Do you have a list of platforms this is failing on? I haven't seen any reports that haven't been fixed.

I don't have such a list, but I guess you do? If all platforms have already been fixed, why are you sending this patch at all?

This patch is what is doing the "fixing". Over the course of 10 versions I have "fixed" the test cases to work on platforms that have various alignment and endianness constraints. The endianness changes were picked off of these patches and spun out into a different patch by you. I originally introduced these two new test cases since I wrote the riscv checksum function implementations and these tests were helpful for me and I figured they may be helpful for somebody else too.
I see. Then you misunderstood. I am not saying your patch leaves any platform unfixed. I am saying that your patch seems bigger than required; it is churn. In addition, your patch assumes an alignment of 16 bytes which, as explained by Russell, is just wrong. At least an alignment of 4 bytes must work on all platforms because of VLANs.

Christophe
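To make the alignment arithmetic in this thread concrete, here is a minimal userspace sketch. It is not kernel code and not taken from any of the patches. The helper names (ip_hdr_offset, guaranteed_alignment, check_alignment_claims) are hypothetical, and the sketch assumes a receive buffer that starts on at least a 16-byte boundary, a 14-byte Ethernet header and a single 4-byte 802.1Q tag. The pad value of 2 corresponds to the NET_IP_ALIGN value quoted for ARM above; a pad of 0 models a DMA engine that can only write the frame to a 32-bit aligned address.

```c
#include <assert.h>
#include <stddef.h>

/* Standard Ethernet framing sizes, in bytes. */
#define ETH_HDR_LEN   14  /* dst MAC + src MAC + EtherType */
#define VLAN_TAG_LEN   4  /* one 802.1Q tag */

/*
 * Offset of the IP header from the start of a buffer that is ASSUMED
 * to be at least 16-byte aligned. 'pad' is the number of bytes the
 * driver reserves before the Ethernet header (2 when it honours the
 * NET_IP_ALIGN value quoted for ARM, 0 when DMA forces a 32-bit
 * aligned frame start).
 */
static size_t ip_hdr_offset(size_t pad, int has_vlan)
{
	return pad + ETH_HDR_LEN + (has_vlan ? VLAN_TAG_LEN : 0);
}

/* Largest power of two, capped at 16, that divides 'offset'. */
static size_t guaranteed_alignment(size_t offset)
{
	size_t align = 16;

	while (align > 1 && (offset % align) != 0)
		align /= 2;
	return align;
}

/* Hypothetical self-check of the three cases discussed in the thread. */
static void check_alignment_claims(void)
{
	/* pad = 2, no VLAN: 2 + 14 = 16, so 16-byte aligned */
	assert(guaranteed_alignment(ip_hdr_offset(2, 0)) == 16);
	/* pad = 2, one VLAN tag: 2 + 18 = 20, only 4-byte aligned */
	assert(guaranteed_alignment(ip_hdr_offset(2, 1)) == 4);
	/* pad = 0 (32-bit aligned DMA): offset 14, only 2-byte aligned */
	assert(guaranteed_alignment(ip_hdr_offset(0, 0)) == 2);
}
```

Under these assumptions, the VLAN case is what rules out anything above 4 bytes even when the driver pads by 2, and the pad = 0 case is the 2-byte situation Russell describes for DMA-constrained drivers.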