Thread (33 messages), 8 authors, 2023-06-02

Re: system hang on start-up (mlx5?)

From: Linux regression tracking (Thorsten Leemhuis) <hidden>
Date: 2023-06-02 13:55:49
Also in: linux-rdma, regressions

On 02.06.23 15:38, Chuck Lever III wrote:
quoted
On Jun 2, 2023, at 7:05 AM, Linux regression tracking #update (Thorsten Leemhuis) [off-list ref] wrote:

[TLDR: This mail is primarily relevant for Linux regression tracking. A
change or fix related to the regression discussed in this thread was
posted or applied, but it did not use a Link: tag to point to the
report, as Linus and the documentation call for.
Linus recently stated he did not like Link: tags pointing to an
email thread on lore.
Afaik he strongly dislikes them when a Link: tag just points to the
submission of the patch being applied; at the same time he *really
wants* those links if they tell the backstory how a fix came into being,
which definitely includes the report about the issue being fixed (side
note: without those links regression tracking becomes so hard that it's
basically not feasible).

If my knowledge is not up to date, please do me a favor if you have a
minute and point me to the statement from Linus you refer to.
Also, checkpatch.pl is now complaining about Closes: tags instead
of Link: tags. A bug was never opened for this issue.
That was a change by somebody else, but FWIW, just use Closes: (instead
of Link:) with a link to the report on lore; that tag is not reserved
for bugs.

/me will go and update his boilerplate text used above
I did check the regzbot docs on how to mark this issue closed,
but didn't find a ready answer. Thank you for following up.
yw, but no worries, that's what I'm here for. :-D

Ciao, Thorsten
quoted
Things happen, no
worries -- but now the regression tracking bot needs to be told manually
about the fix. See link in footer if these mails annoy you.]

On 08.05.23 14:29, Linux regression tracking #adding (Thorsten Leemhuis)
wrote:
quoted
On 03.05.23 03:03, Chuck Lever III wrote:
quoted
I have a Supermicro X10SRA-F/X10SRA-F with a ConnectX®-5 EN network
interface card, 100GbE single-port QSFP28, PCIe3.0 x16, tall bracket;
MCX515A-CCAT

When booting a v6.3+ kernel, the boot process stops cold after a
few seconds. The last message on the console is the MLX5 driver
note about "PCIe slot advertised sufficient power (27W)".

bisect reports that bbac70c74183 ("net/mlx5: Use newer affinity
descriptor") is the first bad commit.

I've trolled lore a couple of times and haven't found any discussion
of this issue.
#regzbot ^introduced bbac70c74183
#regzbot title system hang on start-up (irq or mlx5 problem?)
#regzbot ignore-activity
#regzbot fix: 368591995d010e6
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.
--
Chuck Lever