Thread (12 messages) 12 messages, 6 authors, 2023-06-29

Re: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective)

From: Linux regression tracking #adding (Thorsten Leemhuis) <hidden>
Date: 2023-06-15 12:05:53
Also in: lkml, regressions

[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few templates
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 14.06.23 17:12, Sachin Sant wrote:
Following crash is observed during a kexec operation on 
IBM Power10 server:

[ 34.381548] Kernel attempted to read user page (50) - exploit attempt? (uid: 0)
[ 34.381562] BUG: Kernel NULL pointer dereference on read at 0x00000050
[ 34.381565] Faulting instruction address: 0xc0000000009db1e4
[ 34.381569] Oops: Kernel access of bad area, sig: 11 [#1]
[ 34.381572] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[ 34.381576] Modules linked in: dm_mod(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bonding(E) tls(E) rfkill(E) ip_set(E) sunrpc(E) nf_tables(E) nfnetlink(E) pseries_rng(E) aes_gcm_p10_crypto(E) drm(E) drm_panel_orientation_quirks(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) t10_pi(E) crc64_rocksoft_generic(E) cdrom(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) vmx_crypto(E) fuse(E)
[ 34.381613] CPU: 18 PID: 5918 Comm: kexec Kdump: loaded Tainted: G E 6.4.0-rc6-00037-gb6dad5178cea #3
[ 34.381618] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
[ 34.381621] NIP: c0000000009db1e4 LR: c0000000009db928 CTR: c0000000009eab60
[ 34.381625] REGS: c00000009742f780 TRAP: 0300 Tainted: G E (6.4.0-rc6-00037-gb6dad5178cea)
[ 34.381628] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 44488884 XER: 00000001
[ 34.381638] CFAR: c0000000009db19c DAR: 0000000000000050 DSISR: 40000000 IRQMASK: 0 
[ 34.381638] GPR00: c0000000009db928 c00000009742fa20 c0000000014a1500 c0000000081d0000 
[ 34.381638] GPR04: c00000000d842c50 c00000000d842c50 0000000000000025 fffffffffffe0000 
[ 34.381638] GPR08: 0000000000000000 0000000000000000 0000000000000009 c008000000785280 
[ 34.381638] GPR12: c0000000009eab60 c00000135fab7f00 0000000000000000 0000000000000000 
[ 34.381638] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
[ 34.381638] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
[ 34.381638] GPR24: 0000000000000000 0000000000000000 0000000000000000 c000000002e21e08 
[ 34.381638] GPR28: c00000000d842c48 c000000002a02208 c00000000321c0c0 c0000000081d0000 
[ 34.381674] NIP [c0000000009db1e4] tpm_amd_is_rng_defective+0x74/0x240
[ 34.381681] LR [c0000000009db928] tpm_chip_unregister+0x138/0x160
[ 34.381685] Call Trace:
[ 34.381686] [c00000009742faa0] [c0000000009db928] tpm_chip_unregister+0x138/0x160
[ 34.381690] [c00000009742fae0] [c0000000009eab94] tpm_ibmvtpm_remove+0x34/0x130
[ 34.381695] [c00000009742fb50] [c000000000115738] vio_bus_remove+0x58/0xd0
[ 34.381701] [c00000009742fb90] [c000000000a01ecc] device_shutdown+0x21c/0x39c
[ 34.381705] [c00000009742fc20] [c0000000001a2684] kernel_restart_prepare+0x54/0x70
[ 34.381710] [c00000009742fc40] [c000000000292c48] kernel_kexec+0xa8/0x100
[ 34.381714] [c00000009742fcb0] [c0000000001a2cd4] __do_sys_reboot+0x214/0x2c0
[ 34.381718] [c00000009742fe10] [c000000000034adc] system_call_exception+0x13c/0x340
[ 34.381723] [c00000009742fe50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
[ 34.381729] --- interrupt: 3000 at 0x7fff9c5459f0
[ 34.381732] NIP: 00007fff9c5459f0 LR: 0000000000000000 CTR: 0000000000000000
[ 34.381735] REGS: c00000009742fe80 TRAP: 3000 Tainted: G E (6.4.0-rc6-00037-gb6dad5178cea)
[ 34.381738] MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 42422884 XER: 00000000
[ 34.381747] IRQMASK: 0 
[ 34.381747] GPR00: 0000000000000058 00007ffffad83d70 000000012fc47f00 fffffffffee1dead 
[ 34.381747] GPR04: 0000000028121969 0000000045584543 0000000000000000 0000000000000003 
[ 34.381747] GPR08: 0000000000100000 0000000000000000 0000000000000000 0000000000000000 
[ 34.381747] GPR12: 0000000000000000 00007fff9c7bb2c0 000000012fc3f598 0000000000000000 
[ 34.381747] GPR16: ffffffffffffffff 0000000000000000 000000012fc1fcc0 0000000000000000 
[ 34.381747] GPR20: 0000000000008913 0000000000008914 000000014b891020 0000000000000003 
[ 34.381747] GPR24: 0000000000000000 0000000000000001 0000000000000003 00007ffffad83ef0 
[ 34.381747] GPR28: 000000012fc19f10 00007fff9c6419c0 000000014b891080 000000014b891040 
[ 34.381781] NIP [00007fff9c5459f0] 0x7fff9c5459f0
[ 34.381784] LR [0000000000000000] 0x0
[ 34.381786] --- interrupt: 3000
[ 34.381788] Code: 5463063e 408201c8 38210080 4e800020 60000000 60000000 60000000 7c0802a6 fbe10078 7c7f1b78 f8010090 e9230728 <e9890050> 2c2c0000 41820020 7d8903a6 
[ 34.381800] ---[ end trace 0000000000000000 ]---
[ 34.384090] pstore: backend (nvram) writing error (-1)

Git bisect points to following patch

commit bd8621ca1510e6e802df9855bdc35a04a3cfa932
    tpm: Add !tpm_amd_is_rng_defective() to the hwrng_unregister() call site

Reverting the commit allows a successful kexec operation.
Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced bd8621ca1510e6e802df9855bdc35a04a3cfa932
#regzbot title tpm/ppc: crash during a kexec
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help