Thread (30 messages), 7 authors, 2020-05-20

Re: [PATCH v7 1/4] x86: kdump: move reserve_crashkernel_low() into crash_core.c

From: John Donnelly <hidden>
Date: 2020-02-24 15:25:56
Also in: linux-arm-kernel, lkml

On Jan 16, 2020, at 9:47 AM, John Donnelly [off-list ref] wrote:


quoted
On Jan 16, 2020, at 9:17 AM, James Morse [off-list ref] wrote:

Hi guys,

On 28/12/2019 09:32, Dave Young wrote:
quoted
On 12/27/19 at 07:04pm, Chen Zhou wrote:
quoted
On 2019/12/27 13:54, Dave Young wrote:
quoted
On 12/23/19 at 11:23pm, Chen Zhou wrote:
quoted
In preparation for supporting reserve_crashkernel_low in arm64 as
x86_64 does, move reserve_crashkernel_low() into kernel/crash_core.c.

Note, in arm64, we reserve low memory if and only if crashkernel=X,low
is specified. Unlike x86_64, we don't reserve low memory automatically.
Do you have any reason for the difference?  I'd expect us to have the same
logic if possible and remove some of the ifdefs.
In x86_64, if we reserve crashkernel above 4G, then we call reserve_crashkernel_low()
to reserve low memory.

In arm64, to simplify, we call reserve_crashkernel_low() at the beginning of reserve_crashkernel()
and then relax the arm64_dma32_phys_limit if reserve_crashkernel_low() allocated something.
In this case, if we reserve the crashkernel below 4G, 256M of low memory would be reserved
automatically, and this needs extra consideration.
quoted
Sorry that I did not read the old thread details and thought that it was
arch dependent.  But thinking about it again, it would be better to
have the same semantics for the crashkernel parameters across arches.  If we
make them different then it causes confusion, especially for
distributions.
Surely distros also want one crashkernel* string they can use on all platforms without
having to detect the kernel version, platform or changeable memory layout...

quoted
OTOH, I thought if we reserve high memory then the low memory should be
needed.  There might be some exceptions, but I do not know the exact
one,
quoted
can we make the behavior the same, and special-case those systems which
do not need a low memory reservation?
It's tricky to work out which systems are the 'normal' ones.

We don't have a fixed memory layout for arm64. Some systems have no memory below 4G.
Others have no memory above 4G.

Chen Zhou's machine has some memory below 4G, but it's too precious to reserve a large
chunk for kdump. Without any memory below 4G some of the drivers won't work.

I don't see what distros can set as their default for all platforms if high/low are
mutually exclusive with the 'crashkernel=' in use today. How did x86 navigate this, ... or
was it so long ago?

No one else has reported a problem with the existing placement logic, hence treating this
'low' thing as the 'in addition' special case.

Hi,

I am seeing similar Arm crash dump issues on 5.4 kernels, where we need a rather large amount of crashkernel memory reserved that is not available below 4GB (the maximum reserved size appears to be around 768M). When I pick a memory range higher than 4GB, I see adapters that fail to initialize:


There is no low memory (<4G) available for DMA:

[   11.506792] kworker/0:14: page allocation failure: order:0, mode:0x104(GFP_DMA32|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
[   11.518793] CPU: 0 PID: 150 Comm: kworker/0:14 Not tainted 5.4.0-1948.3.el8uek.aarch64 #1
[   11.526955] Hardware name: To be filled by O.E.M. Saber/Saber, BIOS 0ACKL025 01/18/2019
[   11.534948] Workqueue: events work_for_cpu_fn 
[   11.539291] Call trace: 
[   11.541727]  dump_backtrace+0x0/0x18c 
[   11.545376]  show_stack+0x24/0x30 
[   11.548679]  dump_stack+0xbc/0xe0 
[   11.551982]  warn_alloc+0xf0/0x15c 
[   11.555370]  __alloc_pages_slowpath+0xb4c/0xb84 
[   11.559887]  __alloc_pages_nodemask+0x2d0/0x330 
[   11.564405]  alloc_pages_current+0x8c/0xf8 
[   11.568496]  ttm_bo_device_init+0x188/0x220 [ttm] 
[   11.573187]  drm_vram_mm_init+0x58/0x80 [drm_vram_helper] 
[   11.578572]  drm_vram_helper_alloc_mm+0x64/0xb0 [drm_vram_helper] 
[   11.584655]  ast_mm_init+0x38/0x80 [ast] 
[   11.588566]  ast_driver_load+0x474/0xa70 [ast] 
[   11.593029]  drm_dev_register+0x144/0x1c8 [drm] 
[   11.597573]  drm_get_pci_dev+0xa4/0x168 [drm] 
[   11.601919]  ast_pci_probe+0x8c/0x9c [ast] 
[   11.606004]  local_pci_probe+0x44/0x98 
[   11.609739]  work_for_cpu_fn+0x20/0x30 
[   11.613474]  process_one_work+0x1c4/0x41c 
[   11.617470]  worker_thread+0x150/0x4b0 
[   11.621206]  kthread+0x110/0x114 
[   11.624422]  ret_from_fork+0x10/0x18 

This failure is related to a graphics adapter. 

The more complex kdump configurations, which use the networking stack to NFS-mount a filesystem to dump to or use ssh to copy to another machine, require larger crashkernel reservations than perhaps the “default*” settings of a minimal kdump that writes a minimal vmcore to local storage in /var/crash. If the crashkernel reservation is too small, I get out-of-memory errors and the entire vmcore process fails.

(* I assume the default kdump settings write a minimal vmcore to /var/crash on the primary boot device, where /root is located.)

Hi Chen,


I was able to unit test this series of kernel patches, applied to a 5.4.17 test kernel, along with the kexec CLI change:

0001-arm64-kdump-add-another-DT-property-to-crash-dump-ke.patch

Applied to:

kexec-tools-2.0.19-12.0.4.el8.src.rpm

And obtained a vmcore using this cmdline:

BOOT_IMAGE=(hd6,gpt2)/vmlinuz-5.4.17-4-uek6m_ol8-jpdonnel+ root=/dev/mapper/ol01-root ro crashkernel=2048M@35G crashkernel=250M,low rd.lvm.lv=ol01/root rd.lvm.lv=ol01/swap console=ttyS4 loglevel=7
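
For readers unfamiliar with the two crashkernel= forms on that command line, here is a minimal sketch of how they break down into a main reservation at a fixed offset plus a separate low reservation. It is illustrative Python, not the kernel's parser: the names `parse_crashkernel`, `to_bytes`, `PARAM` and `UNITS` are hypothetical, and it assumes only the simple `size[@offset]`, `size,high` and `size,low` forms with uppercase K/M/G suffixes.

```python
import re

# Assumption: only uppercase binary suffixes are handled in this sketch.
UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
SIZE = r"(\d+)([KMG]?)"
# Matches crashkernel=SIZE, crashkernel=SIZE@OFFSET, crashkernel=SIZE,high|low
PARAM = re.compile(rf"^crashkernel={SIZE}(?:@{SIZE}|,(high|low))?$")


def to_bytes(number, unit):
    """Convert a decimal number plus optional K/M/G suffix to bytes."""
    return int(number) * UNITS[unit]


def parse_crashkernel(cmdline):
    """Return one dict per crashkernel= token found in a kernel command line."""
    result = []
    for token in cmdline.split():
        m = PARAM.match(token)
        if not m:
            continue  # not a crashkernel= token, or a form this sketch skips
        size = to_bytes(m.group(1), m.group(2))
        offset = to_bytes(m.group(3), m.group(4)) if m.group(3) else None
        result.append({"size": size, "offset": offset, "kind": m.group(5) or "main"})
    return result


if __name__ == "__main__":
    for entry in parse_crashkernel("ro crashkernel=2048M@35G crashkernel=250M,low"):
        print(entry)
```

With the command line above, the first token yields a 2048M reservation at offset 35G and the second a 250M reservation marked low.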

Can you add:

Tested-by: John Donnelly <redacted>


How can we get these changes included in an rc kernel release?

Thanks,

John.


quoted
quoted
quoted
Previous discussions:
	https://lkml.org/lkml/2019/6/5/670
	https://lkml.org/lkml/2019/6/13/229
Another concern from James:
"
With both crashk_low_res and crashk_res, we end up with two entries in /proc/iomem called
"Crash kernel". Because it's sorted by address, and kexec-tools stops searching when it
finds "Crash kernel", you are always going to get the kernel placed in the lower portion.
"

The kexec-tools code iterates over all "Crash kernel" ranges and adds them
to an array.  In the x86 code, it uses the higher range to locate memory.
Then my hurried reading of what the user-space code does was wrong!

If kexec-tools places the kernel in the low region, there may not be enough memory left
for whatever purpose it was reserved for. This was the motivation for giving it a
different name.


Thanks,

James
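
To make the /proc/iomem point above concrete, here is a minimal sketch of collecting every "Crash kernel" entry and then choosing the higher range, as described for the x86 user-space code. It is written in Python for brevity and is not the kexec-tools implementation: `crash_kernel_ranges`, `pick_load_range` and the `SAMPLE` addresses are hypothetical, chosen to resemble a 250M low region and a 2048M region at 35G.

```python
import re

# One /proc/iomem line looks like "  start-end : label" with hex addresses.
IOMEM_LINE = re.compile(r"^\s*([0-9a-fA-F]+)-([0-9a-fA-F]+)\s*:\s*(.+?)\s*$")


def crash_kernel_ranges(iomem_text):
    """Return every (start, end) range labelled 'Crash kernel', in file order."""
    ranges = []
    for line in iomem_text.splitlines():
        m = IOMEM_LINE.match(line)
        if m and m.group(3) == "Crash kernel":
            ranges.append((int(m.group(1), 16), int(m.group(2), 16)))
    return ranges


def pick_load_range(ranges):
    """Pick the highest-addressed range so the low one stays free for DMA."""
    return max(ranges, key=lambda r: r[0]) if ranges else None


# Hypothetical sample, not captured from a real machine.
SAMPLE = """\
00000000-7fffffff : System RAM
  6f000000-7e9fffff : Crash kernel
880000000-fffffffff : System RAM
  8c0000000-93fffffff : Crash kernel
"""

if __name__ == "__main__":
    found = crash_kernel_ranges(SAMPLE)
    print([(hex(start), hex(end)) for start, end in found])
    print(pick_load_range(found))
```

Stopping at the first match instead of collecting all entries would return the low range here, which is the placement problem the quoted concern describes.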

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
