Thread: 5 messages, 2 authors, 2016-09-09

Re: highmem issues with 3.14.10 (LST)

From: Sagar Borikar <hidden>
Date: 2016-09-09 13:17:47

Thanks James.

On Fri, Sep 9, 2016 at 5:36 AM, James Hogan [off-list ref] wrote:
Hi Sagar,

On Thu, Sep 08, 2016 at 08:33:57PM -0700, Sagar Borikar wrote:
quoted
Hello,

I am upgrading the kernel for a MIPS interAptiv CPU from 3.10.60 to
3.14.10 (stable) from:
https://www.linux-mips.org/wiki/Malta_Linux_Repository
Unfortunately that wiki page needs updating.

If you're upgrading anyway, I think we'd recommend switching all the way
to a recent mainline kernel release / stable branch, e.g. 4.4 (LTS) or
4.7 (and maybe update to 4.9 (LTS) when it is released or when 4.7 goes
EOL). I think all the stuff you'll need for interAptiv should be in
mainline now anyway.
I see. We generally upgrade to the Malta repo as it's maintained by MIPS
(imgtec). I presume you are referring to kernel.org. I think linux-mti
has 4.1.7 as stable, right?
quoted
The platform has non-contiguous low memory and high memory. After the
upgrade, highmem is not enabled because max_low_pfn and highend_pfn
are not the same.

The commit cce335ae47e231398269fb05fa48e0e9cbf289e0 apparently introduced
this change for the SiByte platform. That change doesn't hold for all
platforms where high memory and low memory are sparse (non-contiguous).

If I comment out only the following change in arch/mips/mm/init.c, highmem
gets initialized properly.

    if (cpu_has_dc_aliases && max_low_pfn != highend_pfn) {
        printk(KERN_WARNING "This processor doesn't support highmem."
               " %ldk highmem ignored\n",
               (highend_pfn - max_low_pfn) << (PAGE_SHIFT - 10));
        max_zone_pfns[ZONE_HIGHMEM] = max_low_pfn;
        lastpfn = max_low_pfn;
    }
I don't think we ever supported DCache aliasing + highmem in
combination.
Interesting. We are currently running 3.10.60 and it appears to work.
Are you saying it may cause issues? So far we haven't seen any
problems. What kind of issues might it lead to?
If you want to use that memory your options are probably:
- increase the page size to avoid dcache aliasing.
OK, thanks. I will need to experiment with this, but I am a bit baffled
as to how it's working in 3.10.60.
Generally, are there any reference platforms based on interAptiv that
use highmem with dcache aliasing? I might have missed one, but I couldn't
find any platform that comes close in either tree.
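
For context on the page-size suggestion, here is a minimal sketch of the usual rule of thumb: a virtually indexed D-cache can alias when a single cache way is larger than a page. The function names and the 64 KB, 4-way geometry in the usage note are hypothetical illustrations, not values reported in this thread, and this is not the kernel's own code (the MIPS cache setup code makes a comparable way-size versus page-size comparison, to the best of my understanding).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bytes covered by one way of a set-associative cache. */
size_t dcache_way_size(size_t cache_bytes, unsigned int ways)
{
	assert(ways > 0);
	return cache_bytes / ways;
}

/*
 * A virtually indexed, physically tagged D-cache can alias when one way
 * spans more than one page: some index bits then come from the virtual
 * page number, so two virtual mappings of the same physical page may
 * land in different cache lines.
 */
bool dcache_has_aliases(size_t cache_bytes, unsigned int ways,
			size_t page_bytes)
{
	return dcache_way_size(cache_bytes, ways) > page_bytes;
}
```

Under these assumptions, dcache_has_aliases(64 * 1024, 4, 4096) is true (a 16 KB way against 4 KB pages), while dcache_has_aliases(64 * 1024, 4, 16384) is false, which is the effect the larger page size is meant to achieve.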
- OR use EVA to increase the size of lowmem, which at the moment is a
  bit more involved. How much RAM do you have, and what does your
  physical memory map look like?
Total memory is 2GB. The memory map looks like this:

low mem (~66MB):

 * 0x04300000 +-----------------+
 *            |      Linux      |
 * 0x00043000 +-----------------+

high mem (128MB):

 * 0x28000000 +-----------------+
 *            |     linuxhi     |
 * 0x20000000 +-----------------+

The rest of the blocks are reserved.
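
To relate this map to the check in arch/mips/mm/init.c, here is a small sketch of the PFN arithmetic, assuming 4 KB pages (PAGE_SHIFT of 12). The helper names are hypothetical. Which value max_low_pfn actually takes on this platform is not stated in the thread, so the usage note covers two candidates: the start of the "linuxhi" block (0x20000000) and the end of the low block (0x04300000).

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4 KB pages */

/* Page frame number that a physical address falls in. */
uint32_t pfn_down(uint32_t phys_addr)
{
	return phys_addr >> PAGE_SHIFT;
}

/*
 * Size in KB of the PFN range [max_low_pfn, highend_pfn), using the
 * same shift as the printk() in the quoted check.
 */
uint32_t highmem_kb(uint32_t max_low_pfn, uint32_t highend_pfn)
{
	assert(highend_pfn >= max_low_pfn);
	return (highend_pfn - max_low_pfn) << (PAGE_SHIFT - 10);
}
```

With highend_pfn = pfn_down(0x28000000) = 0x28000, the first candidate (max_low_pfn = 0x20000) gives 131072k, matching the 128MB block, and the second (max_low_pfn = 0x4300) gives 586752k because the hole between the blocks is counted too. In both cases max_low_pfn differs from highend_pfn, so the quoted condition is true whenever cpu_has_dc_aliases is set.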

Thanks
Sagar
Cheers
James
quoted
So I wanted to know whether any additional change is required in the
platform code to work with the above codebase.
Secondly, when the system proceeds (with the above code commented out),
execve seems to cause a panic in copy_strings:

Kernel bug detected[#1]:
CPU: 0 PID: 177 Comm: mcp Not tainted 3.14.10 #19
task: 82c99070 ti: 829b0000 task.ti: 829b0000
$ 0   : 00000000 81a40018 00000001 00000528
$ 4   : 806805b0 00000294 00000000 81c76000
$ 8   : 82c99070 fe001ffc 00000000 805d0000
$12   : 00000000 00000000 00000000 00000001
$16   : 8214a760 00000000 81a40010 82c2c580
$20   : ffffffff 7fff7000 00000000 00000008
$24   : 00000000 801182a0
$28   : 829b0000 829b1e78 8214a760 801bb0bc
Hi    : 000000e1
Lo    : 00077c44
epc   : 801bb014 copy_strings+0x304/0x394
    Not tainted
ra    : 801bb0bc copy_strings_kernel+0x18/0x2c
Status: 1100fc03        KERNEL EXL IE
Cause : 10800034
PrId  : 0001a020 (MIPS interAptiv)
Modules linked in:
Process mcp (pid: 177, threadinfo=829b0000, task=82c99070, tls=770b82f0)
Stack : 00000080 00000000 00000000 00000000 00000017 829b1e98 00000000 00000000
          8214a760 82bba0b0 fe001000 00000ff4 80000000 00000080 82bba0b0 81a40000
          80b12b00 00000001 80b12b00 7fe5e66c 81c40000 801bb0bc 80b12b00 82c2c630
          82c2c580 00000080 82c2c580 801bc4d4 00000003 8013452c 7649e000 7648fa08
          82c99234 00000000 00000601 80b12b34 7649e000 7648fa08 7649e000 7fe5dc50
         ...
Call Trace:
[<801bb014>] copy_strings+0x304/0x394
[<801bb0bc>] copy_strings_kernel+0x18/0x2c
[<801bc4d4>] do_execve+0x2fc/0x4c4
[<8010d37c>] handle_sys+0x11c/0x140
Code: 0806ec05  00000000  24020001 <00020336> 0c045e64  02002021  0c0651dd  02002021  0806ec1d
---[ end trace ed487c3c490d886b ]---
BUG: Bad rss-counter state mm:828bd6a0 idx:1 val:2

This panic occurs only when I spawn a nested fork/execve. If I spawn the
process directly without nesting, I don't see this panic.

It looks like there are several reports of the "Bad rss-counter state"
panic with 3.14-stable, but I couldn't find any concrete solution to
it.

Any pointers?

Thanks

Sagar