Thread (20 messages) 20 messages, 4 authors, 2020-06-11

Re: [PATCH v11 5/6] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods

From: Dan Williams <hidden>
Date: 2020-06-10 15:11:53
Also in: lkml, nvdimm, oe-kbuild-all

On Wed, Jun 10, 2020 at 5:10 AM Vaibhav Jain [off-list ref] wrote:
Dan Williams [off-list ref] writes:
quoted
On Tue, Jun 9, 2020 at 10:54 AM Vaibhav Jain [off-list ref] wrote:
quoted
Thanks Dan for the consideration and taking time to look into this.

My responses below:

Dan Williams [off-list ref] writes:
quoted
On Mon, Jun 8, 2020 at 5:16 PM kernel test robot [off-list ref] wrote:
quoted
Hi Vaibhav,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on powerpc/next]
[also build test WARNING on linus/master v5.7 next-20200605]
[cannot apply to linux-nvdimm/libnvdimm-for-next scottwood/next]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Vaibhav-Jain/powerpc-papr_scm-Add-support-for-reporting-nvdimm-health/20200607-211653
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-randconfig-r016-20200607 (attached as .config)
compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project e429cffd4f228f70c1d9df0e5d77c08590dd9766)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install powerpc cross compiling tool for clang build
        # apt-get install binutils-powerpc-linux-gnu
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <redacted>

All warnings (new ones prefixed by >>, old ones prefixed by <<):

In file included from <built-in>:1:
quoted
quoted
./usr/include/asm/papr_pdsm.h:69:20: warning: field 'hdr' with variable sized type 'struct nd_cmd_pkg' not at the end of a struct or class is a GNU extension [-Wgnu-variable-sized-type-not-at-end]
struct nd_cmd_pkg hdr;  /* Package header containing sub-cmd */
Hi Vaibhav,
[.]
quoted
This looks like it's going to need another round to get this fixed. I
don't think 'struct nd_pdsm_cmd_pkg' should embed a definition of
'struct nd_cmd_pkg'. An instance of 'struct nd_cmd_pkg' carries a
payload that is the 'pdsm' specifics. As the code has it now it's
defined as a superset of 'struct nd_cmd_pkg' and the compiler warning
is pointing out a real 'struct' organization problem.

Given the soak time needed in -next after the code is finalized this
there's no time to do another round of updates and still make the v5.8
merge window.
Agreed that this looks bad, a solution will probably need some more
review cycles resulting in this series missing the merge window.

I am investigating into the possible solutions for this reported issue
and made few observations:

I see command pkg for Intel, Hpe, Msft and Hyperv families using a
similar layout of embedding nd_cmd_pkg at the head of the
command-pkg. struct nd_pdsm_cmd_pkg is following the same pattern.

struct nd_pdsm_cmd_pkg {
    struct nd_cmd_pkg hdr;
    /* other members */
};

struct ndn_pkg_msft {
    struct nd_cmd_pkg gen;
    /* other members */
};
struct nd_pkg_intel {
    struct nd_cmd_pkg gen;
    /* other members */
};
struct ndn_pkg_hpe1 {
    struct nd_cmd_pkg gen;
    /* other members */
[.]
quoted
In those cases the other members are a union and there is no second
variable length array. Perhaps that is why those definitions are not
getting flagged? I'm not seeing anything in ndctl build options that
would explicitly disable this warning, but I'm not sure if the ndctl
build environment is missing this build warning by accident.
I tried building ndctl master with clang-10 with CC=clang and
CFLAGS="". Seeing the same warning messages reported for all command
package struct for existing command families.

./hpe1.h:334:20: warning: field 'gen' with variable sized type 'struct nd_cmd_pkg' not at the end of a struct or class is a GNU extension [-Wgnu-variable-sized-type-not-at-end]
        struct nd_cmd_pkg gen;
                          ^
./msft.h:59:20: warning: field 'gen' with variable sized type 'struct nd_cmd_pkg' not at the end of a struct or class is a GNU extension [-Wgnu-variable-sized-type-not-at-end]
        struct nd_cmd_pkg       gen;
                                ^
./hyperv.h:34:20: warning: field 'gen' with variable sized type 'struct nd_cmd_pkg' not at the end of a struct or class is a GNU extension [-Wgnu-variable-sized-type-not-at-end]
        struct nd_cmd_pkg       gen;
                                ^
Good to know, but ugh now I'm just realizing this warning is only
coming from clang and not gcc. Frankly I'm not as concerned about
clang warnings and I should have been more careful looking at the
source of this warning.
quoted
Those variable size payloads are also not being used in any code paths
that would look at the size of the command payload, like the kernel
ioctl() path. The payload validation code needs static sizes and the
payload parsing code wants to cast the payload to a known type. I
don't think you can use the same struct definition for both those
cases which is why the ndctl parsing code uses the union layout, but
the kernel command marshaling code does strict layering.
Even if I switch to union layout and replacing the flexible array 'payload'
at end to a fixed size array something like below, I still see
'-Wgnu-variable-sized-type-not-at-end' warning reported by clang:

union nd_pdsm_cmd_payload {
        struct nd_papr_pdsm_health health;
        __u8 buf[ND_PDSM_PAYLOAD_MAX_SIZE];
};

struct nd_pdsm_cmd_pkg {
        struct nd_cmd_pkg hdr;  /* Package header containing sub-cmd */
        __s32 cmd_status;       /* Out: Sub-cmd status returned back */
        __u16 reserved[2];      /* Ignored and to be used in future */
        union nd_pdsm_cmd_payload payload;
} __attribute__((packed));
Even though this is a clang warning, I'm still not sure it's a good
idea to copy the ndctl approach into the kernel. Could you perhaps
handle this the way that drivers/acpi/nfit/intel.c handles submitting
commands through the ND_CMD_CALL interface, i.e. by just defining the
command locally like this (from intel_security_flags()):

        struct {
                struct nd_cmd_pkg pkg;
                struct nd_intel_get_security_state cmd;
        } nd_cmd = {
                .pkg = {
                        .nd_command = NVDIMM_INTEL_GET_SECURITY_STATE,
                        .nd_family = NVDIMM_FAMILY_INTEL,
                        .nd_size_out =
                                sizeof(struct nd_intel_get_security_state),
                        .nd_fw_size =
                                sizeof(struct nd_intel_get_security_state),
                },
        };

That way it's clear that the payload is 'struct
nd_intel_get_security_state' without needing to have a pre-existing
definition. For parsing in the ioctl path I think it's clearer to cast
the payload to the local pdsm structure for the command.
quoted
quoted
};

Even though other command families implement similar command-package
layout they were not flagged (yet) as they are (I am guessing) serviced
in vendor acpi drivers rather than in kernel like in case of papr-scm
command family.
I sincerely hope there are no vendor acpi kernel drivers outside of
the upstream one.
Apologies if I was not clear. Was referring to nvdimm vendor uefi
drivers which ultimately service the DSM commands. Since CMD_CALL serves
as a conduit to send the command payload to these vendor drivers,
libnvdimm never needs to peek into the nd_cmd_pkg.payload
field. Consequently nfit module never hit this warning in kernel before.
Ah, understood, and no, that's not the root reason this problem is not
present in the kernel. The expectation is that any payload that the
kernel would need to consider should probably have a kernel specific
translation defined. For example,

        ND_CMD_GET_CONFIG_SIZE
        ND_CMD_GET_CONFIG_DATA
        ND_CMD_SET_CONFIG_DATA

...are payloads that the kernel needs to understand. However instead
of supporting each way to read / write the label area the expectation
is that all drivers just parse this common kernel payload and
translate it if necessary. For example ND_CMD_{GET,SET}_CONFIG_DATA is
optionally translated to the Intel DSMs, generic ACPI _LSR/_LSW, or
papr_scm_meta_{get,set}.

Outside of validating command numbers the expectation is that the
kernel does not validate/consume the contents of the ND_CMD_CALL
payload, it passes it to the backend where ACPI DSM or pdsm protocol
takes over.
quoted
quoted
So, I think this issue is not just specific to papr-scm command family
introduced in this patch series but rather across all other command
families. Every other command family assumes 'struct nd_cmd_pkg_hdr' to
be embeddable and puts it at the beginning of their corresponding
command-packages. And its only a matter of time when someone tries
filtering/handling of vendor specific commands in nfit module when they
hit similar issue.

Possible Solutions:

* One way would be to redefine 'struct nd_cmd_pkg' to mark field
  'nd_payload[]' from a flexible array to zero sized array as
  'nd_payload[0]'.
I just went through a round of removing the usage of buf[0] in ndctl
since gcc10 now warns about that too.
quoted
This should make 'struct nd_cmd_pkg' embeddable and
  clang shouldn't report 'gnu-variable-sized-type-not-at-end'
  warning. Also I think this change shouldn't introduce any ABI change.

* Another way to solve this issue might be to redefine 'struct
  nd_pdsm_cmd_pkg' to below removing the 'struct nd_cmd_pkg' member. This
  struct should immediately follow the 'struct nd_cmd_pkg' command package
  when sent to libnvdimm:

  struct nd_pdsm_cmd_pkg {
        __s32 cmd_status;       /* Out: Sub-cmd status returned back */
        __u16 reserved[2];      /* Ignored and to be used in future */
        __u8 payload[];
        };

  This should remove the flexible member nc_cmd_pkg.nd_payload from the
  struct with just one remaining at the end. However this would make
  accessing the [in|out|fw]_size members of 'struct nd_cmd_pkg'
  difficult for the pdsm servicing functions.


Any other solution that you think, may solve this issue ?
The union might help, but per the above I think only for parsing the
command at which point I don't think the kernel needs a unified
structure defining both the generic envelope and the end-point
specific payload at once.
As I tested above, switching to union too will not solve the clang
warning.

Having a unified structure for a command family defining both
generic envelop and end-point specific payload, is what I see all the
existing command families doing.

However if I split 'struct nd_pdsm_cmd_pkg' to not have an embedded
'struct nd_cmd_pkg' then it goes opposite to what existing command family
implementations.

So to me it looks like no clear way to address this :-(

Another non-conventional way to fix this might be to suppress this clang
warning messages by adding "CFLAGS_papr_scm.o +=
-Wno-gnu-variable-sized-type-not-at-end" to papr_scm Makefile.
No, I don't think it's appropriate to customize clang warnings. Have a
look at splitting parsing vs local command submission following the
approach taken in drivers/acpi/nfit/intel.c.
Current implementation 'struct nd_cmd_pkg' clearly depends on gcc
extension of having a flexible payload array which is allowed to be not
necessarily placed at the end of a containing struct. So the problem can be
attributed to difference in compiler implementations between GCC and
Clang rather than how 'struct nd_pdsm_cmd_pkg' and 'struct nd_cmd_pkg'
are laid out.

--
Cheers
~ Vaibhav
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help