Re: [PATCH v3 01/12] fpga: dfl: fme: support 512bit data width PR
From: Greg KH <gregkh@linuxfoundation.org>
Date: 2019-07-24 09:35:37
Also in:
linux-doc, linux-fpga, lkml
On Tue, Jul 23, 2019 at 12:51:24PM +0800, Wu Hao wrote:
quoted hunk ↗ jump to hunk
In early partial reconfiguration private feature, it only supports 32bit data width when writing data to hardware for PR. 512bit data width PR support is an important optimization for some specific solutions (e.g. XEON with FPGA integrated), it allows driver to use AVX512 instruction to improve the performance of partial reconfiguration. e.g. programming one 100MB bitstream image via this 512bit data width PR hardware only takes ~300ms, but 32bit revision requires ~3s per test result. Please note now this optimization is only done on revision 2 of this PR private feature which is only used in integrated solution that AVX512 is always supported. This revision 2 hardware doesn't support 32bit PR. Signed-off-by: Ananda Ravuri <redacted> Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Wu Hao <redacted> Acked-by: Alan Tull <atull@kernel.org> Signed-off-by: Moritz Fischer <mdf@kernel.org> --- v2: remove DRV/MODULE_VERSION modifications --- drivers/fpga/dfl-fme-mgr.c | 110 ++++++++++++++++++++++++++++++++++++++------- drivers/fpga/dfl-fme-pr.c | 43 +++++++++++------- drivers/fpga/dfl-fme.h | 2 + drivers/fpga/dfl.h | 5 +++ 4 files changed, 129 insertions(+), 31 deletions(-)diff --git a/drivers/fpga/dfl-fme-mgr.c b/drivers/fpga/dfl-fme-mgr.c index b3f7eee..46e17f0 100644 --- a/drivers/fpga/dfl-fme-mgr.c +++ b/drivers/fpga/dfl-fme-mgr.c@@ -22,6 +22,7 @@ #include <linux/io-64-nonatomic-lo-hi.h> #include <linux/fpga/fpga-mgr.h> +#include "dfl.h" #include "dfl-fme-pr.h" /* FME Partial Reconfiguration Sub Feature Register Set */@@ -30,6 +31,7 @@ #define FME_PR_STS 0x10 #define FME_PR_DATA 0x18 #define FME_PR_ERR 0x20 +#define FME_PR_512_DATA 0x40 /* Data Register for 512bit datawidth PR */ #define FME_PR_INTFC_ID_L 0xA8 #define FME_PR_INTFC_ID_H 0xB0@@ -67,8 +69,43 @@ #define PR_WAIT_TIMEOUT 8000000 #define PR_HOST_STATUS_IDLE 0 +#if defined(CONFIG_X86) && defined(CONFIG_AS_AVX512) + +#include <linux/cpufeature.h> +#include <asm/fpu/api.h> + +static inline int is_cpu_avx512_enabled(void) +{ + return cpu_feature_enabled(X86_FEATURE_AVX512F); +}
That's a very arch specific function, why would a driver ever care about this?
+
+static inline void copy512(const void *src, void __iomem *dst)
+{
+ kernel_fpu_begin();
+
+ asm volatile("vmovdqu64 (%0), %%zmm0;"
+ "vmovntdq %%zmm0, (%1);"
+ :
+ : "r"(src), "r"(dst)
+ : "memory");
+
+ kernel_fpu_end();
+}Shouldn't this be an arch-specific function somewhere? Burying this in a random driver is not ok. Please make this generic for all systems.
+#else
+static inline int is_cpu_avx512_enabled(void)
+{
+ return 0;
+}
+
+static inline void copy512(const void *src, void __iomem *dst)
+{
+ WARN_ON_ONCE(1);Are you trying to get reports from syzbot? :) Please fix this all up. greg k-h