Thread (25 messages) 25 messages, 3 authors, 2019-08-15

Re: [PATCH v3 01/12] fpga: dfl: fme: support 512bit data width PR

From: Greg KH <gregkh@linuxfoundation.org>
Date: 2019-07-24 09:35:37
Also in: linux-doc, linux-fpga, lkml

On Tue, Jul 23, 2019 at 12:51:24PM +0800, Wu Hao wrote:
quoted hunk ↗ jump to hunk
In early partial reconfiguration private feature, it only
supports 32bit data width when writing data to hardware for
PR. 512bit data width PR support is an important optimization
for some specific solutions (e.g. XEON with FPGA integrated),
it allows driver to use AVX512 instruction to improve the
performance of partial reconfiguration. e.g. programming one
100MB bitstream image via this 512bit data width PR hardware
only takes ~300ms, but 32bit revision requires ~3s per test
result.

Please note now this optimization is only done on revision 2
of this PR private feature which is only used in integrated
solution that AVX512 is always supported. This revision 2
hardware doesn't support 32bit PR.

Signed-off-by: Ananda Ravuri <redacted>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <redacted>
Acked-by: Alan Tull <atull@kernel.org>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
---
v2: remove DRV/MODULE_VERSION modifications
---
 drivers/fpga/dfl-fme-mgr.c | 110 ++++++++++++++++++++++++++++++++++++++-------
 drivers/fpga/dfl-fme-pr.c  |  43 +++++++++++-------
 drivers/fpga/dfl-fme.h     |   2 +
 drivers/fpga/dfl.h         |   5 +++
 4 files changed, 129 insertions(+), 31 deletions(-)
diff --git a/drivers/fpga/dfl-fme-mgr.c b/drivers/fpga/dfl-fme-mgr.c
index b3f7eee..46e17f0 100644
--- a/drivers/fpga/dfl-fme-mgr.c
+++ b/drivers/fpga/dfl-fme-mgr.c
@@ -22,6 +22,7 @@
 #include <linux/io-64-nonatomic-lo-hi.h>
 #include <linux/fpga/fpga-mgr.h>
 
+#include "dfl.h"
 #include "dfl-fme-pr.h"
 
 /* FME Partial Reconfiguration Sub Feature Register Set */
@@ -30,6 +31,7 @@
 #define FME_PR_STS		0x10
 #define FME_PR_DATA		0x18
 #define FME_PR_ERR		0x20
+#define FME_PR_512_DATA		0x40 /* Data Register for 512bit datawidth PR */
 #define FME_PR_INTFC_ID_L	0xA8
 #define FME_PR_INTFC_ID_H	0xB0
 
@@ -67,8 +69,43 @@
 #define PR_WAIT_TIMEOUT   8000000
 #define PR_HOST_STATUS_IDLE	0
 
+#if defined(CONFIG_X86) && defined(CONFIG_AS_AVX512)
+
+#include <linux/cpufeature.h>
+#include <asm/fpu/api.h>
+
+static inline int is_cpu_avx512_enabled(void)
+{
+	return cpu_feature_enabled(X86_FEATURE_AVX512F);
+}
That's a very arch specific function, why would a driver ever care about
this?
+
+static inline void copy512(const void *src, void __iomem *dst)
+{
+	kernel_fpu_begin();
+
+	asm volatile("vmovdqu64 (%0), %%zmm0;"
+		     "vmovntdq %%zmm0, (%1);"
+		     :
+		     : "r"(src), "r"(dst)
+		     : "memory");
+
+	kernel_fpu_end();
+}
Shouldn't this be an arch-specific function somewhere?  Burying this in
a random driver is not ok.  Please make this generic for all systems.
+#else
+static inline int is_cpu_avx512_enabled(void)
+{
+	return 0;
+}
+
+static inline void copy512(const void *src, void __iomem *dst)
+{
+	WARN_ON_ONCE(1);
Are you trying to get reports from syzbot?  :)

Please fix this all up.

greg k-h
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help