Thread: 9 messages, 2 authors, 2014-09-30

[PATCH] arm/arm64: KVM: Fix set_clear_sgi_pend_reg offset

From: Shannon Zhao <hidden>
Date: 2014-09-30 01:48:02
Also in: kvm

Hi Christoffer,

On 2014/9/26 21:44, Christoffer Dall wrote:
On Fri, Sep 26, 2014 at 12:16:35PM +0200, Christoffer Dall wrote:
On Fri, Sep 26, 2014 at 05:26:00PM +0800, Shannon Zhao wrote:

On 2014/9/26 16:44, Christoffer Dall wrote:
Hi Shannon,

On Fri, Sep 26, 2014 at 01:57:46PM +0800, Shannon Zhao wrote:
On 2014/9/26 1:49, Christoffer Dall wrote:
The sgi values calculated in read_set_clear_sgi_pend_reg() and
write_set_clear_sgi_pend_reg() were horribly incorrectly multiplied by 4
with catastrophic results in that subfunctions ended up overwriting
memory not allocated for the expected purpose.

This showed up as bugs in kfree() and the kernel complaining a lot if
you turn on memory debugging.

This addresses: http://marc.info/?l=kvm&m=141164910007868&w=2

Reported-by: Shannon Zhao <redacted>
Signed-off-by: Christoffer Dall <redacted>
---
 virt/kvm/arm/vgic.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index b6fab0f..8629678 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -816,7 +816,7 @@ static bool read_set_clear_sgi_pend_reg(struct kvm_vcpu *vcpu,
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
 	int sgi;
-	int min_sgi = (offset & ~0x3) * 4;
+	int min_sgi = (offset & ~0x3);
 	int max_sgi = min_sgi + 3;
 	int vcpu_id = vcpu->vcpu_id;
 	u32 reg = 0;
@@ -837,7 +837,7 @@ static bool write_set_clear_sgi_pend_reg(struct kvm_vcpu *vcpu,
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
 	int sgi;
-	int min_sgi = (offset & ~0x3) * 4;
+	int min_sgi = (offset & ~0x3);
 	int max_sgi = min_sgi + 3;
 	int vcpu_id = vcpu->vcpu_id;
 	u32 reg;
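
To illustrate why the extra multiplication is harmful, here is a minimal standalone sketch of the arithmetic in the patch. It assumes the GICv2 layout in which the SGI pending registers hold one byte per SGI, so that the byte offset into the 16-byte register bank is already the SGI number, and that only SGIs 0..15 exist. The names NR_SGIS, min_sgi_buggy(), min_sgi_fixed(), sgi_range_valid() and check_offsets() are hypothetical and do not appear in vgic.c.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_SGIS 16 /* assumption: GICv2 defines SGIs 0..15 */

/* Pre-patch calculation: scales the word-aligned byte offset by 4. */
static int min_sgi_buggy(unsigned int offset)
{
	return (offset & ~0x3u) * 4;
}

/* Post-patch calculation: the word-aligned byte offset is the first SGI. */
static int min_sgi_fixed(unsigned int offset)
{
	return offset & ~0x3u;
}

/* True if all four SGIs covered by one 32-bit access are valid SGI numbers. */
static bool sgi_range_valid(int min_sgi)
{
	return min_sgi >= 0 && min_sgi + 3 < NR_SGIS;
}

/* Hypothetical self-check over the four word offsets of the register bank. */
static void check_offsets(void)
{
	for (unsigned int offset = 0; offset < NR_SGIS; offset += 4)
		assert(sgi_range_valid(min_sgi_fixed(offset)));
}
```

Under these assumptions, offsets 0x0, 0x4, 0x8 and 0xc map to first SGIs 0, 4, 8 and 12 after the patch, but to 0, 16, 32 and 48 before it. Any per-SGI array indexed with the pre-patch values would be accessed out of bounds for every offset except 0x0, which is consistent with the memory corruption described in the commit message.
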
Hi Christoffer,

I have tested this patch for a few hours. The kfree() bug doesn't appear again.
But I have run into another problem, shown below.
The test starts 2 VMs, runs sleep 10, and then runs pkill qemu.

qemu-system-aar[1207]: unhandled level 1 permission fault (11) at 0xffffc01ed6c200, esr 0x9200000d
pgd = ffffffc012986000
[ffffc01ed6c200] *pgd=0000000000000000, *pud=0000000000000000

CPU: 1 PID: 1207 Comm: qemu-system-aar Not tainted 3.17.0-rc4+ #1
task: ffffffc87b072900 ti: ffffffc0129e0000 task.ti: ffffffc0129e0000
PC is at 0x4181a0
LR is at 0x41826c
pc : [<00000000004181a0>] lr : [<000000000041826c>] pstate: 80000000
sp : 0000007fcd38ace0
x29: 0000007fcd38ace0 x28: 0000000000000000
x27: 0000000000000000 x26: 0000000000000000
x25: 0000000000000000 x24: 0000000000000000
x23: 0000000000000000 x22: 0000000000000000
x21: 0000000000000000 x20: 0000000000000000
x19: 0000007fcd38b070 x18: 0000007fcd38ab10
x17: 0000007f9bb14480 x16: 00000000009f2370
x15: ffffffffffffffff x14: 0000000000000000
x13: 0000000000000000 x12: 0000000000000268
x11: 00000000115e5520 x10: 0101010101010101
x9 : 0000000000000004 x8 : 0000000000ac7a78
x7 : 0000000000000000 x6 : 000000000000003f
x5 : 0000000000000040 x4 : 0000000000000000
x3 : 0000000000000030 x2 : 0000000000000001
x1 : ffffffc01ed6c200 x0 : ffffffc01ed6c200
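
As a reading aid for the fault message above, the following sketch decodes the reported esr value 0x9200000d using the ARMv8 ESR_ELx field layout (EC in bits [31:26], and for data aborts WnR in bit 6 and DFSC in bits [5:0]). The helper names are hypothetical and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Exception class: ESR_ELx bits [31:26]. */
static unsigned int esr_ec(uint32_t esr)
{
	return (esr >> 26) & 0x3f;
}

/* Write-not-read flag for data aborts: ISS bit 6 (0 = read, 1 = write). */
static unsigned int esr_wnr(uint32_t esr)
{
	return (esr >> 6) & 0x1;
}

/* Data fault status code: ISS bits [5:0]. */
static unsigned int esr_dfsc(uint32_t esr)
{
	return esr & 0x3f;
}

/* Hypothetical self-check against the value reported in the log. */
static void check_reported_esr(void)
{
	const uint32_t esr = 0x9200000d;

	assert(esr_ec(esr) == 0x24);   /* data abort from a lower exception level */
	assert(esr_wnr(esr) == 0);     /* the faulting access was a read */
	assert(esr_dfsc(esr) == 0x0d); /* permission fault, level 1 */
}
```

Read this way, the value describes a read from user space that hit a level 1 permission fault, which matches the "level 1 permission fault" text in the log line. The "(11)" in that line is the signal number, SIGSEGV. The register dump shows kernel-range addresses in x0 and x1, so one plausible reading is that the QEMU process dereferenced a kernel pointer, but the log alone does not establish where that pointer came from.
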
Hmmm, I just ran a similar loop with a number of tests in the VM for a
few hours and I didn't see this error.
Yeah, it really needs to run longer.
The problem first appeared after about one hour of running, and it
appeared a second time after about 4 hours.
In any case, this patch should still be merged, but we should try to
reproduce your setup.
Your patch really solves the kfree() bug. I'll add tested-by line.
What is your command line, exact QEMU version, the file system you use,
and the guest kernel you are running?
My test script is as follows. The QEMU version is the v2.1.0 release.
The fs is linaro-image-lamp-genericarmv8-20140727-701.rootfs.tar.gz.
The host kernel is based on Marc's branch "kvmtool-vgic-dyn" with your patch
"Fix set_clear_sgi_pend_reg offset".
The guest kernel is the 3.16 release.

while true
do
qemu-system-aarch64 \
    -enable-kvm -smp 4 \
    -kernel Image \
    -m 512 -machine virt,kernel_irqchip=on \
    -initrd guestfs.cpio.gz \
    -cpu host \
    -chardev pty,id=pty0,mux=on -monitor chardev:pty0 \
    -serial chardev:pty0 -daemonize \
    -vnc 0.0.0.0:0 \
    -append "rdinit=/sbin/init console=ttyAMA0 mem=512M root=/dev/ram earlyprintk=pl011,0x9000000 rw" &

qemu-system-aarch64 \
    -enable-kvm -smp 4 \
    -kernel Image \
    -m 512 -machine virt,kernel_irqchip=on \
    -initrd guestfs.cpio.gz \
    -cpu host \
    -chardev pty,id=pty0,mux=on -monitor chardev:pty0 \
    -serial chardev:pty0 -daemonize \
    -vnc 0.0.0.0:1 \
    -append "rdinit=/sbin/init console=ttyAMA0 mem=512M root=/dev/ram earlyprintk=pl011,0x9000000 rw" &
    sleep 5
    pkill qemu
done
ok, I'll try to reproduce.
With kvmarm/queue as both host and guest and otherwise not using vnc but
nographic and a serial output, I've now been running this for 5 hours
straight without any issues. That's 1131 runs (2x number of guests
booted) and counting without seeing this...
I have run the test with kvmarm/queue as both host and guest, using
nographic and a serial output. The problem still appears.
My environment info:
kvmarm/queue:
	commit f003101732065c7e61a4d5394cfc69b01b0bb157
	arm/arm64: KVM: Fix VTTBR_BADDR_MASK and pgd alloc
qemu:
	commit 541bbb07eb197a870661ed702ae1f15c7d46aea6
	Update version for v2.1.0 release
fs:
	guest: linaro-image-minimal-genericarmv8-20140727-701.rootfs.tar.gz
	host:  use above minimal fs and add some libs from linaro-image-lamp-genericarmv8-20140727-701.rootfs.tar.gz

Once this problem has appeared, it appears every time a VM is started.
The problem info is always the same:
qemu-system-aar[1207]: unhandled level 1 permission fault (11) at 0xffffc01ed6c200, esr 0x9200000d
-Christoffer

.
-- 
Shannon