Re: [PATCH v3 1/2] tpm: fix reference counting for struct tpm_chip
From: Jason Gunthorpe <jgg@ziepe.ca>
Date: 2021-02-06 02:45:17
Also in:
lkml, stable
On Fri, Feb 05, 2021 at 10:50:02PM +0100, Lino Sanfilippo wrote:
On 05.02.21 at 16:58, Jason Gunthorpe wrote: eference in the first place).quoted
No, they are all chained together because they are all in the same struct: struct tpm_chip { struct device dev; struct device devs; struct cdev cdev; struct cdev cdevs; dev holds the refcount on memory, when it goes 0 the whole thing is kfreed. The rule is dev's refcount can't go to zero while any other refcount is != 0. For instance devs holds a get on dev that is put back only when devs goes to 0: static void tpm_devs_release(struct device *dev) { struct tpm_chip *chip = container_of(dev, struct tpm_chip, devs); /* release the master device reference */ put_device(&chip->dev); } Both cdev elements do something similar inside the cdev layer.Well this chaining is exactly what does not work nowadays and what the patch is supposed to fix: currently we dont ever take the extra ref (not even in TPM 2 case, note that TPM_CHIP_FLAG_TMP2 is never set), so - if (chip->flags & TPM_CHIP_FLAG_TPM2) - get_device(&chip->dev); + get_device(&chip->dev);
Oh, hah, yes that is busted up. The patch sketch I sent to James is the right way to handle it, feel free to take it up
and tpm_devs_release() is never called, since there is nothing that ever puts devs, so
Yes, that is a pre-existing memory leak
The race with only get_device()/putdevice() in tpm_common_open()/tpm_common_release() is:
The refcount handling is busted up and not working the way it is designed, when that is fixed there is no race. Jason