Thread: 389 messages, 13 authors, 2021-08-21

Re: Folios give an 80% performance win

From: James Bottomley <James.Bottomley@HansenPartnership.com>
Date: 2021-07-24 19:21:19
Also in: linux-fsdevel, lkml

On Sat, 2021-07-24 at 19:50 +0100, Matthew Wilcox wrote:
> On Sat, Jul 24, 2021 at 11:23:25AM -0700, James Bottomley wrote:
> > On Sat, 2021-07-24 at 19:14 +0100, Matthew Wilcox wrote:
> > > On Sat, Jul 24, 2021 at 11:09:02AM -0700, James Bottomley wrote:
> > > > On Sat, 2021-07-24 at 18:27 +0100, Matthew Wilcox wrote:
> > > > > What blows me away is the 80% performance improvement for
> > > > > PostgreSQL. I know they use the page cache extensively, so
> > > > > it's plausibly real. I'm a bit surprised that it has such
> > > > > good locality, and the size of the win far exceeds my
> > > > > expectations.  We should probably dive into it and figure
> > > > > out exactly what's going on.
> > > >
> > > > Since none of the other tested databases showed more than a 3%
> > > > improvement, this looks like an anomalous result specific to
> > > > something in postgres ... although the next biggest db: mariadb
> > > > wasn't part of the tests so I'm not sure that's
> > > > definitive.  Perhaps the next step should be to test mariadb?
> > > > Since they're fairly similar in domain (both full SQL) if
> > > > mariadb shows this type of improvement, you can safely assume
> > > > it's something in the way SQL databases handle paging and if
> > > > it doesn't, it's likely fixing a postgres inefficiency.
> > >
> > > I think the thing that's specific to PostgreSQL is that it's a
> > > heavy user of the page cache.  My understanding is that most
> > > databases use direct IO and manage their own page cache, while
> > > PostgreSQL trusts the kernel to get it right.
> >
> > That's testable with mariadb, at least for the innodb engine since
> > the flush_method is settable.
>
> We're still not communicating well.  I'm not talking about writes,
> I'm talking about reads.  Postgres uses the page cache for reads.
> InnoDB uses O_DIRECT (afaict).  See articles like this one:
> https://www.percona.com/blog/2018/02/08/fsync-performance-storage-devices/

If it were all about reads, wouldn't the Phoronix pgbench read-only
test have shown a better improvement than 7%?  I think the Phoronix
data shows that whatever it is, it's to do with writes ... that does
imply something in the way the log syncs data.

James
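
For readers following the page cache versus O_DIRECT distinction argued in this message, here is a minimal sketch of the two read paths in Python. It is illustrative only and is not taken from PostgreSQL or InnoDB. The function names `read_buffered` and `read_direct` are hypothetical, and the 4096-byte block size is an assumed alignment that may not match a given device or filesystem.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; real devices may need a different size


def read_buffered(path, length=BLOCK):
    """Ordinary read: served from, and populating, the kernel page cache."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, length)
    finally:
        os.close(fd)


def read_direct(path, length=BLOCK):
    """Read with O_DIRECT, asking the kernel to bypass the page cache.

    Returns None where O_DIRECT is missing or rejected (non-Linux
    platforms, some filesystems).
    """
    flag = getattr(os, "O_DIRECT", None)
    if flag is None:
        return None
    try:
        fd = os.open(path, os.O_RDONLY | flag)
    except OSError:
        return None
    try:
        # An anonymous mmap is page-aligned, which O_DIRECT typically requires.
        buf = mmap.mmap(-1, length)
        try:
            n = os.readv(fd, [buf])
            return bytes(buf[:n])
        finally:
            buf.close()
    except OSError:
        return None
    finally:
        os.close(fd)
```

An application using the second path has to supply its own caching, which is the role Matthew attributes to InnoDB's buffer management, while the first path leaves caching to the kernel, so it is the one where page cache changes in the kernel would be expected to show up.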

