Thread (17 messages) 17 messages, 4 authors, 2016-09-01

Re: 4.7.0-rc7 ext4 error in dx_probe

From: Török Edwin <hidden>
Date: 2016-08-09 07:19:43
Also in: lkml

Possibly related (same subject, not in this thread)

On 2016-08-09 05:37, Darrick J. Wong wrote:
On Tue, Aug 09, 2016 at 12:13:01AM +0300, Török Edwin wrote:
quoted
On 2016-08-08 19:55, Darrick J. Wong wrote:
quoted
On Mon, Aug 08, 2016 at 12:08:18PM -0400, Theodore Ts'o wrote:
quoted
On Sun, Aug 07, 2016 at 11:28:10PM -0700, Darrick J. Wong wrote:
quoted
I have one lingering concern -- is it a bug that two processes could be
computing the checksum of a buffer simultaneously?  I would have thought ext4
would serialize that kind of buffer_head access...
Do we know how this is happening?  We've always depended on the VFS to
provide this exclusion.  The only way we should be modifying the
buffer_head at the same time if two CPU's are trying to modify the
directory at the same time, and that should _never_ be happening, even
with the new directory parallism code, unless the file system has
given permission and intends to do its own fine-grained locking.
It's a combination of two things, I think.  The first is that the
checksum calculation routine (temporarily) set the checksum field to
zero during the computation, which of course is a no-no.  The patch
fixes that problem and should go in.
Thanks a lot for the patch.
I wrote a small testcase (see below) that triggers the problem quite soon on
my box with kernel 4.7.0, and seems to have survived so far with kernel
4.7.0+patch.
When it failed it printed something like "readdir: Bad message".

The drop caches part is quite important for triggering the bug, and might
explain why this bug was hard to reproduce: IIUC this race condition can
happen only if 2+ threads/processes try to access the same directory, and the
directory's inode is not in the cache (i.e. was never cached, or got kicked
out of the cache).
Could you formulate this into an xfstest, please?  It would be very useful to
have this as a regression test.

(Or attach a Signed-off-by and I'll take care of it eventually.)
I've attached a signed-off-by line and a copyright header (feel free to add yourself in the copyright header too):

Signed-off-by: Török Edwin <redacted>
quoted
/*
 $ gcc trigger.c -o trigger -pthread
 $ ./trigger
*/
/*
 * Test concurrent readdir on uncached inodes
 *
 * Copyright (C) 2016 Skylable Ltd.
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version 2
 * of the License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
 */
quoted
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>
#include <pthread.h>
#include <unistd.h>
#include <fcntl.h>

#define FILES 100000
#define THREADS 16
#define LOOPS 1000

static void die(const char *msg)
{
	perror(msg);
	exit(EXIT_FAILURE);
}

static void* list(void* arg)
{
	for(int i=0;i<LOOPS;i++) {
		DIR *d = opendir(".");
		if (!d) {
			die("opendir");
		}
		errno = 0;
		while(readdir(d)) {}
		if (errno) {
			die("readdir");
		}
		closedir(d);
		FILE *f = fopen("/proc/sys/vm/drop_caches", "w");
		if (f) {
			fputs("3", f);
			fclose(f);
		}
	}
	return NULL;
}

int main()
{
	pthread_t t[THREADS];

	if(mkdir("ext4test", 0755) < 0 && errno != EEXIST)
		die("mkdir");
	if(chdir("ext4test") < 0)
		die("chdir");
	for (unsigned i=0;i < FILES;i++) {
		char name[16];
		snprintf(name, sizeof(name), "%d", i); 
		int fd = open(name, O_WRONLY|O_CREAT, 0600);
		if (fd < 0)
			die("open");
		close(fd);
	}
	for (unsigned i=0;i < sizeof(t)/sizeof(t[0]); i++) {
		pthread_create(&t[i], NULL,list, NULL);
	}
	for (unsigned i=0;i < sizeof(t)/sizeof(t[0]); i++) {
		pthread_join(t[i], NULL);
	}
	return 0;
}



-- 
Edwin Török | Co-founder and Lead Developer

Skylable open-source object storage: reliable, fast, secure
http://www.skylable.com

-- 
Edwin Török | Co-founder and Lead Developer

Skylable open-source object storage: reliable, fast, secure
http://www.skylable.com
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help