Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
From: Ric Wheeler <hidden>
Date: 2012-01-19 21:39:53
Also in:
linux-fsdevel
On 01/19/2012 04:30 PM, Loke, Chetan wrote:
> From: Ric Wheeler [mailto:rwheeler@redhat.com]
> Sent: January 19, 2012 12:44 PM
> To: Loke, Chetan
> Cc: Tom Coughlan; Hannes Reinecke; tasleson@redhat.com; Shyam_Iyer@Dell.com; vgoyal@redhat.com; linux-fsdevel@vger.kernel.org; linux-scsi@vger.kernel.org
> Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
>
>> On 01/19/2012 12:32 PM, Loke, Chetan wrote:
>>>>> True, a single front-end won't see all of those LUNs/devices. So not
>>>>> a big concern about the front-end hosts. I am thinking of a use-case
>>>>> where folks can use a linux-box to manage their different storage
>>>>> arrays. So this linux box with 'libstoragemgmt + app' needs to manage
>>>>> (scan/create/delete/so on) all those LUNs.
>>>> People do have boxes with thousands of luns though & file systems in
>>>> active use. Both for SAN and NAS volumes.
>>>>
>>>> One of the challenges is what to do when just one LUN (or NFS server)
>>>> crashes and burns.
>>> The FS needs to go read-only (plain & simple) because you don't know
>>> what's going on. You can't risk writing data anymore. Let the apps
>>> fail. You can make it happen even today. It's a simple exercise.
>> Nope - it needs to be torn down and we need to be able to cleanly
>> unmount it.
>>
>> Letting an application see a read-only file system when the disk is
>> gone or server down is not very useful since you won't get any
>> non-cached data back.
> Sure, it's just a partial snapshot (aka cached-data) of the file-system.
> But writes that have to fetch the non-cached data, will unnecessarily
> issue I/O to the fabric. These orphaned I/O's cause more pain in the
> cleanup. And if caching is enabled on the front-side then it's all the
> more painful. We can go one extra step and make FS fail read I/O for
> non-cached data too to avoid more orphan IOs.
I don't really see this as a useful state. Read-only without a real backing file system or LUN is hit or miss; that file system should go offline :)
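
For readers following the read-only proposal, here is a minimal sketch of how a management tool might notice file systems that have ended up mounted read-only. It assumes text in the /proc/mounts format (device, mount point, type, options, dump, pass). The function name `read_only_mounts` and the sample data are hypothetical and not taken from the thread. It only reports the state; it says nothing about whether the backing LUN or server is still reachable, which is the concern raised above.

```python
def read_only_mounts(mounts_text):
    """Return mount points whose option list contains 'ro'.

    mounts_text is expected in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3]
        # Options are comma separated; match 'ro' as a whole option only,
        # so 'errors=remount-ro' on its own does not count.
        if "ro" in options.split(","):
            result.append(mount_point)
    return result


# Hypothetical sample data, not taken from a real system.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /data ext4 ro,relatime,errors=remount-ro 0 0
server1:/export /mnt/nfs nfs rw,hard 0 0
"""

print(read_only_mounts(SAMPLE))  # prints ['/data']
```

On a live Linux system the same function could be fed the contents of /proc/mounts instead of the sample string.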
> Tearing down will happen sometime later. But don't you agree that
> something needs to happen before that? And that something is read-only,
> which will eventually propagate to the users (example: when you are
> copying a new file). Users will then report it to their IT/admins.
> This approach of serving the snap-shot (cached) file-system could serve
> some users for what it's worth. It's better than surprise-removal and
> issuing needless IOs (read - eh race conditions).
>> Also, if you have an ability to migrate that mount (same mount point)
>> to another server or clone LUN, you want to unmount the source so you
>> can remount the data under that same mount point/namespace....
> Won't this be protocol specific?
Not really protocol specific. We need to be able to do a forced unmount and then do fail over (that varies depending on many things like your HA framework and certainly the type of thing you are attempting to fail over).

Ric
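
The ordering described above (forced unmount first, then fail over to the same mount point) can be sketched roughly as follows. This is an illustration under stated assumptions, not how any particular HA framework does it: the function name `failover_mount`, the `run` parameter and the example paths are hypothetical. The only real commands used are `umount -f` (forced unmount) and `mount`. The command runner is injectable so the ordering can be exercised without touching real mounts.

```python
import subprocess


def failover_mount(mount_point, new_source, run=subprocess.run):
    """Force-unmount mount_point, then mount new_source at the same path.

    Returns True only if both steps succeed. 'run' is any callable that
    takes a command list and returns an object with a 'returncode'
    attribute (subprocess.run by default).
    """
    # Step 1: forced unmount of the failed source.
    result = run(["umount", "-f", mount_point])
    if result.returncode != 0:
        # The old mount is still in place, so do not mount the
        # replacement on top of it.
        return False

    # Step 2: mount the replacement (another server or a cloned LUN)
    # under the same mount point, keeping the namespace unchanged.
    result = run(["mount", new_source, mount_point])
    return result.returncode == 0
```

A call might look like `failover_mount("/mnt/data", "server2:/export")`, where both arguments are made-up examples. A real fail over would also have to deal with open files, fencing and whatever the HA framework requires, none of which is shown here.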