Thread (41 messages), 15 authors, 2012-01-26

Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing?

From: Ric Wheeler <hidden>
Date: 2012-01-19 21:39:53
Also in: linux-fsdevel

On 01/19/2012 04:30 PM, Loke, Chetan wrote:
quoted
From: Ric Wheeler [mailto:rwheeler@redhat.com]
Sent: January 19, 2012 12:44 PM
To: Loke, Chetan
Cc: Tom Coughlan; Hannes Reinecke; tasleson@redhat.com;
Shyam_Iyer@Dell.com; vgoyal@redhat.com; linux-fsdevel@vger.kernel.org;
linux-scsi@vger.kernel.org
Subject: Re: [LSF/MM TOPIC] linux servers as a storage server -
what's missing?

On 01/19/2012 12:32 PM, Loke, Chetan wrote:
quoted
quoted
quoted
True, a single front-end won't see all of those LUNs/devices. So not a
big concern about the front-end hosts.

I am thinking of a use-case where folks can use a linux-box to manage
their different storage arrays.
So this linux box with 'libstoragemgmt + app' needs to
manage (scan/create/delete/so on) all those LUNs.
People do have boxes with thousands of LUNs and file systems in
active use, though.
Both for SAN and NAS volumes.

One of the challenges is what to do when just one LUN (or NFS server)
crashes and burns.
The FS needs to go read-only (plain & simple) because you don't know
what's going on.
You can't risk writing data anymore. Let the apps fail. You can make it
happen even today.
It's a simple exercise.
Nope - it needs to be torn down and we need to be able to cleanly
unmount it.

Letting an application see a read-only file system when the disk is
gone or the server is down is not very useful since you won't get any
non-cached data back.
Sure, it's just a partial snapshot (aka cached data) of the file system.

But writes that have to fetch the non-cached data will unnecessarily
issue I/O to the fabric. These orphaned I/Os cause more pain in the
cleanup.
And if caching is enabled on the front-side then it's all the more
painful.

We can go one extra step and make the FS fail read I/O for non-cached
data too, to avoid more orphaned I/Os.
I don't really see this as a useful state. Read-only without a real
backing file system or LUN is hit or miss; that file system should go
offline :)
Tearing down will happen sometime later. But don't you agree that
something needs to happen before that? And that something is read-only,
which will eventually propagate to the users (for example, when you are
copying a new file).
Users will then report it to their IT/admins.
This approach of serving the snapshot (cached) file system could serve
some users, for what it's worth. It's better than surprise removal and
issuing needless I/Os (read - eh race conditions).
quoted
Also, if you have an ability to migrate that mount (same mount point)
to another server or clone LUN, you want to unmount the source so you
can remount the data under that same mount point/namespace....
Won't this be protocol specific?
Not really protocol specific. We need to be able to do a forced unmount
and then do fail over (that varies depending on many things, like your
HA framework and certainly the type of thing you are attempting to fail
over).

Ric
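
For reference, the forced unmount step discussed above roughly maps to
the Linux umount2(2) call. What follows is only a minimal Python sketch,
assuming Linux and root privileges. The helper names unmount_flags and
forced_unmount are hypothetical and do not come from any tool mentioned
in this thread. MNT_FORCE is honored by only some file systems (mostly
network ones such as NFS), and the fail over step is left out because it
depends on the HA framework in use.

```python
import ctypes
import os

# Flag values from <sys/mount.h> on Linux.
MNT_FORCE = 0x1   # ask the file system to abort pending requests first
MNT_DETACH = 0x2  # lazy unmount: detach from the namespace now, clean up later


def unmount_flags(force=True, lazy=False):
    """Combine umount2(2) flags for the requested teardown style."""
    flags = 0
    if force:
        flags |= MNT_FORCE
    if lazy:
        flags |= MNT_DETACH
    return flags


def forced_unmount(mount_point, force=True, lazy=False):
    """Call umount2(2) on mount_point (hypothetical helper).

    Raises OSError if the kernel refuses, for example EBUSY when
    processes still hold references, or EPERM when not run as root.
    """
    # Load libc lazily so importing this sketch has no side effects.
    libc = ctypes.CDLL(None, use_errno=True)
    rc = libc.umount2(os.fsencode(mount_point), unmount_flags(force, lazy))
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), mount_point)
```

A caller would run forced_unmount on the dead mount point first and only
then remount the migrated server or cloned LUN under the same path.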