Thread: 2 messages, 2 authors, 2009-11-11

Re: Intel Updates SSDs, Supports TRIM, Faster Writes

From: Greg Freemyer <hidden>
Date: 2009-11-11 17:00:15

On Tue, Nov 10, 2009 at 5:56 PM, Martin K. Petersen
[off-list ref] wrote:
"Greg" == Greg Freemyer [off-list ref] writes:
Greg> I'm not sure where it ended up, but the big SSD / discard
Greg> discussion of a few months ago talked about 3 kinds of solutions,
Greg> and I thought the plan was to support all 3.

We don't design for the past.


Greg> 1) optimization 1 - A white-listed instant discard feature.  In
Greg>    this methodology, the filesystems would immediately send
Greg>    discard calls down to the block layer, which would send them on
Greg>    down the block stack to the physical devices with very minimal
Greg>    buffering.

There's no whitelist.  That's just how it works.

Yes, there were a few crappy devices out there.  Windows 7 issuing TRIM
commands in realtime made them instantly obsolete.  If future devices
suck with Windows 7 nobody will buy them.


Greg> 2) optimization 2 - The block layer would accept those small
Greg>    discards, but accumulate them for a short period.  (less than a
Greg>    second was my impression).  Then coalesce them into larger
Greg>    discards and send them down the block stack and eventually to
Greg>    the physical device.

SSDs are special in that they actually track map state on a
per-logical-block basis.  Other thinly provisioned devices track space
in units ranging from 16, 32, or 64 KB up to megabytes.

It's up to each block device to track the map space.  The way most
arrays work is that they'll ignore the portions of the request that are
not aligned to, and a multiple of, their internal allocation unit.

The same applies to MD.  IOW, MD would only unmap the portions of the
discard request that constitute entire stripes.  No keeping state
required.
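
To make the clipping rule above concrete, here is a minimal sketch of how a device could trim a discard request down to whole allocation units.  It is an illustration only, not the MD or array firmware code; the function name, the `alignment` parameter, and the choice of sectors as the unit are all assumptions.

```python
def clip_to_granularity(start, length, granularity, alignment=0):
    """Return (start, length) of the largest sub-range of a discard
    request that consists of whole allocation units, or None if the
    request does not cover even one full unit.

    All values are in the same unit (sectors here, as an assumption).
    `alignment` is the offset of the first unit boundary.
    """
    end = start + length
    # Round the start up to the next unit boundary (ceiling division).
    first = -(-(start - alignment) // granularity) * granularity + alignment
    # Round the end down to the previous unit boundary.
    last = (end - alignment) // granularity * granularity + alignment
    if last <= first:
        return None  # nothing left once the partial units are dropped
    return first, last - first


# A request spanning sectors 10..210 with 64-sector units keeps only
# the two whole units in the middle.
print(clip_to_granularity(10, 200, 64))
# A request smaller than one unit is dropped entirely.
print(clip_to_granularity(100, 20, 64))
```

Because the result depends only on the request itself, the device needs no state between requests, which matches the "no keeping state required" point.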

Jens just queued my patch which allows block devices to communicate
their unmap granularity and alignment to the filesystems.  This means we
can potentially use this to influence filesystem allocators.  For SCSI
arrays these values are queried and passed up the stack.  MD can choose
to manually set the granularity to its stripe size.


Greg> 3) optimization 3 - a background freespace scanner would run from
Greg> time to time that scanned a filesystem for free blocks and sent a
Greg> discard / trim command down to the device.  This is what Mark Lord
Greg> was working on.  His solution was primarily in user space and was
Greg> controlled by cron.

I think that's a fine approach for legacy devices.  But as I said I
think Windows 7 will root out all devices with poor TRIM performance
pretty quickly.

--
Martin K. Petersen      Oracle Linux Engineering

Martin,

So for a workload mostly composed of small files residing on an MD RAID
4/5/6 setup, how is this supposed to work?  (i.e., TIFFs, small Word
docs, PDFs, individual emails, etc.)

Most of the individual files will be less than one stripe wide, so
when they are deleted I gather the discard range will be less than a
stripe, and therefore MD would ignore it in the simplest of
implementations.  That is, without coalescence at some point, MD will
never forward discards to the hardware.
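
A small sketch of the concern, under stated assumptions: a hypothetical 128-sector stripe, 24 adjacent 16-sector files deleted one after another, and a device that keeps only whole stripes from each discard.  The helper names are made up for illustration; this is not how MD is implemented.

```python
def whole_stripes(start, length, stripe):
    """Keep only the part of a discard that covers entire stripes."""
    first = -(-start // stripe) * stripe           # round start up
    last = (start + length) // stripe * stripe     # round end down
    return (first, last - first) if last > first else None


def merge_ranges(ranges):
    """Coalesce overlapping or adjacent (start, length) ranges."""
    merged = []
    for start, length in sorted(ranges):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], start + length)
        else:
            merged.append([start, start + length])
    return [(s, e - s) for s, e in merged]


STRIPE = 128                                   # hypothetical stripe size
deletes = [(i * 16, 16) for i in range(24)]    # sectors 0..384, file by file

# Each small discard handled on its own: none covers a full stripe.
one_by_one = [r for r in (whole_stripes(s, n, STRIPE) for s, n in deletes) if r]

# The same discards coalesced first: three full stripes survive.
batched = [r for r in (whole_stripes(s, n, STRIPE)
                       for s, n in merge_ranges(deletes)) if r]

print(one_by_one)
print(batched)
```

With per-request clipping alone, every one of the 24 discards is dropped; once they are merged, the same free space yields a single discard covering three stripes.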

Thus I would think for that workload, the nightly full freespace scan
and discard would be the best solution.

Thanks
Greg


-- 
Greg Freemyer
Head of EDD Tape Extraction and Processing team
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
Preservation and Forensic processing of Exchange Repositories White Paper -
<http://www.norcrossgroup.com/forms/whitepapers/tng_whitepaper_fpe.html>

The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html