Re: Intel Updates SSDs, Supports TRIM, Faster Writes
From: Greg Freemyer <hidden>
Date: 2009-11-11 17:00:15
On Tue, Nov 10, 2009 at 5:56 PM, Martin K. Petersen [off-list ref] wrote:
> "Greg" == Greg Freemyer [off-list ref] writes:
>
> Greg> I'm not sure where it ended up, but the big SSD / discard
> Greg> discussion of a few months ago talked about 3 kinds of solutions,
> Greg> and I thought the plan was to support all 3.
>
> We don't design for the past.
>
> Greg> 1) optimization 1 - A white-listed instant discard feature. In
> Greg> this methodology, the filesystems would immediately send
> Greg> discard calls down to the block layer would send them on down
> Greg> the block stack to the physical devices with very minimal
> Greg> buffering.
>
> There's no whitelist. That's just how it works.
>
> Yes, there were a few crappy devices out there. Windows 7 issuing TRIM
> commands in realtime made them instantly obsolete. If future devices
> suck with Windows 7 nobody will buy them.
>
> Greg> 2) optimization 2 - The block layer would accept those small
> Greg> discards, but accumulate them for a short period. (less than a
> Greg> second was my impression). Then coalesce them into larger
> Greg> discards and send them down the block stack and eventually to
> Greg> the physical device.
>
> SSDs are special in that they actually track map state on a
> per-logical block basis. Other thinly provisioned devices track space
> in units ranging from 16-32-64KB up to megabytes.
>
> It's up to each block device to track the map space. The way most
> arrays work is that they'll ignore the portions of the request that
> are not aligned to and a multiple of their internal allocation unit.
>
> The same applies to MD. IOW, MD would only unmap the portions of the
> discard request that constitute entire stripes. No keeping state
> required.
>
> Jens just queued my patch which allows block devices to communicate
> their unmap granularity and alignment to the filesystems. This means
> we can potentially use this to influence filesystem allocators. For
> SCSI arrays these values are queried and passed up the stack. MD can
> choose to manually set the granularity to its stripe size.
>
> Greg> 3) optimization 3 - a background freespace scanner would run from
> Greg> time to time that scanned a filesystem for free blocks and send a
> Greg> discard / trim command down to the device. This is what Mark Lord
> Greg> was working on. His solution was primarily in user space and was
> Greg> controlled by cron.
>
> I think that's a fine approach for legacy devices. But as I said I
> think Windows 7 will root out all devices with poor TRIM performance
> pretty quickly.
>
> --
> Martin K. Petersen      Oracle Linux Engineering
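
As an illustration of the "ignore the unaligned portions" behaviour described in the quoted text, here is a minimal sketch in Python. It is not the kernel's or MD's code; the function name `aligned_discard`, its parameters, and the 384-sector stripe used in the example are assumptions made up for this sketch. All values are in sectors.

```python
def aligned_discard(start, length, granularity, alignment=0):
    """Return the (start, length) of the part of the discard request
    [start, start + length) that covers whole allocation units, or None
    if the request does not contain a single complete unit.

    Hypothetical helper: units are assumed to begin at `alignment` and
    repeat every `granularity` sectors.
    """
    end = start + length
    # Round the start UP to the next unit boundary (ceiling division).
    first = -(-(start - alignment) // granularity) * granularity + alignment
    # Round the end DOWN to the previous unit boundary.
    last = (end - alignment) // granularity * granularity + alignment
    if last <= first:
        return None  # no complete unit inside the request: nothing to unmap
    return first, last - first


# A request spanning more than one 384-sector stripe keeps its middle part.
print(aligned_discard(100, 1000, 384))  # (384, 384)
# A request smaller than a stripe is dropped entirely.
print(aligned_discard(400, 200, 384))   # None
```

Because the decision depends only on the request itself and the device's granularity and alignment, no per-block state has to be kept, which matches the "no keeping state required" remark above.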
Martin,

So for a workload mostly composed of small files (e.g. TIFFs, small
Word docs, PDFs, individual emails, etc.) residing on an MD RAID 4/5/6
setup, how is this supposed to work?

Most of the individual files will be less than one stripe wide, so when
they are deleted I gather the discard range will be less than a stripe,
and therefore MD would ignore it in the simplest of implementations.
I.e., without coalescence at some point, MD will never forward discards
to the hardware.

Thus I would think that for that workload, the nightly full freespace
scan and discard would be the best solution.

Thanks
Greg
--
Greg Freemyer
Head of EDD Tape Extraction and Processing team
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
Preservation and Forensic processing of Exchange Repositories White Paper -
<http://www.norcrossgroup.com/forms/whitepapers/tng_whitepaper_fpe.html>

The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
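
To make the small-file concern concrete, here is a rough Python sketch of why sub-stripe discards get dropped one by one, and how coalescing them first could let whole stripes through. This is only an illustration under assumed numbers: the helper names (`coalesce`, `whole_stripes`, `forwardable`), the 384-sector stripe, and the sample ranges are all hypothetical and do not come from MD or the block layer.

```python
def coalesce(ranges):
    """Merge overlapping or adjacent (start, length) ranges."""
    merged = []
    for start, length in sorted(ranges):
        end = start + length
        if merged and start <= merged[-1][1]:
            # Touches or overlaps the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e - s) for s, e in merged]


def whole_stripes(start, length, stripe):
    """Keep only the part of a range that covers complete stripes."""
    first = -(-start // stripe) * stripe          # round start up
    last = (start + length) // stripe * stripe    # round end down
    return (first, last - first) if last > first else None


def forwardable(ranges, stripe):
    """Ranges that would survive stripe trimming (hypothetical helper)."""
    trimmed = (whole_stripes(s, n, stripe) for s, n in ranges)
    return [r for r in trimmed if r is not None]


STRIPE = 384  # sectors, assumed for the example
# Four small deleted files, each shorter than one stripe.
small = [(0, 100), (100, 150), (250, 200), (450, 400)]

print(forwardable(small, STRIPE))            # [] - every discard is ignored
print(forwardable(coalesce(small), STRIPE))  # [(0, 768)] - two whole stripes
```

A periodic free-space scan has a similar effect to the coalescing step here, since it presents large contiguous free ranges in a single pass.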