Re: [Lsf-pc] [LSF/MM TOPIC] block level event logging for storage media management

From: Jan Kara <jack@suse.cz>
Date: 2017-01-25 09:56:28

On Tue 24-01-17 15:18:57, Oleg Drokin wrote:

On Jan 23, 2017, at 2:27 AM, Dan Williams wrote:


[ adding Oleg ]

On Sun, Jan 22, 2017 at 10:00 PM, Song Liu [off-list ref] wrote:


Hi Dan,

I think the block level event log is more like a log-only system. When an event
happens, it is not necessary to take immediate action. (I guess this is different
from the bad block list?)

I would hope the event log can track more information. Some of these individual
events may not be very interesting, for example, soft errors or latency outliers.
However, when we gather event logs for a fleet of devices, these "soft events"
may become valuable for health monitoring.

I'd be interested in this. It sounds like you're trying to fill a gap
between tracing and console log messages which I believe others have
encountered as well.

We have a somewhat similar problem in Lustre, and I guess it's not
just Lustre.  Currently there is all sorts of conditional debug code all
over the place that goes to the console, and when you enable it for
anything verbose, you quickly overflow your dmesg buffer no matter the
size.  That might be mostly OK for local "block level" stuff, but once you
become distributed, it starts to be a mess, and once you get to be super
large it gets even worse: you need to somehow coordinate data from
multiple nodes and ensure none of it is lost, and still you end up not
using a lot of it since only a few nodes turn out to be useful.  (I don't
know how NFS people manage to debug complicated issues using just this;
it can't be super easy.)

Having some sort of a buffer of a (potentially very) large size that
could store the data until it's needed, or be eagerly polled by some
daemon for storage (helpful when you expect a lot of data that definitely
won't fit in RAM), would help.

Tracepoints have the buffer and the daemon, but creating new messages is
very cumbersome, so converting every debug message into one does not look
very feasible.  Also it's convenient to have "event masks" for what one
wants logged, which I don't think you could do with tracepoints.
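
To make the buffer-plus-"event mask" idea above concrete, here is a minimal userspace sketch in Python. It is not Lustre's or the kernel's implementation; the `Event` categories, the `EventLog` class and its method names are all hypothetical, chosen only for illustration. It shows a bounded buffer that keeps only events whose category is set in a mask and that a polling daemon could drain.

```python
from collections import deque
from enum import IntFlag


class Event(IntFlag):
    """Hypothetical event categories; one bit per category."""
    IO_ERROR = 1
    SOFT_ERROR = 2
    LATENCY = 4


class EventLog:
    """Bounded event buffer that records only categories set in `mask`."""

    def __init__(self, capacity, mask):
        self.mask = mask
        # deque(maxlen=...) silently drops the oldest entry when full.
        self.buffer = deque(maxlen=capacity)

    def log(self, kind, message):
        """Store the event if its category is enabled; report whether it was kept."""
        if kind & self.mask:
            self.buffer.append((kind, message))
            return True
        return False

    def drain(self):
        """Hand all buffered events to the caller (e.g. a polling daemon) and empty the buffer."""
        items = list(self.buffer)
        self.buffer.clear()
        return items
```

A log created as `EventLog(1024, Event.IO_ERROR | Event.LATENCY)` would ignore `Event.SOFT_ERROR` entries until the mask is widened, which is roughly the per-category control described above.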

So creating tracepoints IMO isn't that cumbersome. I agree that converting
hundreds or thousands of debug printks into tracepoints is a pain in the
ass, but still it is doable. WRT filtering, you can enable each tracepoint
individually. Granted, that is not exactly the 'event mask' feature you ask
about, but it can be easily scripted in userspace if you give some
structure to tracepoint names. Finally, tracepoints provide fine-grained
control you never get with printk - e.g. with trace filters you can make a
tracepoint trigger only if a specific inode is involved, which greatly
reduces the amount of output.
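
As a rough illustration of the userspace scripting mentioned above, the following Python sketch enables every tracepoint whose name matches a glob pattern and optionally installs a filter on each. It relies on the per-event `enable` and `filter` files that tracefs exposes under `events/<system>/<event>/`. The mount point `/sys/kernel/tracing` is an assumption (older setups use `/sys/kernel/debug/tracing`), and the helper names `matching_events` and `enable_events` are made up for this example.

```python
import fnmatch
import os

# Assumed tracefs mount point; adjust if tracefs is mounted elsewhere.
TRACEFS_EVENTS = "/sys/kernel/tracing/events"


def matching_events(pattern, events_root=TRACEFS_EVENTS):
    """Return sorted 'system/event' names that match a glob pattern."""
    found = []
    for system in sorted(os.listdir(events_root)):
        system_dir = os.path.join(events_root, system)
        if not os.path.isdir(system_dir):
            continue  # skip plain files in the events directory
        for event in sorted(os.listdir(system_dir)):
            name = f"{system}/{event}"
            has_enable = os.path.isfile(os.path.join(system_dir, event, "enable"))
            if has_enable and fnmatch.fnmatchcase(name, pattern):
                found.append(name)
    return found


def enable_events(pattern, event_filter=None, events_root=TRACEFS_EVENTS):
    """Enable all matching tracepoints, setting a filter first if one is given."""
    enabled = []
    for name in matching_events(pattern, events_root):
        event_dir = os.path.join(events_root, name)
        if event_filter is not None:
            # The filter expression must use fields the event actually has.
            with open(os.path.join(event_dir, "filter"), "w") as f:
                f.write(event_filter)
        with open(os.path.join(event_dir, "enable"), "w") as f:
            f.write("1")
        enabled.append(name)
    return enabled
```

Run as root, something like `enable_events("ext4/*write*", "ino == 1234")` would act as a crude "event mask" over one family of tracepoints, assuming each matched event exposes an `ino` field; the kernel rejects a filter that names a field the event lacks.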

								Honza
-- 
Jan Kara [off-list ref]
SUSE Labs, CR
