Thread (3 messages), 2 authors, 2011-10-25

Re: mdadm r/w operations without TEMP_FAILURE_RETRY()

From: NeilBrown <hidden>
Date: 2011-10-25 21:28:43

On Tue, 25 Oct 2011 19:29:52 +0200 Michal Soltys [off-list ref] wrote:
On 11-10-18 11:30, NeilBrown wrote:
On Tue, 18 Oct 2011 10:16:41 +0100 "Orlowski, Lukasz"
[off-list ref]  wrote:
Hi,

I was going through the mdadm code and realized that r/w
operations are invoked without the TEMP_FAILURE_RETRY() macro, which
protects against unexpected termination of the operation in case
SIGINT is raised. As far as I know, it's POSIX best practice to call
r/w operations within that macro, lest some sporadic unexpected
behavior occur.

Any particular reason for not using it?
I've never heard of TEMP_FAILURE_RETRY.

And having looked into it I would certainly try to avoid using it.
As this grabbed my attention ..

that macro is just a shortcut for something along the lines of:

do {
	ret = read/write/etc.( ... );
} while (ret < 0 && errno == EINTR);

which has always been the proper way to handle such situations
(recalling Stevens' books, the glibc reference manual, or any other
solid source). Why avoid using it? It costs nothing, and guarantees we
won't run into some corner case.
It is ugly and often unnecessary.
Ugliness without virtue is a real cost.

If the SA_RESTART flag is set with sigaction() then it should be
totally unnecessary.
signal(7) has a pretty large list of when it can or cannot happen, and
when it will always happen regardless of SA_RESTART. And it would be
quite a different list when other unix vendors are considered (which
of course doesn't apply to mdadm, it being Linux-specific).
There are also stop signals that cannot be ignored (and in some cases
they will end with EINTR as well).

And it's not only SIGINT (as the original mail could suggest); any
signal that is not ignored can cause it.
Yes, SA_RESTART isn't really a panacea.  SIGSTOP cannot be ignored or blocked
and can have the same effect.
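
For reference, setting SA_RESTART with sigaction() looks roughly like the
following. This is only a minimal sketch, not mdadm code: the handler
note_signal(), the flag got_signal and the helper
install_restarting_handler() are hypothetical names made up for
illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical flag: set by the handler, inspected by the main program. */
static volatile sig_atomic_t got_signal;

/* Hypothetical handler: only records that a signal arrived. */
static void note_signal(int signo)
{
	(void)signo;
	got_signal = 1;
}

/*
 * Install note_signal() for signo with SA_RESTART set, so that system
 * calls which support restarting are resumed after the handler returns
 * instead of failing with EINTR.
 * Returns 0 on success, -1 on error (errno set by sigaction()).
 */
static int install_restarting_handler(int signo)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = note_signal;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_RESTART;
	return sigaction(signo, &sa, NULL);
}
```

As noted above, this does not cover every case: some calls return EINTR
regardless of the flag, so the error check at the call site is still
needed where it matters.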

However this only affects system calls that can block (in an interruptible
'S' state, not a non-interruptible 'D' state), and then only if they cannot
complete without returning a valid partial result.

There are very few places where mdadm makes such system calls.

Some of the ioctl calls on md devices technically behave like this, but are
very unlikely to block in practise and if they do then I probably want them
to fail.

The 'select' calls in msg.c probably should check for EINTR and try again,
but in that case there is already an error check and a loop and I would just
add
  if (rv < 0 && errno == EINTR)
	continue;

rather than add the macro.
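
A select() loop with that check folded in could look something like the
sketch below. It is not the code from msg.c: wait_readable() and
read_when_ready() are hypothetical helpers, and for simplicity the full
timeout is re-armed after every interruption rather than tracking the
time already spent.

```c
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait until fd is readable.
 * Returns 1 if readable, 0 on timeout, -1 on any error other than EINTR.
 */
static int wait_readable(int fd, int timeout_sec)
{
	for (;;) {
		fd_set rfds;
		struct timeval tv;
		int rv;

		/* select() may modify both arguments, so rebuild them each pass. */
		FD_ZERO(&rfds);
		FD_SET(fd, &rfds);
		tv.tv_sec = timeout_sec;
		tv.tv_usec = 0;

		rv = select(fd + 1, &rfds, NULL, NULL, &tv);
		if (rv < 0 && errno == EINTR)
			continue;	/* interrupted by a signal: try again */
		if (rv < 0)
			return -1;	/* a real error */
		return rv > 0;
	}
}

/* Hypothetical caller: wait for data, then issue a single read(). */
static ssize_t read_when_ready(int fd, void *buf, size_t len, int timeout_sec)
{
	if (wait_readable(fd, timeout_sec) <= 0)
		return -1;	/* error or timeout */
	return read(fd, buf, len);
}
```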

So it is certainly worth auditing the code for places where EINTR might be
returned (and being careful in the first place), but blindly applying
TEMP_FAILURE_RETRY() is (in my opinion) wrong.
In a well written program I would expect any place which might return EINTR
to be a place which could also return other errors that suggest a retry is
needed, and the EINTR checking should just be included with the other
checking.
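
To illustrate what including the EINTR check with the other checking
might look like, here is a sketch of a write loop that already has to
cope with short writes. write_all() is a hypothetical helper, not an
mdadm function, and it assumes a blocking file descriptor.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: write exactly len bytes from buf to fd.
 * Returns 0 on success, -1 on error (errno set by write()).
 */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted before any data moved */
			return -1;		/* any other error is reported */
		}
		/* A short write is a valid partial result: advance and loop. */
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```

The loop is needed for partial writes anyway, so the EINTR case costs
one extra branch rather than a wrapper macro around every call.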

NeilBrown
