Thread: 6 messages, 2 authors, 2016-08-25

READ during state recovery uses zero stateid

From: Chuck Lever <chuck.lever@oracle.com>
Date: 2016-08-24 18:11:13

Hi-

I have a wire capture that shows this race while a simple I/O workload is
running:

0. The client reconnects after a network partition
1. The client sends a couple of READ requests
2. The client independently discovers its lease has expired
3. The client establishes a fresh lease
4. The client destroys open, lock, and delegation stateids for the file
that was open under the previous lease
5. The client issues a new OPEN to recover state for that file
6. The server replies to the READs from step 1 with NFS4ERR_EXPIRED
7. The client turns the READs around immediately using the current open
stateid for that file, which is the zero stateid
8. The server replies NFS4_OK to the OPEN from step 5

If I understand the code correctly, if the server happened to send those
READ replies after its OPEN reply (rather than before), the client would
have used the recovered open stateid instead of the zero stateid when
resending the READ requests.

Would it be better if the client recognized there is state recovery in
progress, and then waited for recovery to complete, before retrying the
READs?
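
To make the question concrete, here is a minimal sketch in C of the two retry policies. It is a simplified model, not the Linux NFS client's code: struct open_file, the recovery_in_progress flag, and both select_* functions are hypothetical names invented for illustration. The only protocol facts it relies on are that an NFSv4 stateid is 16 bytes (a 4-byte seqid plus 12 opaque bytes) and that the all-zeros value is a special stateid.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified model of per-file client state. */
struct stateid {
    unsigned char data[16];   /* 4-byte seqid + 12-byte "other" */
};

struct open_file {
    bool recovery_in_progress;   /* assumed set at step 4, cleared at step 8 */
    bool have_open_stateid;      /* false between steps 4 and 8 */
    struct stateid open_stateid;
};

enum retry_action { RETRY_NOW, WAIT_FOR_RECOVERY };

/* The special all-zeros stateid. */
static const struct stateid zero_stateid = { { 0 } };

static bool stateid_is_zero(const struct stateid *s)
{
    return memcmp(s, &zero_stateid, sizeof(*s)) == 0;
}

/* Observed behavior (steps 6-7): retry at once with whatever the
 * file currently has, which is the zero stateid during recovery. */
static struct stateid select_stateid_observed(const struct open_file *f)
{
    assert(f != NULL);
    return f->have_open_stateid ? f->open_stateid : zero_stateid;
}

/* Proposed behavior: if recovery is under way, tell the caller to
 * wait instead of handing out the zero stateid. */
static enum retry_action
select_stateid_proposed(const struct open_file *f, struct stateid *out)
{
    assert(f != NULL && out != NULL);
    if (f->recovery_in_progress)
        return WAIT_FOR_RECOVERY;
    *out = select_stateid_observed(f);
    return RETRY_NOW;
}
```

Under this model, a READ that fails with NFS4ERR_EXPIRED between steps 4 and 8 would get WAIT_FOR_RECOVERY from select_stateid_proposed() and be resent only after the OPEN reply installs the recovered stateid.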


--
Chuck Lever

