I found another problem here. To explain it, I need to explain how the communication happens now.
1. In the BlockReaderFactory, the DFSClient initiates the file descriptor request by sending:
[2-byte] 28 [DATA_TRANSFER_VERSION]
[1-byte] 87 [REQUEST_SHORT_CIRCUIT_FDS]
[var] OpRequestShortCircuitAccessProto(blk, blockToken, slotId, tracing stuff)
2. On the DataNode, in DataXceiver, we read the OpRequestShortCircuitAccessProto that the client sent. We call DataNode#requestShortCircuitFdsForRead to load the file descriptors. If that succeeds, we send back a BlockOpResponseProto with status SUCCESS.
3. Back in the DFSClient, we read the BlockOpResponseProto.
4. If it contains a SUCCESS response, the DFSClient calls sock.recvFileInputStreams. This reads a single byte and also passes the new file descriptor to us (the DFSClient).
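To make the fixed prefix from step 1 concrete, here is a minimal sketch of how those three bytes could be encoded. The class and method names are made up for illustration, and it assumes the 2-byte version is written big-endian, which is what java.io.DataOutputStream does. The variable-length protobuf that follows the prefix is left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of HDFS: encodes only the fixed prefix from step 1.
final class ShortCircuitRequestHeader {
    static final int DATA_TRANSFER_VERSION = 28;      // value quoted in step 1
    static final int REQUEST_SHORT_CIRCUIT_FDS = 87;  // value quoted in step 1

    /** Returns the 3-byte prefix that precedes OpRequestShortCircuitAccessProto. */
    static byte[] encode() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeShort(DATA_TRANSFER_VERSION);     // 2 bytes, big-endian (assumed)
            out.writeByte(REQUEST_SHORT_CIRCUIT_FDS);  // 1 byte opcode
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```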
The problem is that if the DFSClient closes the socket after step #3, but before step #4, the DataNode thinks that the transfer was successful and never unregisters the slot. This is what led to the unit test failures earlier. It seems that there is a buffer in the UNIX domain socket that we are writing to, which lets the DataNode's write succeed immediately even before the DFSClient actually reads the data.
To fix this, we can add a step #5: the DFSClient writes a byte for the DataNode to receive. And step #6: the DataNode reads it. That way, if a socket close or other error happens before step #5, the DataNode knows that the FD was never received and can unregister the slot.
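The DataNode side of step #6 could look roughly like the sketch below. This is only an illustration of the idea, not the actual patch: the class and method names are hypothetical, and a plain InputStream stands in for the domain socket's input stream. The point is that both end-of-stream (the client closed the socket) and an I/O error are treated as "the FD was not received".

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch of step #6; not the real DataXceiver code.
final class ReceiptCheckSketch {
    /**
     * Waits for the single acknowledgement byte from the client (step #5).
     * Returns true if the byte arrived. Returns false on EOF or on an I/O
     * error, in which case the caller should unregister the slot.
     */
    static boolean clientConfirmedReceipt(InputStream fromClient) {
        try {
            // read() returns -1 when the peer closed the socket before acking.
            return fromClient.read() != -1;
        } catch (IOException e) {
            // Any socket error before the ack also means "not received".
            return false;
        }
    }
}
```

The value of the byte does not matter in this sketch; only its arrival does.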
This can be done compatibly by adding a new boolean to the protobuf which indicates to the DataNode that the client supports "receipt verification." New clients will set this bit and old ones will not. Neither the DataNode nor the DFSClient will attempt to do receipt verification unless the other party supports it.
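The compatibility rule can be sketched as a small decision function on the DataNode side. The names here are hypothetical, and a nullable Boolean models the optional protobuf field: null means the request came from an old sender that does not know about the field.

```java
// Hypothetical sketch of the negotiation rule; not the real HDFS code.
final class ReceiptVerificationNegotiation {
    /**
     * @param flagFromRequest value of the new optional boolean in the request,
     *        or null if the field was absent (old sender)
     * @param dataNodeSupportsVerification whether this DataNode implements
     *        the extra read in step #6
     * @return true only when both sides support receipt verification
     */
    static boolean useReceiptVerification(Boolean flagFromRequest,
                                          boolean dataNodeSupportsVerification) {
        // An absent field is treated the same as an explicit false.
        boolean clientSupports = flagFromRequest != null && flagFromRequest;
        return clientSupports && dataNodeSupportsVerification;
    }
}
```

With this rule, every mix of old and new peers falls back to the existing four-step exchange, and only a new/new pair adds steps #5 and #6.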