> What I mean is that if you are not getting RST/FIN or any other indication for your closed communication channel, you only left to the mechanism of timeouts to recognize a partitioned/dead/slow worker client.
Timeouts were a red herring in my comment. My problem wasn't with the mere existence of timeouts in corner cases; it was with the assumption that the worker keeps merrily working on despite the timeouts. That's what I don't understand the justification for. If the worker is dead, it's a non-issue, and the lease can be broken. If the worker is alive, its host can discover (via RST, heartbeats, or other timeouts) that the storage system is unreachable, and thus prevent the program from continuing execution. At that point the storage service can still break the lease (via a timeout), but now that timeout would actually come with a timing-based guarantee that the program is no longer executing.
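To make the worker-side half of that concrete, here is a rough sketch in Python of what I have in mind. Everything in it is made up for illustration (`LeaseGuard`, `lease_ttl`, `safety_margin`, `renewed`, `may_continue`); it is not any particular storage system's API. It also assumes the worker's and the service's clocks run at roughly the same rate, and that the worker checks the guard right before each side effect.

```python
import time


class LeaseGuard:
    """Worker-side view of a lease (hypothetical, for illustration only)."""

    def __init__(self, lease_ttl, safety_margin, clock=time.monotonic):
        if not 0 <= safety_margin < lease_ttl:
            raise ValueError("safety_margin must be in [0, lease_ttl)")
        self._ttl = lease_ttl
        self._margin = safety_margin
        self._clock = clock
        self._deadline = None  # no lease held until the first renewal

    def renewed(self, sent_at):
        """Record a successful renewal.

        sent_at is the local clock reading taken *before* the renewal
        request was sent. The service cannot have started the lease any
        earlier than that, so this deadline is conservative.
        """
        self._deadline = sent_at + self._ttl - self._margin

    def may_continue(self):
        """True only while the worker is still safely inside its lease."""
        return self._deadline is not None and self._clock() < self._deadline


# Usage with a fake clock, so the timing is easy to follow.
now = [0.0]
guard = LeaseGuard(lease_ttl=10.0, safety_margin=2.0, clock=lambda: now[0])
guard.renewed(sent_at=now[0])   # renewal request sent at t=0 succeeded

now[0] = 5.0
print(guard.may_continue())     # still inside the lease

now[0] = 8.0                    # renewals have been failing since t=0
print(guard.may_continue())     # worker must stop before the service times out
```

The point is that the worker stops itself at `sent_at + lease_ttl - safety_margin`, which is strictly before the service's own timeout can fire, so by the time the service breaks the lease the program has already stopped, as long as the assumptions above hold.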