We're having issues with timeouts occurring in our client: for some reason, a timeout of 63000 milliseconds is triggered while writing HDFS data. Since we currently have a single-server setup, there is no other datanode to fall back to, so our client terminates with an "All datanodes are bad" IOException.
We're running all services, including the client, on our single server, so it cannot be a network error. The load on the client was extremely low during this period: only a few kilobytes a minute were being written around the time the error occurred.
After browsing a bit online, I found that a lot of people suggest setting "dfs.datanode.socket.write.timeout" to 0 as a solution for this problem. Given the low load on our system during this period, however, I do feel this is a real error and a timeout that should not be occurring. I have attached three logs: from the namenode, the datanode and the client.
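For reference, the commonly suggested workaround would look roughly like the fragment below. This is only a sketch to make clear which setting is meant: it assumes the property is placed in hdfs-site.xml and that a value of 0 disables the write timeout, as the online reports describe.

```xml
<!-- hdfs-site.xml (assumed location for this property) -->
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <!-- 0 is the value suggested online; reportedly disables the write timeout -->
  <value>0</value>
</property>
```

Disabling the timeout would presumably only hide the symptom rather than explain why the write stalls under such a low load.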
This may be related to http://issues.apache.org/jira/browse/HDFS-693
Any pointers on how I can help resolve this issue would be greatly appreciated.