I'm using the pyarrow HDFS client in a long-running (forever) app that opens a connection to HDFS (via libhdfs) whenever an external request comes in and destroys the connection as soon as the request is handled. This happens many times, on separate threads, and everything works great.
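For context, the per-request lifecycle looks roughly like this (a minimal sketch, not my exact code; the path and host are placeholders, and `pyarrow.hdfs.connect` is the legacy client API I'm calling):

```python
def handle_request(path):
    # Sketch of the per-request pattern: open a fresh libhdfs-backed
    # connection, do the work, and tear it down before returning.
    import pyarrow.hdfs  # imported lazily so the sketch stands alone

    fs = pyarrow.hdfs.connect(host="default")  # goes through libhdfs.so
    try:
        with fs.open(path, "rb") as f:
            return f.read()
    finally:
        # Drop the last reference; the underlying libhdfs connection is
        # released when the filesystem object is destroyed.
        del fs
```

Each request runs this on its own thread; no connection outlives a single request.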
The problem is that after the app has been idle for a while (possibly hours), with no HDFS connections made during that time, the next connection attempt hangs. No exceptions are thrown. As soon as I restart the Python app, HDFS connections work just fine again.
I'm using the precompiled libhdfs.so directly from the hadoop-3.0.3 distribution. Do I typically need to recompile libhdfs.so for my OS, or is the one out of the box typically fine?
I've checked with the Arrow community first; they recommended I ask the Hadoop community, since all the pyarrow client does is pass the commands through to libhdfs.
Any suggestions on debugging this hanging issue would be appreciated.
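One thing I can do while the process is hung is dump every thread's Python stack to see where the call into libhdfs is blocked. A stdlib-only sketch (the choice of SIGUSR1 is arbitrary; any otherwise-unused signal works):

```python
import faulthandler
import os
import signal
import sys

# Register a C-level signal handler at app startup. Later, while the
# app appears hung inside a native libhdfs call, running
# `kill -USR1 <pid>` from another shell dumps every thread's Python
# stack to stderr without stopping the process.
faulthandler.register(signal.SIGUSR1, file=sys.stderr, all_threads=True)

# Demonstration: send the signal to ourselves; the dump appears on stderr.
os.kill(os.getpid(), signal.SIGUSR1)
```

The Python-level dump only shows the frame that entered libhdfs; pairing it with a native stack dump of the same PID (e.g. gdb's `thread apply all bt`) would show where inside libhdfs/JNI the thread is actually stuck.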