After speaking with wesmckinn, I think it would be really helpful to have Kerberos support in our HDFS logic. This should be straightforward; I would just need to switch the shim over to hdfsBuilderConnect().
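To make the proposal concrete, here is a rough sketch of what a builder-based connect could look like when the libhdfs symbols are resolved at runtime with dlopen/dlsym. It is not the actual shim code: connect_with_kerberos, hdfs_fs_handle and the function-pointer typedefs are hypothetical names of mine, and I am assuming the builder functions keep the signatures declared in libhdfs's hdfs.h (with the port type being a 16-bit unsigned integer).

```c
#include <dlfcn.h>
#include <stdint.h>
#include <stdio.h>

/* Opaque builder type; the real declaration lives in libhdfs's hdfs.h. */
struct hdfsBuilder;

/* Hypothetical alias standing in for the hdfsFS handle type. */
typedef void *hdfs_fs_handle;

/* Function-pointer types assumed to match the hdfs.h builder API. */
typedef struct hdfsBuilder *(*new_builder_fn)(void);
typedef void (*set_string_fn)(struct hdfsBuilder *, const char *);
typedef void (*set_port_fn)(struct hdfsBuilder *, uint16_t);
typedef hdfs_fs_handle (*builder_connect_fn)(struct hdfsBuilder *);

/*
 * Hypothetical helper: loads libhdfs from lib_path and connects through the
 * builder API. user and ticket_cache may be NULL to keep libhdfs defaults.
 * Returns NULL if the library or a symbol is missing, or if connecting fails.
 */
hdfs_fs_handle connect_with_kerberos(const char *lib_path, const char *host,
                                     uint16_t port, const char *user,
                                     const char *ticket_cache)
{
    void *lib = dlopen(lib_path, RTLD_NOW | RTLD_LOCAL);
    if (lib == NULL) {
        fprintf(stderr, "could not load libhdfs: %s\n", dlerror());
        return NULL;
    }

    new_builder_fn new_builder = (new_builder_fn)dlsym(lib, "hdfsNewBuilder");
    set_string_fn set_namenode =
        (set_string_fn)dlsym(lib, "hdfsBuilderSetNameNode");
    set_port_fn set_namenode_port =
        (set_port_fn)dlsym(lib, "hdfsBuilderSetNameNodePort");
    set_string_fn set_user = (set_string_fn)dlsym(lib, "hdfsBuilderSetUserName");
    set_string_fn set_ticket_cache =
        (set_string_fn)dlsym(lib, "hdfsBuilderSetKerbTicketCachePath");
    builder_connect_fn builder_connect =
        (builder_connect_fn)dlsym(lib, "hdfsBuilderConnect");

    if (!new_builder || !set_namenode || !set_namenode_port || !set_user ||
        !set_ticket_cache || !builder_connect) {
        fprintf(stderr, "libhdfs is missing a builder symbol\n");
        dlclose(lib);
        return NULL;
    }

    struct hdfsBuilder *builder = new_builder();
    if (builder == NULL) {
        dlclose(lib);
        return NULL;
    }

    set_namenode(builder, host);
    set_namenode_port(builder, port);
    if (user != NULL) {
        set_user(builder, user);
    }
    if (ticket_cache != NULL) {
        set_ticket_cache(builder, ticket_cache);
    }

    /* hdfsBuilderConnect frees the builder whether or not it succeeds.
     * The library handle is kept open on purpose, since the returned
     * filesystem handle is only usable while libhdfs stays loaded. */
    return builder_connect(builder);
}
```

A caller would pass the path to libhdfs plus the namenode host and port, and optionally a principal and ticket cache path obtained from kinit. On older glibc versions this needs to be linked with -ldl.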
On a side note, is there a reason we aren't using Pivotal's libhdfs3? It uses RPCs natively rather than JNI.
Dask already has Python wrappers for libhdfs3.