I've been over on elephantbird [0], trying out a few things with the aim of using the elephantbird I/O formats to obtain data locality in gora-lucene.
dvryaboy suggested that it would be nice to pursue the option of persisting Parquet files in HDFS... something which renato2099 also mentioned to me a number of years ago!
Maybe we can extend support for this in 0.6.