I am currently working on a throughput-oriented search engine that runs entirely in Apache Spark.
As part of this, I need a Directory implementation that can operate on HDFS directly. This got me thinking: could I reuse the one that so much work went into for Solr's Hadoop integration?
To try that, I migrated the HDFS and blockcache directories out into a lucene-hadoop module.
Having done this work, I am not sure it is actually a good change. It feels a bit messy, and I don't like how the Metrics class gets extended and abused.