If a user needs to look up a small subset of records quickly, they can use Apache HBase; if they need fast retrieval of larger sets of data, or fast joins and aggregations, they can use Apache Impala. It seems to me that Hive indexes do not serve much of a role in the future of Hive.
Even without moving workloads to other products, columnar file formats such as ORC and Parquet, with their built-in column statistics, achieve goals similar to those of Hive indexes.
Please consider dropping indexes from the Apache Hive project.
|Investigate usage of IndexPredicateAnalyzer in StorageHandlers||Open||Unassigned|
|Remove index support from metastore||Closed|