Description
We are proposing a new execution model for Hive that combines the existing process-based tasks with long-lived daemons running on worker nodes. The daemons can take care of efficient I/O, caching, and query fragment execution, while heavy lifting such as most joins and ordering can be handled by tasks.
The proposed model is not a two-system solution for small and large queries, nor is it a separate execution engine like MR or Tez. It can be used by any Hive execution engine if support is added; in the future, even external products (e.g. Pig) could use it.
A document describing the proposed high-level design will be attached shortly.