As a follow-up to HIVE-543, we should have a simple option (enabled by default) to let Hive run in local mode when possible.
Two levels of options are desirable:
1. hive.exec.mode.local.auto=true/false // control whether local mode is automatically chosen
2. Options to control different heuristics, some naive examples:
hive.exec.mode.local.auto.input.size.max=1G // don't choose local mode if input data > 1G
hive.exec.mode.local.auto.script.enable=true/false // control whether local mode may be chosen for queries with user scripts
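
To make the intended interaction of these options concrete, here is a rough sketch of the decision logic in Python. It is only an illustration of the proposal, not Hive code: the function names, the dict-based configuration, the binary size suffixes, and the defaults for the size and script options are all assumptions (the proposal only states that automatic local mode should be enabled by default).

```python
# Hypothetical sketch of the proposed heuristics; not part of Hive.

def parse_size(text):
    """Parse a size such as '1G' or '512M' into bytes (binary multiples assumed)."""
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}
    text = text.strip().upper()
    if text and text[-1] in units:
        return int(float(text[:-1]) * units[text[-1]])
    return int(text)


def is_true(conf, key, default):
    """Read a true/false option from a plain dict of string settings."""
    return conf.get(key, default).strip().lower() == "true"


def should_run_local(conf, input_bytes, uses_script):
    """Decide, for a whole query, whether local mode should be chosen."""
    # Level 1: the master switch (enabled by default, per the proposal).
    if not is_true(conf, "hive.exec.mode.local.auto", "true"):
        return False

    # Level 2a: size heuristic. The 1G default is taken from the example above
    # and is an assumption, not a decided value.
    max_bytes = parse_size(conf.get("hive.exec.mode.local.auto.input.size.max", "1G"))
    if input_bytes > max_bytes:
        return False

    # Level 2b: script heuristic. Defaulting to "true" is an assumption.
    if uses_script and not is_true(conf, "hive.exec.mode.local.auto.script.enable", "true"):
        return False

    return True
```

For example, with an empty configuration a 10 MB query without scripts would qualify, while a 2 GB query would not.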
This can be implemented as a pre/post execution hook. It makes sense to provide it as a standard hook in the Hive codebase, since it is likely to improve response time for many users (especially for test queries).
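
The pre/post hook idea could look roughly like the following Python sketch. Everything here is hypothetical: the class, its method names, and the dict-based configuration are stand-ins rather than Hive's hook interfaces, and it assumes that local mode is selected by setting mapred.job.tracker to "local". The pre-execution step switches the setting when the decision function says so, and the post-execution step restores whatever was there before.

```python
# Hypothetical sketch of a pre/post execution hook; not Hive's hook API.

JOB_TRACKER_KEY = "mapred.job.tracker"  # assumed to select local mode when "local"


class LocalModeHook:
    def __init__(self, decide):
        # decide(conf, query_info) -> bool is supplied by the caller and
        # encapsulates whatever heuristics are configured.
        self.decide = decide
        self._switched = False
        self._saved = None

    def pre_execute(self, conf, query_info):
        """Switch the query to local mode if the heuristics allow it."""
        if self.decide(conf, query_info):
            self._saved = conf.get(JOB_TRACKER_KEY)
            conf[JOB_TRACKER_KEY] = "local"
            self._switched = True

    def post_execute(self, conf):
        """Restore the original setting so later queries are unaffected."""
        if not self._switched:
            return
        if self._saved is None:
            conf.pop(JOB_TRACKER_KEY, None)
        else:
            conf[JOB_TRACKER_KEY] = self._saved
        self._switched = False
        self._saved = None
```

Restoring the setting in the post hook matters because the decision is made per query: a later, larger query in the same session should go back to the cluster.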
The initial proposal is to make this choice at the query level and not per Hive task (i.e. per Hadoop job). Choosing per job would require more changes to compilation, so that it does not pre-commit to HDFS or local scratch directories at compile time.