Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
Scenario: we want to hard-limit (within the constraints imposed by Java) the memory used by a particular Hive task dedicated to ORC writing, so that other tasks are protected from misbehaving queries. This is similar to how we limit the memory used for hash joins: when the hash table goes over the limit, the task fails.
However, we currently cannot hard-limit memory even for a single writer, much less the combined memory of several writers that are writing at the same time.
I wonder if it's possible to add two features to MemoryManager:
1) Grouping writers. A tag can be supplied externally (e.g. when creating the writer).
2) Hard-limiting the memory by tag: if a group exceeds its memory allowance, all the writers in that group should be made to fail on their next operation, via the callback.
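
To make the proposal more concrete, here is a rough sketch of what tag-based grouping with a hard limit could look like. It is standalone Java and deliberately does not use the real ORC MemoryManager interface; TaggedMemoryManager, LimitCallback, SketchWriter, and the string writer IDs are all made-up names for illustration, and the real change would need to fit the existing callback mechanism.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these types exist in ORC or Hive.
final class TaggedMemoryManager {

  /** Invoked by the manager on every writer of a group that exceeded its limit. */
  interface LimitCallback {
    void onLimitExceeded(String tag, long usedBytes, long limitBytes);
  }

  /** Per-tag bookkeeping: the hard limit plus each writer's last reported usage. */
  private static final class Group {
    final long limitBytes;
    final Map<String, Long> usageByWriter = new HashMap<>();
    final Map<String, LimitCallback> callbacks = new HashMap<>();

    Group(long limitBytes) {
      this.limitBytes = limitBytes;
    }

    long total() {
      long sum = 0;
      for (long bytes : usageByWriter.values()) {
        sum += bytes;
      }
      return sum;
    }
  }

  private final Map<String, Group> groups = new HashMap<>();

  /** Feature 2: declare the memory allowance for a tag. */
  synchronized void setLimit(String tag, long limitBytes) {
    groups.put(tag, new Group(limitBytes));
  }

  /** Feature 1: the tag is supplied externally when the writer is registered. */
  synchronized void addWriter(String tag, String writerId, LimitCallback callback) {
    Group group = groups.get(tag);
    if (group == null) {
      throw new IllegalArgumentException("No limit set for tag " + tag);
    }
    group.usageByWriter.put(writerId, 0L);
    group.callbacks.put(writerId, callback);
  }

  synchronized void removeWriter(String tag, String writerId) {
    Group group = groups.get(tag);
    if (group != null) {
      group.usageByWriter.remove(writerId);
      group.callbacks.remove(writerId);
    }
  }

  /** Records a writer's current usage and notifies the whole group if it is over the limit. */
  synchronized void reportUsage(String tag, String writerId, long bytes) {
    Group group = groups.get(tag);
    if (group == null || !group.usageByWriter.containsKey(writerId)) {
      return;
    }
    group.usageByWriter.put(writerId, bytes);
    long total = group.total();
    if (total > group.limitBytes) {
      for (LimitCallback callback : group.callbacks.values()) {
        callback.onLimitExceeded(tag, total, group.limitBytes);
      }
    }
  }
}

/** Toy writer: buffers bytes and fails on the next write once its group is over the limit. */
final class SketchWriter implements TaggedMemoryManager.LimitCallback {
  private final TaggedMemoryManager manager;
  private final String tag;
  private final String id;
  private long bufferedBytes;
  private volatile String failure;

  SketchWriter(TaggedMemoryManager manager, String tag, String id) {
    this.manager = manager;
    this.tag = tag;
    this.id = id;
    manager.addWriter(tag, id, this);
  }

  @Override
  public void onLimitExceeded(String tag, long usedBytes, long limitBytes) {
    failure = "Writers tagged '" + tag + "' use " + usedBytes + " bytes, limit is " + limitBytes;
  }

  void write(long rowBytes) throws IOException {
    if (failure != null) {
      throw new IOException(failure);
    }
    bufferedBytes += rowBytes;
    manager.reportUsage(tag, id, bufferedBytes);
  }
}
```

In this sketch the write that pushes the group over its allowance still succeeds; the callback only marks every writer in the group, so each of them fails on its next operation, while writers under other tags are unaffected. Usage is tracked as whatever the writer reports, which in practice would be an estimate, given that exact accounting is not available in Java.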