Description
In Hive 0.11, when ORC's OutStreams were flushed, they dropped all of their buffers. In the patch for HIVE-4324, we inadvertently changed that behavior so that one of the buffers is retained after a flush. For queries with many open writers, which are already under significant memory pressure, this can noticeably increase memory usage.
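To illustrate the difference, here is a minimal sketch that compares the memory still held after a flush under the two behaviors. SketchStream, BUFFER_SIZE, and retainedAfterFlush are hypothetical names invented for this illustration; this is not Hive's actual OutStream code, and the 256 KB buffer size is an assumption.

```java
import java.nio.ByteBuffer;

// Hypothetical, simplified stand-in for an ORC output stream.
// It is NOT Hive's OutStream; it only models "release vs. retain on flush".
public class BufferRetentionSketch {
    // Assumed per-stream buffer size, chosen for illustration only.
    static final int BUFFER_SIZE = 256 * 1024;

    static class SketchStream {
        private final boolean releaseOnFlush;
        private ByteBuffer current;

        SketchStream(boolean releaseOnFlush) {
            this.releaseOnFlush = releaseOnFlush;
        }

        void write(byte b) {
            if (current == null) {
                current = ByteBuffer.allocate(BUFFER_SIZE); // lazily allocate
            }
            if (!current.hasRemaining()) {
                current.clear(); // pretend the full buffer was handed downstream
            }
            current.put(b);
        }

        void flush() {
            if (current == null) {
                return;
            }
            if (releaseOnFlush) {
                current = null;  // drop the buffer, as described for Hive 0.11
            } else {
                current.clear(); // keep the buffer around for reuse
            }
        }

        long retainedBytes() {
            return current == null ? 0 : current.capacity();
        }
    }

    // Sums the bytes each stream would still hold after one write and one flush.
    static long retainedAfterFlush(int streams, boolean releaseOnFlush) {
        long total = 0;
        for (int i = 0; i < streams; i++) {
            SketchStream s = new SketchStream(releaseOnFlush);
            s.write((byte) 1);
            s.flush();
            total += s.retainedBytes();
        }
        return total;
    }

    public static void main(String[] args) {
        int streams = 1000;
        System.out.println("release on flush: " + retainedAfterFlush(streams, true) + " bytes");
        System.out.println("retain on flush:  " + retainedAfterFlush(streams, false) + " bytes");
    }
}
```

In this model the retained memory grows linearly with the number of flushed streams when a buffer is kept, which is why the effect matters most when many writers are open.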
Note that "hive.optimize.sort.dynamic.partition" avoids this problem by sorting on the dynamic partition key, so that only a single ORC writer is open at a time. This uses memory more effectively and avoids creating ORC files with very small stripes, which in turn improves downstream performance.
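The effect of sorting on the number of open writers can be sketched as follows. This is a toy model, not Hive code: OpenWriterSketch and its methods are hypothetical, and it assumes that without sorting a writer stays open for every partition key seen so far, while with sorting the previous writer is closed as soon as the key changes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of how many writers are open at once for a stream of rows,
// each identified only by its dynamic partition key.
public class OpenWriterSketch {

    // Unsorted input: one writer per distinct key stays open until the input ends.
    static int peakWritersUnsorted(List<String> partitionKeys) {
        Set<String> open = new HashSet<>();
        int peak = 0;
        for (String key : partitionKeys) {
            open.add(key); // opens a writer the first time a key is seen
            peak = Math.max(peak, open.size());
        }
        return peak;
    }

    // Sorted input: all rows for a key are adjacent, so the previous writer
    // can be closed before the next one is opened.
    static int peakWritersSorted(List<String> partitionKeys) {
        List<String> sorted = new ArrayList<>(partitionKeys);
        Collections.sort(sorted);
        String currentKey = null;
        int open = 0;
        int peak = 0;
        for (String key : sorted) {
            if (!key.equals(currentKey)) {
                open = 1; // close the previous writer (if any), open the new one
                currentKey = key;
            }
            peak = Math.max(peak, open);
        }
        return peak;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("2014-01", "2014-03", "2014-02", "2014-01", "2014-03");
        System.out.println("peak open writers, unsorted: " + peakWritersUnsorted(keys));
        System.out.println("peak open writers, sorted:   " + peakWritersSorted(keys));
    }
}
```

In a Hive session the property itself is enabled with SET hive.optimize.sort.dynamic.partition=true; before running the insert.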