Hive / HIVE-15290

Stripe size smaller than specified.

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.2.0, 1.2.1, 2.0.0, 2.0.1, 2.1.0
    • Fix Version/s: None
    • Component/s: ORC
    • Labels: None

    Description

      In Hive 1.2.0, the actual stripe size of an output ORC file is very small when most of the table data is empty. As a result, too many column statistics objects are created, and they consume most of the memory.
      The behavior is better in Hive 2.0.1, but the stripes are still much smaller than the configured size.
      HIVE-13232 (https://issues.apache.org/jira/browse/HIVE-13232) moved the `compressed = null` assignment out of the if block. That change helps a lot, but a complete fix needs another change in `OutStream.getBufferSize()`.
      I've created a PR:
      https://github.com/apache/hive/pull/118
      Please take a look.
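
To illustrate how a buffer-size estimate could make stripes flush early, here is a minimal toy model. It is not Hive's code and not the change in the PR. It assumes (this is one possible reading of the report, not something it confirms) that the writer flushes a stripe once the summed per-stream estimate reaches the configured stripe size, and that the estimate counts allocated buffer capacity rather than bytes actually written. The class name, the constants, and `realBytesAtFlush` are all hypothetical.

```java
public class StripeFlushSketch {
    // All values below are illustrative assumptions, not Hive defaults.
    static final long STRIPE_SIZE = 64L * 1024 * 1024;  // configured stripe size
    static final long BUFFER_CAPACITY = 256L * 1024;    // capacity of one stream buffer chunk
    static final long STREAMS = 1000;                   // number of column streams
    static final long BYTES_PER_ROW = 1;                // mostly-empty data: 1 byte per row per stream

    /**
     * Simulates writing rows until the estimated size reaches STRIPE_SIZE.
     * Returns the real number of bytes buffered at that point.
     *
     * countCapacity = true  -> the estimate sums allocated buffer capacity
     * countCapacity = false -> the estimate sums bytes actually written
     */
    static long realBytesAtFlush(boolean countCapacity) {
        long usedPerStream = 0;
        while (true) {
            usedPerStream += BYTES_PER_ROW;
            // Buffers are allocated in whole chunks, so round up.
            long chunks = (usedPerStream + BUFFER_CAPACITY - 1) / BUFFER_CAPACITY;
            long allocatedPerStream = chunks * BUFFER_CAPACITY;
            long perStream = countCapacity ? allocatedPerStream : usedPerStream;
            if (perStream * STREAMS >= STRIPE_SIZE) {
                return usedPerStream * STREAMS;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("capacity-based estimate flushes at "
                + realBytesAtFlush(true) + " real bytes");
        System.out.println("usage-based estimate flushes at "
                + realBytesAtFlush(false) + " real bytes");
    }
}
```

In this model the capacity-based estimate crosses the threshold long before the real data does, which would produce many tiny stripes, each with its own set of column statistics. Whether this matches the actual behavior of `OutStream.getBufferSize()` should be checked against the Hive source and the PR.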

          People

            Assignee: Unassigned
            Reporter: Yuxing Yao (melode11)