Hive / HIVE-9660

Store end offset of compressed data for RG in RowIndex in ORC

Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved

    Description

      Right now the end offset of the compressed data for a row group (RG) is estimated, which in some cases results in far more data being read than the RG actually needs.
      We could add a separate array to RowIndex (positions_v2?) that stores, for each RG, either the number of compressed buffers or the end offset, so that the reader no longer has to estimate.
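
      The difference between the two approaches can be sketched as follows. This is a simplified illustration, not code from Hive or from any of the attached patches: the class, the method names, the `rgEnds` array, and the sample offsets are all hypothetical, and the margin formula (two worst-case compressed buffers past the next RG's start) is an assumption about how such an estimate might be made. The only ORC detail relied on is that each compressed chunk is preceded by a 3-byte header.

```java
// Hypothetical sketch: these names are illustrative and do not come from Hive.
final class RgEndOffsetSketch {
    // Each ORC compressed chunk is preceded by a 3-byte header.
    static final int CHUNK_HEADER_SIZE = 3;

    /**
     * Estimated end offset of a row group's compressed data.
     * Without a stored end offset the reader only knows where the NEXT row
     * group starts, so it pads that position with a worst-case margin
     * (assumed here to be two full compression buffers plus their headers).
     */
    static long estimateEnd(boolean isLastRg, long nextRgStart,
                            long streamLength, int bufferSize) {
        if (isLastRg) {
            return streamLength; // the last RG simply runs to the end of the stream
        }
        long margin = 2L * (CHUNK_HEADER_SIZE + bufferSize);
        return Math.min(streamLength, nextRgStart + margin);
    }

    /** Exact end offset, read from a hypothetical per-RG array kept in the index. */
    static long storedEnd(long[] rgEnds, int rg) {
        return rgEnds[rg];
    }

    public static void main(String[] args) {
        int bufferSize = 256 * 1024;          // compression buffer size in bytes
        long streamLength = 10_000_000L;      // total compressed stream length
        long[] rgStarts = {0L, 40_000L, 95_000L};
        // Hypothetical stored values: an RG may end inside the chunk in which
        // the next RG starts, so each end lies slightly past the next start.
        long[] rgEnds = {41_200L, 96_500L, streamLength};

        for (int rg = 0; rg < rgStarts.length; rg++) {
            boolean isLast = rg == rgStarts.length - 1;
            long nextStart = isLast ? streamLength : rgStarts[rg + 1];
            long estimated = estimateEnd(isLast, nextStart, streamLength, bufferSize);
            long exact = storedEnd(rgEnds, rg);
            System.out.println("RG " + rg
                + ": estimated read = " + (estimated - rgStarts[rg]) + " bytes"
                + ", exact read = " + (exact - rgStarts[rg]) + " bytes");
        }
    }
}
```

      With the sample numbers above, the estimate for a small RG is dominated by the margin rather than by the RG's own size, which is the kind of over-read the description refers to. Storing a count of compressed buffers instead of an end offset would serve the same purpose, at the cost of the reader walking chunk headers to find the end.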

      Attachments

        1. HIVE-9660.patch (138 kB, Sergey Shelukhin)
        2. HIVE-9660.patch (265 kB, Sergey Shelukhin)
        3. HIVE-9660.01.patch (273 kB, Sergey Shelukhin)
        4. HIVE-9660.02.patch (277 kB, Sergey Shelukhin)
        5. HIVE-9660.03.patch (752 kB, Sergey Shelukhin)
        6. HIVE-9660.04.patch (742 kB, Sergey Shelukhin)
        7. HIVE-9660.05.patch (742 kB, Sergey Shelukhin)
        8. HIVE-9660.06.patch (742 kB, Sergey Shelukhin)
        9. HIVE-9660.07.patch (742 kB, Sergey Shelukhin)
        10. HIVE-9660.07.patch (742 kB, Sergey Shelukhin)
        11. HIVE-9660.08.patch (743 kB, Sergey Shelukhin)
        12. HIVE-9660.09.patch (743 kB, Sergey Shelukhin)
        13. HIVE-9660.10.patch (742 kB, Sergey Shelukhin)
        14. HIVE-9660.10.patch (743 kB, Sergey Shelukhin)
        15. HIVE-9660.11.patch (742 kB, Sergey Shelukhin)
        16. owen-hive-9660.patch (84 kB, Owen O'Malley)
        17. HIVE-9660.patch (335 kB, Owen O'Malley)

            People

              Assignee: Sergey Shelukhin (sershe)
              Reporter: Sergey Shelukhin (sershe)
              Votes: 0
              Watchers: 6

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 20m