HIVE-2350

Improve RCFile Read Speed

Details

    • Reviewed

    Description

      By tweaking the RCFile$Reader implementation to allow more efficient memory access, I was able to reduce CPU usage. I measured the time required to scan a gzipped RCFile, decompress it, and assemble the data into records. CPU time was reduced by about 7% for a full table scan. An improvement of about 2% was realised when a smaller subset of columns (3-5 out of tens) was selected.
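
      The kind of CPU-time comparison described above can be sketched roughly as follows. This is only an illustration, not the benchmark actually used for this issue: the class name CpuTimeComparison, the helpers cpuNanos and percentReduction, and the array-summing workload are all hypothetical stand-ins, and a real measurement would run an actual RCFile scan before and after the patch. It relies on the standard java.lang.management.ThreadMXBean API, which may report -1 or throw UnsupportedOperationException on JVMs where thread CPU timing is disabled or unavailable.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper class; none of these names come from the Hive patch.
final class CpuTimeComparison {

    /** CPU time in nanoseconds used by the current thread while running the task. */
    static long cpuNanos(Runnable task) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long start = threads.getCurrentThreadCpuTime();
        task.run();
        return threads.getCurrentThreadCpuTime() - start;
    }

    /** Percentage by which the optimized run reduced CPU time relative to the baseline. */
    static double percentReduction(long baselineNanos, long optimizedNanos) {
        return 100.0 * (baselineNanos - optimizedNanos) / baselineNanos;
    }

    public static void main(String[] args) {
        // Stand-in workload (assumption): sums an array instead of scanning an RCFile.
        long[] data = new long[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        long[] sink = new long[1];
        Runnable scan = () -> {
            long sum = 0;
            for (long value : data) {
                sum += value;
            }
            sink[0] = sum; // keep the result so the loop is not optimized away
        };

        long nanos = cpuNanos(scan);
        System.out.println("CPU time for stand-in scan (ns): " + nanos);
    }
}
```

      With two such measurements, percentReduction(baseline, optimized) gives the figure quoted in the description; for example, a baseline of 100 units and an optimized run of 93 units corresponds to the reported reduction of about 7%.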

      Attachments

        1. rcfile-2011-08-04.diff
          8 kB
          Tim Armstrong
        2. rcfile_opt_2011-08-05.diff
          8 kB
          Tim Armstrong
        3. rcfile_opt_2011-08-05b.diff
          11 kB
          Tim Armstrong
        4. rcfile_opt_2011-08-11.patch
          11 kB
          Tim Armstrong
        5. rcfile_opt_2011-08-11.patch
          11 kB
          Tim Armstrong

          People

            tarmstrong Tim Armstrong