[HIVE-3706] getBoolVar in FileSinkOperator can be optimized

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.0
    • Fix Version/s: 0.10.0
    • Component/s: Query Processor
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      FileSinkOperator's processOp method calls HiveConf.getBoolVar. In benchmarks this call accounted for ~2% of the CPU time on simple queries, e.g. INSERT OVERWRITE TABLE t1 SELECT * FROM t2;

      The boolean, a flag that controls whether the RawDataSize stat is collected, does not change while a query is being processed, so it can be read once at initialization and stored, avoiding the repeated lookup.
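
      The idea can be sketched in isolation as follows. This is not the attached patch and does not use Hive's classes: SketchConf, SketchSinkOperator, the property key, and the lookup counter are all hypothetical stand-ins, assumed only to show a configuration flag being read once at initialization rather than on every row.

```java
import java.util.HashMap;
import java.util.Map;

public class CachedFlagSketch {

    /** Hypothetical stand-in for a configuration object; NOT Hive's HiveConf. */
    static class SketchConf {
        private final Map<String, String> props = new HashMap<>();
        int lookups = 0; // counts lookups so the saving is observable

        void set(String key, String value) {
            props.put(key, value);
        }

        boolean getBool(String key, boolean defaultValue) {
            lookups++;
            String v = props.get(key);
            return v == null ? defaultValue : Boolean.parseBoolean(v);
        }
    }

    /** Hypothetical stand-in for an operator with an init step and a per-row step. */
    static class SketchSinkOperator {
        // Assumed property name, for illustration only.
        static final String COLLECT_RAW_DATA_SIZE = "sketch.stats.collect.rawdatasize";

        private boolean collectRawDataSize; // cached once in initialize()
        private long rawDataSize;

        void initialize(SketchConf conf) {
            // The only configuration lookup: the flag cannot change mid-query.
            collectRawDataSize = conf.getBool(COLLECT_RAW_DATA_SIZE, true);
        }

        void processRow(String row) {
            // Hot path: a plain field read instead of a configuration lookup.
            if (collectRawDataSize) {
                rawDataSize += row.length();
            }
        }

        long getRawDataSize() {
            return rawDataSize;
        }
    }

    public static void main(String[] args) {
        SketchConf conf = new SketchConf();
        conf.set(SketchSinkOperator.COLLECT_RAW_DATA_SIZE, "true");

        SketchSinkOperator op = new SketchSinkOperator();
        op.initialize(conf);
        for (String row : new String[] {"a", "bb", "ccc"}) {
            op.processRow(row);
        }
        System.out.println("lookups=" + conf.lookups
                + " rawDataSize=" + op.getRawDataSize());
    }
}
```

      In this sketch the lookup counter stays at one however many rows are processed; the per-row cost is reduced to a field read.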

      Attachments

        1. HIVE-3706.1.patch.txt
          2 kB
          Kevin Wilfong

            People

              Assignee: Kevin Wilfong (kevinwilfong)
              Reporter: Kevin Wilfong (kevinwilfong)
              Votes: 0
              Watchers: 6
