Hive / HIVE-22538

RS deduplication does not always enforce hive.optimize.reducededuplication.min.reducer

Details

    Description

      For transactional tables, hive.optimize.reducededuplication.min.reducer might be overridden to 1, which can cause the final aggregation to be merged into a single stage and so degrade performance. For instance, when column stats autogather is enabled, this can happen for the following query:

      set hive.support.concurrency=true;
      set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
      
      EXPLAIN
      CREATE TABLE x STORED AS ORC TBLPROPERTIES('transactional'='true') AS
      SELECT * FROM SRC x CLUSTER BY x.key;
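
      To compare plans, the reproduction can be run with the relevant settings pinned explicitly. This is a minimal sketch, not part of the original report: it assumes that hive.stats.column.autogather is the switch meant by "autogather column stats", and the value 4 for the minimum reducer threshold is an illustrative choice. The table name x_repro is hypothetical.

      ```sql
      -- Assumption: this property controls the column stats autogather mentioned above.
      set hive.stats.column.autogather=true;

      -- Assumption: 4 is an illustrative threshold; the issue is that it may be
      -- overridden to 1 for transactional tables regardless of this value.
      set hive.optimize.reducededuplication=true;
      set hive.optimize.reducededuplication.min.reducer=4;

      -- Same transactional setup as in the description.
      set hive.support.concurrency=true;
      set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

      -- x_repro is a hypothetical table name used only for this sketch.
      EXPLAIN
      CREATE TABLE x_repro STORED AS ORC TBLPROPERTIES('transactional'='true') AS
      SELECT * FROM SRC x CLUSTER BY x.key;
      ```

      In the resulting plan, the point to check is whether the final aggregation shares a reducer stage with the CLUSTER BY shuffle. Running the same statement with 'transactional'='false' gives a baseline plan for comparison.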
      

      Attachments

        1. HIVE-22538.patch
          16 kB
          jcamachorodriguez
        2. HIVE-22538.2.patch
          12 kB
          Krisztian Kasa
        3. HIVE-22538.3.patch
          19 kB
          Krisztian Kasa
        4. HIVE-22538.4.patch
          187 kB
          Krisztian Kasa
        5. HIVE-22538.5.patch
          205 kB
          Krisztian Kasa
        6. HIVE-22538.6.patch
          212 kB
          Krisztian Kasa
        7. HIVE-22538.6.patch
          212 kB
          Krisztian Kasa
        8. HIVE-22538.7.patch
          73 kB
          Krisztian Kasa
        9. HIVE-22538.8.patch
          73 kB
          Krisztian Kasa
        10. HIVE-22538.8.patch
          73 kB
          Krisztian Kasa
        11. HIVE-22538.8.patch
          73 kB
          Krisztian Kasa

            People

              kkasa Krisztian Kasa
              jcamacho Jesús Camacho Rodríguez

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 50m