IMPALA-524

Insert plans may repartition unnecessarily if numNodes of their input fragment is not set properly.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 1.1
    • Fix Version/s: Impala 1.1.1
    • Component/s: None
    • Labels: None

    Description

      It appears we don't always set numNodes properly for all PlanNodes, which may cause insert plans with few partitions to be hash-repartitioned when they should not be.
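      To illustrate how an unset numNodes could lead to this, here is a minimal Java sketch. It is not Impala's planner code. It assumes the planner skips repartitioning when the number of target partitions is no larger than the number of nodes running the input fragment, and that an unset numNodes is represented as -1. The class, method, and constant names are hypothetical.

```java
// Hypothetical sketch; not Impala's actual planner code.
final class RepartitionSketch {
    // Assumed sentinel meaning "numNodes was never set".
    static final int UNSET = -1;

    // Returns true if the insert's input should be hash-repartitioned.
    // Assumed rule: with few target partitions relative to the nodes of the
    // input fragment, repartitioning is not worthwhile.
    static boolean shouldRepartition(long numPartitions, int inputNumNodes) {
        if (numPartitions > 0 && numPartitions <= inputNumNodes) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // numNodes set correctly: one partition, ten nodes, no repartition.
        System.out.println(shouldRepartition(1, 10));    // false
        // numNodes left unset: the comparison against -1 can never succeed,
        // so the plan repartitions even though it should not.
        System.out.println(shouldRepartition(1, UNSET)); // true
    }
}
```

      Under these assumptions, any comparison against an unset value silently takes the repartitioning branch, which would match the symptom described here.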

      Two places that Anty Rao (the bug reporter) mentioned explicitly were:
      1. Exchange Nodes
      2. Aggregation Nodes in the code path without distinct aggregation

      We should also add more Preconditions checks to make sure stats are always set in the proper places.
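      A check of this kind might look like the following sketch. It is only an illustration: the Node class, the validate method, and the -1 sentinel are hypothetical, and the small checkState helper stands in for Guava's Preconditions.checkState so that the example has no dependencies.

```java
// Hypothetical sketch; names are illustrative and not taken from Impala.
final class NumNodesCheckSketch {
    // Assumed sentinel meaning "numNodes was never set".
    static final int UNSET = -1;

    // Stand-in for Guava's Preconditions.checkState(boolean, Object).
    static void checkState(boolean ok, String message) {
        if (!ok) throw new IllegalStateException(message);
    }

    // A plan node with at most one child, enough for a linear plan.
    static final class Node {
        final String name;
        final int numNodes;
        final Node child; // null for a leaf

        Node(String name, int numNodes, Node child) {
            this.name = name;
            this.numNodes = numNodes;
            this.child = child;
        }
    }

    // Walks the plan from the root down and fails on the first node
    // whose numNodes was left unset.
    static void validate(Node root) {
        for (Node n = root; n != null; n = n.child) {
            checkState(n.numNodes != UNSET, "numNodes not set for " + n.name);
        }
    }

    public static void main(String[] args) {
        Node scan = new Node("scan", 3, null);
        Node agg = new Node("aggregation", UNSET, scan);
        Node exchange = new Node("exchange", 3, agg);
        try {
            validate(exchange);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // numNodes not set for aggregation
        }
    }
}
```

      Failing fast at plan-construction time would surface a missing value at the node that omitted it, rather than as an unexpected repartition later.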

      Here's the query Anty mentioned:

      INSERT OVERWRITE TABLE tableName
      PARTITION (dt='20130729')
      SELECT
          c1,
          CASE
              WHEN substr(c2,1,2) <> '86' THEN 3
              WHEN city_id IS NOT NULL THEN 1
              ELSE 2
          END AS c3,
          c4,
          c2,
          sum(c5 + c6),
          sum(CASE WHEN statistic_code LIKE '2%' THEN 1 ELSE 0 END) AS success_count
      FROM dw_wap a
      WHERE url IS NOT NULL
      GROUP BY
          c1,
          c3,
          c4,
          c2
      ;

          People

            Assignee: Alexander Behm (alex.behm)
            Reporter: Alexander Behm (alex.behm)