IMPALA-5343

Sort by Column(s) added as part of inserting into Kudu table is incorrect

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Not A Bug
    • Affects Version/s: None
    • Fix Version/s: Impala 2.9.0
    • Component/s: Frontend

      Description

      The planner includes KuduPartition(PARTITION_COLUMN) among the columns in the sort's order by clause. The sort should instead match the primary key columns only.

      Plan

      Query: explain insert into lineitem_kudu_ts  select * from lineitem_kudu
      | INSERT INTO KUDU [scan_primitives_tpch_3tb.lineitem_kudu_ts]                                                                                                                    |
      | |                                                                                                                                                                               |
      | 02:SORT                                                                                                                                                                         |
      | |  order by: KuduPartition(scan_primitives_tpch_3tb.lineitem_kudu.l_orderkey) ASC NULLS LAST, l_shipdate ASC NULLS LAST, l_orderkey ASC NULLS LAST, l_linenumber ASC NULLS LAST |
      | |                                                                                                                                                                               |
      | 01:EXCHANGE [KUDU(KuduPartition(scan_primitives_tpch_3tb.lineitem_kudu.l_orderkey))]                                                                                            |
      | |                                                                                                                                                                               |
      | 00:SCAN KUDU [scan_primitives_tpch_3tb.lineitem_kudu]                                                                                                                           |
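
      The ordering in the plan above can be illustrated with a small sketch. One reading, consistent with the Not A Bug resolution, is that the sort key deliberately leads with the target partition so that rows bound for the same partition are contiguous, and only then orders by the primary key (l_shipdate, l_orderkey, l_linenumber). The function kudu_partition below is a hypothetical stand-in (a plain modulo); it is not Kudu's actual hash function, and the sample rows are invented for illustration.

```python
from typing import Iterable, List, NamedTuple, Tuple

# Taken from the DDL: PARTITION BY HASH (l_orderkey) PARTITIONS 140
NUM_PARTITIONS = 140


class Row(NamedTuple):
    # Only the primary key columns are modelled here.
    l_shipdate: str
    l_orderkey: int
    l_linenumber: int


def kudu_partition(l_orderkey: int, num_partitions: int = NUM_PARTITIONS) -> int:
    """Hypothetical stand-in for KuduPartition(l_orderkey).

    Assumption: a simple modulo. Kudu uses its own hash function,
    so real partition numbers will differ.
    """
    return l_orderkey % num_partitions


def sort_key(row: Row) -> Tuple[int, str, int, int]:
    # Same column order as the plan's "order by":
    # partition first, then the primary key columns.
    return (
        kudu_partition(row.l_orderkey),
        row.l_shipdate,
        row.l_orderkey,
        row.l_linenumber,
    )


def sort_for_insert(rows: Iterable[Row]) -> List[Row]:
    return sorted(rows, key=sort_key)


# Invented sample rows.
rows = [
    Row("1996-03-13", 141, 1),
    Row("1994-01-01", 2, 1),
    Row("1995-06-01", 1, 2),
    Row("1995-06-01", 1, 1),
    Row("1993-01-01", 142, 1),
]

for row in sort_for_insert(rows):
    print(kudu_partition(row.l_orderkey), row)
```

      In this sketch the rows come out grouped by partition, and within each partition they follow primary key order. Sorting on the primary key alone would interleave rows from different partitions.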
      

      DDL

      [vd1302.halxg.cloudera.com:21000] > show create table scan_primitives_tpch_3tb.lineitem_kudu_ts;
      Query: show create table scan_primitives_tpch_3tb.lineitem_kudu_ts
       CREATE TABLE scan_primitives_tpch_3tb.lineitem_kudu_ts (                                                
         l_shipdate STRING NOT NULL ENCODING DICT_ENCODING COMPRESSION LZ4,                                    
         l_orderkey BIGINT NOT NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,                                      
         l_linenumber BIGINT NOT NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,                                    
         l_partkey BIGINT NOT NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,                                       
         l_suppkey BIGINT NOT NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,                                       
         l_quantity DOUBLE NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,                                          
         l_extendedprice DOUBLE NULL ENCODING PLAIN_ENCODING COMPRESSION LZ4,                                  
         l_discount DOUBLE NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,                                          
         l_tax DOUBLE NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,                                               
         l_returnflag STRING NULL ENCODING DICT_ENCODING COMPRESSION LZ4,                                      
         l_linestatus STRING NULL ENCODING DICT_ENCODING COMPRESSION LZ4,                                      
         l_commitdate TIMESTAMP NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,                                     
         l_receiptdate STRING NULL ENCODING DICT_ENCODING COMPRESSION LZ4,                                     
         l_shipinstruct STRING NULL ENCODING DICT_ENCODING COMPRESSION LZ4,                                    
         l_shipmode STRING NULL ENCODING DICT_ENCODING COMPRESSION LZ4,                                        
         l_comment STRING NULL ENCODING PLAIN_ENCODING COMPRESSION LZ4,                                        
         PRIMARY KEY (l_shipdate, l_orderkey, l_linenumber)                                                    
       )                                                                                                       
       PARTITION BY HASH (l_orderkey) PARTITIONS 140                                                           
       STORED AS KUDU                                                                                          
       TBLPROPERTIES ('kudu.master_addresses'='vd1301.halxg.cloudera.com:7051,vd1128.halxg.cloudera.com:7051') 
      

            People

            • Assignee: Thomas Tauber-Marshall (twmarshall)
            • Reporter: Mostafa Mokhtar (mmokhtar)