  Hive / HIVE-20266

Extra column is being shuffled in cbo as compared to non-cbo


    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Query Planning
    • Labels:
      None
    • Target Version/s:

      Description

      CREATE TABLE tablePartitioned (a STRING NOT NULL ENFORCED, b STRING, c STRING NOT NULL ENFORCED) PARTITIONED BY (p1 STRING, p2 INT NOT NULL DISABLE);
      
      explain INSERT INTO tablePartitioned partition(p1, p2) select key, value, value, key as p1, 3 as p2 from src limit 10;
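
      The two plans below can presumably be regenerated by toggling the optimizer in one session. This is a minimal sketch, assuming the standard `hive.cbo.enable` property, the `src` test table from Hive's test data, and that dynamic partitioning may need to be set to nonstrict in the environment used (none of these settings are stated in the report):

      ```sql
      -- Assumption: the session may require nonstrict mode because both
      -- partition columns (p1, p2) are dynamic in this insert.
      SET hive.exec.dynamic.partition.mode=nonstrict;

      -- Plan without the cost-based optimizer.
      SET hive.cbo.enable=false;
      EXPLAIN INSERT INTO tablePartitioned PARTITION (p1, p2)
      SELECT key, value, value, key AS p1, 3 AS p2 FROM src LIMIT 10;

      -- Plan with the cost-based optimizer.
      SET hive.cbo.enable=true;
      EXPLAIN INSERT INTO tablePartitioned PARTITION (p1, p2)
      SELECT key, value, value, key AS p1, 3 AS p2 FROM src LIMIT 10;
      ```

      In each plan, the line to compare is the `value expressions` entry under the Reduce Output Operator, which lists the columns sent through the shuffle.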
      

      Without CBO

       Map 1
                  Map Operator Tree:
                      TableScan
                        alias: src
                        Statistics: Num rows: 2500 Data size: 26560 Basic stats: COMPLETE Column stats: NONE
                        Select Operator
                          expressions: key (type: string), value (type: string), value (type: string), key (type: string), 3 (type: int)
                          outputColumnNames: _col0, _col1, _col2, _col3, _col4
                          Statistics: Num rows: 2500 Data size: 26560 Basic stats: COMPLETE Column stats: NONE
                          Limit
                            Number of rows: 10
                            Statistics: Num rows: 10 Data size: 100 Basic stats: COMPLETE Column stats: NONE
                            Reduce Output Operator
                              sort order:
                              Statistics: Num rows: 10 Data size: 100 Basic stats: COMPLETE Column stats: NONE
                              value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: int)
      

      With CBO

      Map 1
                  Map Operator Tree:
                      TableScan
                        alias: src
                        Statistics: Num rows: 2500 Data size: 26560 Basic stats: COMPLETE Column stats: NONE
                        Select Operator
                          expressions: key (type: string), value (type: string), value (type: string), key (type: string)
                          outputColumnNames: _col0, _col1, _col2, _col3
                          Statistics: Num rows: 2500 Data size: 26560 Basic stats: COMPLETE Column stats: NONE
                          Limit
                            Number of rows: 10
                            Statistics: Num rows: 10 Data size: 100 Basic stats: COMPLETE Column stats: NONE
                            Reduce Output Operator
                              sort order:
                              Statistics: Num rows: 10 Data size: 100 Basic stats: COMPLETE Column stats: NONE
                              value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string)
      

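      The two plans differ by one shuffled column: the Reduce Output Operator carries 5 value expressions (_col0 to _col4) in the plan listed under "Without CBO" and 4 (_col0 to _col3) in the plan listed under "With CBO".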

              People

              • Assignee:
                vgarg Vineet Garg
                Reporter:
                vgarg Vineet Garg
