Hive / HIVE-20266

Extra column is shuffled with CBO compared to non-CBO

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Component/s: Query Planning

    Description

      CREATE TABLE tablePartitioned (a STRING NOT NULL ENFORCED, b STRING, c STRING NOT NULL ENFORCED) PARTITIONED BY (p1 STRING, p2 INT NOT NULL DISABLE);
      
      explain INSERT INTO tablePartitioned partition(p1, p2) select key, value, value, key as p1, 3 as p2 from src limit 10;
      

      Without CBO

       Map 1
                  Map Operator Tree:
                      TableScan
                        alias: src
                        Statistics: Num rows: 2500 Data size: 26560 Basic stats: COMPLETE Column stats: NONE
                        Select Operator
                          expressions: key (type: string), value (type: string), value (type: string), key (type: string), 3 (type: int)
                          outputColumnNames: _col0, _col1, _col2, _col3, _col4
                          Statistics: Num rows: 2500 Data size: 26560 Basic stats: COMPLETE Column stats: NONE
                          Limit
                            Number of rows: 10
                            Statistics: Num rows: 10 Data size: 100 Basic stats: COMPLETE Column stats: NONE
                            Reduce Output Operator
                              sort order:
                              Statistics: Num rows: 10 Data size: 100 Basic stats: COMPLETE Column stats: NONE
                              value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: int)
      

      With CBO

      Map 1
                  Map Operator Tree:
                      TableScan
                        alias: src
                        Statistics: Num rows: 2500 Data size: 26560 Basic stats: COMPLETE Column stats: NONE
                        Select Operator
                          expressions: key (type: string), value (type: string), value (type: string), key (type: string)
                          outputColumnNames: _col0, _col1, _col2, _col3
                          Statistics: Num rows: 2500 Data size: 26560 Basic stats: COMPLETE Column stats: NONE
                          Limit
                            Number of rows: 10
                            Statistics: Num rows: 10 Data size: 100 Basic stats: COMPLETE Column stats: NONE
                            Reduce Output Operator
                              sort order:
                              Statistics: Num rows: 10 Data size: 100 Basic stats: COMPLETE Column stats: NONE
                              value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string)
      

      With CBO, 4 columns are shuffled, compared to 3 without CBO.
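
To reproduce both plans in a single session, CBO can be toggled with the `hive.cbo.enable` property before each EXPLAIN. This is a minimal sketch, assuming the `src` test table (columns `key`, `value`) and the `tablePartitioned` table created above already exist. The `hive.exec.dynamic.partition.mode` setting is an assumption that may or may not be needed, depending on the Hive version and its defaults.

```sql
-- Assumption: fully dynamic partition inserts may require nonstrict mode
-- on some Hive versions; harmless if it is already the default.
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Plan without CBO
SET hive.cbo.enable=false;
EXPLAIN INSERT INTO tablePartitioned PARTITION(p1, p2)
SELECT key, value, value, key AS p1, 3 AS p2 FROM src LIMIT 10;

-- Plan with CBO
SET hive.cbo.enable=true;
EXPLAIN INSERT INTO tablePartitioned PARTITION(p1, p2)
SELECT key, value, value, key AS p1, 3 AS p2 FROM src LIMIT 10;
```

In each plan, the `value expressions` line of the Reduce Output Operator lists the columns that are shuffled, so comparing that line between the two outputs shows the difference.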

            People

              Assignee: Vineet Garg (vgarg)
              Reporter: Vineet Garg (vgarg)