Hive / HIVE-11160 Auto-gather column stats / HIVE-26365

Remove column statistics collection task from merge statement plan

Details

    Description

      A merge statement may contain delete and update branches, and an update is technically a delete followed by an insert. For delete operations, column statistics such as min and max cannot be recalculated from the changed records alone.
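
The min/max limitation can be illustrated with a small toy model. This is a sketch only, not Hive code: `merge_on_insert` and `recompute` are hypothetical helpers, and a plain Python list stands in for a table column.

```python
# Toy model of per-column (min, max) statistics. Not Hive code; all names
# here are hypothetical and exist only to illustrate the argument.

def recompute(rows):
    """Full rescan: derive (min, max) from every surviving row."""
    return (min(rows), max(rows))

def merge_on_insert(stats, inserted):
    """Inserts can be folded in using only the old stats and the new rows."""
    lo, hi = stats
    return (min([lo, *inserted]), max([hi, *inserted]))

rows = [1, 5, 9]
stats = recompute(rows)

# Insert: the incremental update agrees with a full rescan.
rows_after_insert = rows + [12]
assert merge_on_insert(stats, [12]) == recompute(rows_after_insert)

# Delete: the old stats plus the deleted value (9) do not reveal the new max.
# Only a rescan of the surviving rows does, so the old stats become stale.
rows_after_delete = [r for r in rows if r != 9]
stale = stats
fresh = recompute(rows_after_delete)
```

After the delete, `stale` still reports the old maximum while `fresh` reflects the surviving rows, so stats gathered only from the changed records cannot be trusted.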

      Currently Hive marks the column stats of the target table invalid after Update/Delete/Merge. For merge, however, extra Group By (GBY) operators and reducers are still generated for the insert branches to calculate column stats, and the Stats Work tasks collect column stats too.

      POSTHOOK: query: explain
      merge into acidTbl_n0 as t using nonAcidOrcTbl_n0 s ON t.a = s.a
      WHEN MATCHED AND s.a > 8 THEN DELETE
      WHEN MATCHED THEN UPDATE SET b = 7
      WHEN NOT MATCHED THEN INSERT VALUES(s.a, s.b)
      POSTHOOK: type: QUERY
      POSTHOOK: Input: default@acidtbl_n0
      POSTHOOK: Input: default@nonacidorctbl_n0
      POSTHOOK: Output: default@acidtbl_n0
      POSTHOOK: Output: default@acidtbl_n0
      POSTHOOK: Output: default@merge_tmp_table
      STAGE DEPENDENCIES:
        Stage-5 is a root stage
        Stage-6 depends on stages: Stage-5
        Stage-0 depends on stages: Stage-6
        Stage-7 depends on stages: Stage-0
        Stage-1 depends on stages: Stage-6
        Stage-8 depends on stages: Stage-1
        Stage-2 depends on stages: Stage-6
        Stage-9 depends on stages: Stage-2
        Stage-3 depends on stages: Stage-6
        Stage-10 depends on stages: Stage-3
        Stage-4 depends on stages: Stage-6
        Stage-11 depends on stages: Stage-4
      
      STAGE PLANS:
        Stage: Stage-5
          Tez
      #### A masked pattern was here ####
            Edges:
              Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 10 (SIMPLE_EDGE)
              Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
              Reducer 4 <- Reducer 2 (SIMPLE_EDGE)
              Reducer 5 <- Reducer 2 (SIMPLE_EDGE)
              Reducer 6 <- Reducer 5 (CUSTOM_SIMPLE_EDGE)
              Reducer 7 <- Reducer 2 (SIMPLE_EDGE)
              Reducer 8 <- Reducer 7 (CUSTOM_SIMPLE_EDGE)
              Reducer 9 <- Reducer 2 (SIMPLE_EDGE)
      #### A masked pattern was here ####
            Vertices:
              Map 1 
                  Map Operator Tree:
                      TableScan
                        alias: s
                        Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
                        Select Operator
                          expressions: a (type: int), b (type: int)
                          outputColumnNames: _col0, _col1
                          Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
                          Reduce Output Operator
                            key expressions: _col0 (type: int)
                            null sort order: z
                            sort order: +
                            Map-reduce partition columns: _col0 (type: int)
                            Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
                            value expressions: _col1 (type: int)
                  Execution mode: vectorized, llap
                  LLAP IO: all inputs
              Map 10 
                  Map Operator Tree:
                      TableScan
                        alias: t
                        filterExpr: a is not null (type: boolean)
                        Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
                        Filter Operator
                          predicate: a is not null (type: boolean)
                          Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
                          Select Operator
                            expressions: a (type: int), ROW__ID (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                            outputColumnNames: _col0, _col1
                            Statistics: Num rows: 2 Data size: 160 Basic stats: COMPLETE Column stats: COMPLETE
                            Reduce Output Operator
                              key expressions: _col0 (type: int)
                              null sort order: z
                              sort order: +
                              Map-reduce partition columns: _col0 (type: int)
                              Statistics: Num rows: 2 Data size: 160 Basic stats: COMPLETE Column stats: COMPLETE
                              value expressions: _col1 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                  Execution mode: vectorized, llap
                  LLAP IO: may be used (ACID table)
              Reducer 2 
                  Execution mode: llap
                  Reduce Operator Tree:
                    Merge Join Operator
                      condition map:
                           Left Outer Join 0 to 1
                      keys:
                        0 _col0 (type: int)
                        1 _col0 (type: int)
                      outputColumnNames: _col0, _col1, _col2, _col3
                      Statistics: Num rows: 6 Data size: 288 Basic stats: COMPLETE Column stats: COMPLETE
                      Select Operator
                        expressions: _col3 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>), _col1 (type: int), _col2 (type: int), _col0 (type: int)
                        outputColumnNames: _col0, _col1, _col2, _col3
                        Statistics: Num rows: 6 Data size: 288 Basic stats: COMPLETE Column stats: COMPLETE
                        Filter Operator
                          predicate: ((_col2 = _col3) and (_col3 > 8)) (type: boolean)
                          Statistics: Num rows: 1 Data size: 88 Basic stats: COMPLETE Column stats: COMPLETE
                          Select Operator
                            expressions: _col0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                            outputColumnNames: _col0
                            Statistics: Num rows: 1 Data size: 76 Basic stats: COMPLETE Column stats: COMPLETE
                            Reduce Output Operator
                              key expressions: _col0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                              null sort order: z
                              sort order: +
                              Map-reduce partition columns: UDFToInteger(_col0) (type: int)
                              Statistics: Num rows: 1 Data size: 76 Basic stats: COMPLETE Column stats: COMPLETE
                        Filter Operator
                          predicate: ((_col2 = _col3) and (_col3 <= 8)) (type: boolean)
                          Statistics: Num rows: 2 Data size: 176 Basic stats: COMPLETE Column stats: COMPLETE
                          Select Operator
                            expressions: _col0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                            outputColumnNames: _col0
                            Statistics: Num rows: 2 Data size: 152 Basic stats: COMPLETE Column stats: COMPLETE
                            Reduce Output Operator
                              key expressions: _col0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                              null sort order: z
                              sort order: +
                              Map-reduce partition columns: UDFToInteger(_col0) (type: int)
                              Statistics: Num rows: 2 Data size: 152 Basic stats: COMPLETE Column stats: COMPLETE
                        Filter Operator
                          predicate: ((_col2 = _col3) and (_col3 <= 8)) (type: boolean)
                          Statistics: Num rows: 2 Data size: 176 Basic stats: COMPLETE Column stats: COMPLETE
                          Select Operator
                            expressions: _col2 (type: int), 7 (type: int)
                            outputColumnNames: _col0, _col1
                            Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
                            Reduce Output Operator
                              key expressions: _col0 (type: int)
                              null sort order: a
                              sort order: +
                              Map-reduce partition columns: _col0 (type: int)
                              Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
                              value expressions: _col1 (type: int)
                        Filter Operator
                          predicate: _col2 is null (type: boolean)
                          Statistics: Num rows: 4 Data size: 192 Basic stats: COMPLETE Column stats: COMPLETE
                          Select Operator
                            expressions: _col3 (type: int), _col1 (type: int)
                            outputColumnNames: _col0, _col1
                            Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
                            Reduce Output Operator
                              key expressions: _col0 (type: int)
                              null sort order: a
                              sort order: +
                              Map-reduce partition columns: _col0 (type: int)
                              Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
                              value expressions: _col1 (type: int)
                        Filter Operator
                          predicate: (_col2 = _col3) (type: boolean)
                          Statistics: Num rows: 3 Data size: 184 Basic stats: COMPLETE Column stats: COMPLETE
                          Select Operator
                            expressions: _col0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                            outputColumnNames: _col0
                            Statistics: Num rows: 3 Data size: 184 Basic stats: COMPLETE Column stats: COMPLETE
                            Group By Operator
                              aggregations: count()
                              keys: _col0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                              minReductionHashAggr: 0.4
                              mode: hash
                              outputColumnNames: _col0, _col1
                              Statistics: Num rows: 2 Data size: 168 Basic stats: COMPLETE Column stats: COMPLETE
                              Reduce Output Operator
                                key expressions: _col0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                                null sort order: z
                                sort order: +
                                Map-reduce partition columns: _col0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                                Statistics: Num rows: 2 Data size: 168 Basic stats: COMPLETE Column stats: COMPLETE
                                value expressions: _col1 (type: bigint)
              Reducer 3 
                  Execution mode: vectorized, llap
                  Reduce Operator Tree:
                    Select Operator
                      expressions: KEY.reducesinkkey0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                      outputColumnNames: _col0
                      Statistics: Num rows: 1 Data size: 76 Basic stats: COMPLETE Column stats: COMPLETE
                      File Output Operator
                        compressed: false
                        Statistics: Num rows: 1 Data size: 76 Basic stats: COMPLETE Column stats: COMPLETE
                        table:
                            input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                            output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                            serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                            name: default.acidtbl_n0
                        Write Type: DELETE
              Reducer 4 
                  Execution mode: vectorized, llap
                  Reduce Operator Tree:
                    Select Operator
                      expressions: KEY.reducesinkkey0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                      outputColumnNames: _col0
                      Statistics: Num rows: 2 Data size: 152 Basic stats: COMPLETE Column stats: COMPLETE
                      File Output Operator
                        compressed: false
                        Statistics: Num rows: 2 Data size: 152 Basic stats: COMPLETE Column stats: COMPLETE
                        table:
                            input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                            output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                            serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                            name: default.acidtbl_n0
                        Write Type: DELETE
              Reducer 5 
                  Execution mode: vectorized, llap
                  Reduce Operator Tree:
                    Select Operator
                      expressions: KEY.reducesinkkey0 (type: int), VALUE._col0 (type: int)
                      outputColumnNames: _col0, _col1
                      Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
                      File Output Operator
                        compressed: false
                        Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
                        table:
                            input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                            output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                            serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                            name: default.acidtbl_n0
                        Write Type: INSERT
                      Select Operator
                        expressions: _col0 (type: int), _col1 (type: int)
                        outputColumnNames: a, b
                        Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
                        Group By Operator
                          aggregations: min(a), max(a), count(1), count(a), compute_bit_vector_hll(a), min(b), max(b), count(b), compute_bit_vector_hll(b)
                          minReductionHashAggr: 0.5
                          mode: hash
                          outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8
                          Statistics: Num rows: 1 Data size: 328 Basic stats: COMPLETE Column stats: COMPLETE
                          Reduce Output Operator
                            null sort order: 
                            sort order: 
                            Statistics: Num rows: 1 Data size: 328 Basic stats: COMPLETE Column stats: COMPLETE
                            value expressions: _col0 (type: int), _col1 (type: int), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: binary), _col5 (type: int), _col6 (type: int), _col7 (type: bigint), _col8 (type: binary)
              Reducer 6 
                  Execution mode: vectorized, llap
                  Reduce Operator Tree:
                    Group By Operator
                      aggregations: min(VALUE._col0), max(VALUE._col1), count(VALUE._col2), count(VALUE._col3), compute_bit_vector_hll(VALUE._col4), min(VALUE._col5), max(VALUE._col6), count(VALUE._col7), compute_bit_vector_hll(VALUE._col8)
                      mode: mergepartial
                      outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8
                      Statistics: Num rows: 1 Data size: 328 Basic stats: COMPLETE Column stats: COMPLETE
                      Select Operator
                        expressions: 'LONG' (type: string), UDFToLong(_col0) (type: bigint), UDFToLong(_col1) (type: bigint), (_col2 - _col3) (type: bigint), COALESCE(ndv_compute_bit_vector(_col4),0) (type: bigint), _col4 (type: binary), 'LONG' (type: string), UDFToLong(_col5) (type: bigint), UDFToLong(_col6) (type: bigint), (_col2 - _col7) (type: bigint), COALESCE(ndv_compute_bit_vector(_col8),0) (type: bigint), _col8 (type: binary)
                        outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11
                        Statistics: Num rows: 1 Data size: 528 Basic stats: COMPLETE Column stats: COMPLETE
                        File Output Operator
                          compressed: false
                          Statistics: Num rows: 1 Data size: 528 Basic stats: COMPLETE Column stats: COMPLETE
                          table:
                              input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                              output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                              serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
              Reducer 7 
                  Execution mode: vectorized, llap
                  Reduce Operator Tree:
                    Select Operator
                      expressions: KEY.reducesinkkey0 (type: int), VALUE._col0 (type: int)
                      outputColumnNames: _col0, _col1
                      Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
                      File Output Operator
                        compressed: false
                        Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
                        table:
                            input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                            output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                            serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                            name: default.acidtbl_n0
                        Write Type: INSERT
                      Select Operator
                        expressions: _col0 (type: int), _col1 (type: int)
                        outputColumnNames: a, b
                        Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
                        Group By Operator
                          aggregations: min(a), max(a), count(1), count(a), compute_bit_vector_hll(a), min(b), max(b), count(b), compute_bit_vector_hll(b)
                          minReductionHashAggr: 0.75
                          mode: hash
                          outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8
                          Statistics: Num rows: 1 Data size: 328 Basic stats: COMPLETE Column stats: COMPLETE
                          Reduce Output Operator
                            null sort order: 
                            sort order: 
                            Statistics: Num rows: 1 Data size: 328 Basic stats: COMPLETE Column stats: COMPLETE
                            value expressions: _col0 (type: int), _col1 (type: int), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: binary), _col5 (type: int), _col6 (type: int), _col7 (type: bigint), _col8 (type: binary)
              Reducer 8 
                  Execution mode: vectorized, llap
                  Reduce Operator Tree:
                    Group By Operator
                      aggregations: min(VALUE._col0), max(VALUE._col1), count(VALUE._col2), count(VALUE._col3), compute_bit_vector_hll(VALUE._col4), min(VALUE._col5), max(VALUE._col6), count(VALUE._col7), compute_bit_vector_hll(VALUE._col8)
                      mode: mergepartial
                      outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8
                      Statistics: Num rows: 1 Data size: 328 Basic stats: COMPLETE Column stats: COMPLETE
                      Select Operator
                        expressions: 'LONG' (type: string), UDFToLong(_col0) (type: bigint), UDFToLong(_col1) (type: bigint), (_col2 - _col3) (type: bigint), COALESCE(ndv_compute_bit_vector(_col4),0) (type: bigint), _col4 (type: binary), 'LONG' (type: string), UDFToLong(_col5) (type: bigint), UDFToLong(_col6) (type: bigint), (_col2 - _col7) (type: bigint), COALESCE(ndv_compute_bit_vector(_col8),0) (type: bigint), _col8 (type: binary)
                        outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11
                        Statistics: Num rows: 1 Data size: 528 Basic stats: COMPLETE Column stats: COMPLETE
                        File Output Operator
                          compressed: false
                          Statistics: Num rows: 1 Data size: 528 Basic stats: COMPLETE Column stats: COMPLETE
                          table:
                              input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                              output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                              serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
              Reducer 9 
                  Execution mode: llap
                  Reduce Operator Tree:
                    Group By Operator
                      aggregations: count(VALUE._col0)
                      keys: KEY._col0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                      mode: mergepartial
                      outputColumnNames: _col0, _col1
                      Statistics: Num rows: 2 Data size: 168 Basic stats: COMPLETE Column stats: COMPLETE
                      Filter Operator
                        predicate: (_col1 > 1L) (type: boolean)
                        Statistics: Num rows: 1 Data size: 84 Basic stats: COMPLETE Column stats: COMPLETE
                        Select Operator
                          expressions: cardinality_violation(_col0) (type: int)
                          outputColumnNames: _col0
                          Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: COMPLETE
                          File Output Operator
                            compressed: false
                            Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: COMPLETE
                            table:
                                input format: org.apache.hadoop.mapred.TextInputFormat
                                output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                                serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                                name: default.merge_tmp_table
      
        Stage: Stage-6
          Dependency Collection
      
        Stage: Stage-0
          Move Operator
            tables:
                replace: false
                table:
                    input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                    output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                    serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                    name: default.acidtbl_n0
                Write Type: DELETE
      
        Stage: Stage-7
          Stats Work
            Basic Stats Work:
      
        Stage: Stage-1
          Move Operator
            tables:
                replace: false
                table:
                    input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                    output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                    serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                    name: default.acidtbl_n0
                Write Type: DELETE
      
        Stage: Stage-8
          Stats Work
            Basic Stats Work:
      
        Stage: Stage-2
          Move Operator
            tables:
                replace: false
                table:
                    input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                    output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                    serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                    name: default.acidtbl_n0
                Write Type: INSERT
      
        Stage: Stage-9
          Stats Work
            Basic Stats Work:
      
        Stage: Stage-3
          Move Operator
            tables:
                replace: false
                table:
                    input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                    output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                    serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                    name: default.acidtbl_n0
                Write Type: INSERT
      
        Stage: Stage-10
          Stats Work
            Basic Stats Work:
            Column Stats Desc:
                Columns: a, b
                Column Types: int, int
                Table: default.acidtbl_n0
      
        Stage: Stage-4
          Move Operator
            tables:
                replace: false
                table:
                    input format: org.apache.hadoop.mapred.TextInputFormat
                    output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                    serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                    name: default.merge_tmp_table
      
        Stage: Stage-11
          Stats Work
            Basic Stats Work:
      

      One of the insert reducers (Reducer 5) and its follow-up reducer (Reducer 6), which collects the column stats:

              Reducer 5 
                  Execution mode: vectorized, llap
                  Reduce Operator Tree:
                    Select Operator
                      expressions: KEY.reducesinkkey0 (type: int), VALUE._col0 (type: int)
                      outputColumnNames: _col0, _col1
                      Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
                      File Output Operator
                        compressed: false
                        Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
                        table:
                            input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                            output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                            serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                            name: default.acidtbl_n0
                        Write Type: INSERT
                      Select Operator
                        expressions: _col0 (type: int), _col1 (type: int)
                        outputColumnNames: a, b
                        Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
                        Group By Operator
                          aggregations: min(a), max(a), count(1), count(a), compute_bit_vector_hll(a), min(b), max(b), count(b), compute_bit_vector_hll(b)
                          minReductionHashAggr: 0.5
                          mode: hash
                          outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8
                          Statistics: Num rows: 1 Data size: 328 Basic stats: COMPLETE Column stats: COMPLETE
                          Reduce Output Operator
                            null sort order: 
                            sort order: 
                            Statistics: Num rows: 1 Data size: 328 Basic stats: COMPLETE Column stats: COMPLETE
                            value expressions: _col0 (type: int), _col1 (type: int), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: binary), _col5 (type: int), _col6 (type: int), _col7 (type: bigint), _col8 (type: binary)
              Reducer 6 
                  Execution mode: vectorized, llap
                  Reduce Operator Tree:
                    Group By Operator
                      aggregations: min(VALUE._col0), max(VALUE._col1), count(VALUE._col2), count(VALUE._col3), compute_bit_vector_hll(VALUE._col4), min(VALUE._col5), max(VALUE._col6), count(VALUE._col7), compute_bit_vector_hll(VALUE._col8)
                      mode: mergepartial
                      outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8
                      Statistics: Num rows: 1 Data size: 328 Basic stats: COMPLETE Column stats: COMPLETE
                      Select Operator
                        expressions: 'LONG' (type: string), UDFToLong(_col0) (type: bigint), UDFToLong(_col1) (type: bigint), (_col2 - _col3) (type: bigint), COALESCE(ndv_compute_bit_vector(_col4),0) (type: bigint), _col4 (type: binary), 'LONG' (type: string), UDFToLong(_col5) (type: bigint), UDFToLong(_col6) (type: bigint), (_col2 - _col7) (type: bigint), COALESCE(ndv_compute_bit_vector(_col8),0) (type: bigint), _col8 (type: binary)
                        outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11
                        Statistics: Num rows: 1 Data size: 528 Basic stats: COMPLETE Column stats: COMPLETE
                        File Output Operator
                          compressed: false
                          Statistics: Num rows: 1 Data size: 528 Basic stats: COMPLETE Column stats: COMPLETE
                          table:
                              input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                              output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                              serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
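
As a rough illustration of the work these two reducers do, the two-phase aggregation can be sketched as follows. This is a simplified model under stated assumptions, not Hive's implementation: the class and function names are hypothetical, and an exact Python set stands in for the HyperLogLog bit vector produced by `compute_bit_vector_hll`.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional

@dataclass
class Partial:
    """Partial aggregate for one column, as emitted by the hash-mode Group By."""
    lo: Optional[int] = None          # min(a)
    hi: Optional[int] = None          # max(a)
    count_all: int = 0                # count(1)
    count_non_null: int = 0           # count(a)
    distinct: set = field(default_factory=set)  # exact stand-in for the HLL sketch

def partial_agg(values: Iterable[Optional[int]]) -> Partial:
    """First phase (mode: hash): aggregate the rows seen by one task."""
    p = Partial()
    for v in values:
        p.count_all += 1
        if v is None:
            continue
        p.count_non_null += 1
        p.lo = v if p.lo is None else min(p.lo, v)
        p.hi = v if p.hi is None else max(p.hi, v)
        p.distinct.add(v)
    return p

def merge_partials(parts: Iterable[Partial]) -> Partial:
    """Second phase (mode: mergepartial): combine the partial aggregates."""
    out = Partial()
    for p in parts:
        out.count_all += p.count_all
        out.count_non_null += p.count_non_null
        if p.lo is not None:
            out.lo = p.lo if out.lo is None else min(out.lo, p.lo)
        if p.hi is not None:
            out.hi = p.hi if out.hi is None else max(out.hi, p.hi)
        out.distinct |= p.distinct
    return out

def final_stats(p: Partial) -> dict:
    """Final Select: the null count is count(1) - count(a), i.e. (_col2 - _col3)."""
    return {
        "type": "LONG",
        "min": p.lo,
        "max": p.hi,
        "num_nulls": p.count_all - p.count_non_null,
        "ndv": len(p.distinct),
    }

parts = [partial_agg([1, None, 3]), partial_agg([3, 7])]
result = final_stats(merge_partials(parts))
```

The sketch only sees the inserted rows, which is the point of this issue: the result describes the new records, not the table after the delete branches have run.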
      

            People

              kkasa Krisztian Kasa

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 20m