  1. Hive
  2. HIVE-8699 Enable support for common map join [Spark Branch]
  3. HIVE-8908

Investigate test failure on join34.q [Spark Branch]

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: spark-branch
    • Fix Version/s: 1.1.0
    • Component/s: Spark
    • Labels: None

    Description

      For this query, the plan doesn't look correct:

      OK
      STAGE DEPENDENCIES:
        Stage-4 is a root stage
        Stage-1 depends on stages: Stage-5, Stage-4
        Stage-2 depends on stages: Stage-1
        Stage-0 depends on stages: Stage-2
        Stage-3 depends on stages: Stage-0
        Stage-5 is a root stage
      
      STAGE PLANS:
        Stage: Stage-4
          Spark
            DagName: chao_20141118150101_a47a2d7b-e750-4764-be66-5ba95ebbe433:6
            Vertices:
              Map 4 
                  Map Operator Tree:
                      TableScan
                        alias: x
                        Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
                        Filter Operator
                          predicate: key is not null (type: boolean)
                          Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
                          Spark HashTable Sink Operator
                            condition expressions:
                              0 {_col1}
                              1 {value}
                            keys:
                              0 _col0 (type: string)
                              1 key (type: string)
                          Reduce Output Operator
                            key expressions: key (type: string)
                            sort order: +
                            Map-reduce partition columns: key (type: string)
                            Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
                            value expressions: value (type: string)
                  Local Work:
                    Map Reduce Local Work
      
        Stage: Stage-1
          Spark
            Edges:
              Union 2 <- Map 1 (NONE, 0), Map 3 (NONE, 0)
            DagName: chao_20141118150101_a47a2d7b-e750-4764-be66-5ba95ebbe433:4
            Vertices:
              Map 1 
                  Map Operator Tree:
                      TableScan
                        alias: x
                        Filter Operator
                          predicate: (key < 20) (type: boolean)
                          Select Operator
                            expressions: key (type: string), value (type: string)
                            outputColumnNames: _col0, _col1
                            Map Join Operator
                              condition map:
                                   Inner Join 0 to 1
                              condition expressions:
                                0 {_col1}
                                1 {key} {value}
                              keys:
                                0 _col0 (type: string)
                                1 key (type: string)
                              outputColumnNames: _col1, _col2, _col3
                              input vertices:
                                1 Map 4
                              Select Operator
                                expressions: _col2 (type: string), _col3 (type: string), _col1 (type: string)
                                outputColumnNames: _col0, _col1, _col2
                                File Output Operator
                                  compressed: false
                                  table:
                                      input format: org.apache.hadoop.mapred.TextInputFormat
                                      output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                                      name: default.dest_j1
                  Local Work:
                    Map Reduce Local Work
              Map 3 
                  Map Operator Tree:
                      TableScan
                        alias: x1
                        Filter Operator
                          predicate: (key > 100) (type: boolean)
                          Select Operator
                            expressions: key (type: string), value (type: string)
                            outputColumnNames: _col0, _col1
                            Map Join Operator
                              condition map:
                                   Inner Join 0 to 1
                              condition expressions:
                                0 {_col1}
                                1 {key} {value}
                              keys:
                                0 _col0 (type: string)
                                1 key (type: string)
                              outputColumnNames: _col1, _col2, _col3
                              input vertices:
                                1 Map 4
                              Select Operator
                                expressions: _col2 (type: string), _col3 (type: string), _col1 (type: string)
                                outputColumnNames: _col0, _col1, _col2
                                File Output Operator
                                  compressed: false
                                  table:
                                      input format: org.apache.hadoop.mapred.TextInputFormat
                                      output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                                      name: default.dest_j1
                  Local Work:
                    Map Reduce Local Work
              Union 2 
                  Vertex: Union 2
      
        Stage: Stage-2
          Dependency Collection
      
        Stage: Stage-0
          Move Operator
            tables:
                replace: true
                table:
                    input format: org.apache.hadoop.mapred.TextInputFormat
                    output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                    serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                    name: default.dest_j1
      
        Stage: Stage-3
          Stats-Aggr Operator
      
        Stage: Stage-5
          Spark
            DagName: chao_20141118150101_a47a2d7b-e750-4764-be66-5ba95ebbe433:5
            Vertices:
              Map 4 
                  Map Operator Tree:
                      TableScan
                        alias: x
                        Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
                        Filter Operator
                          predicate: key is not null (type: boolean)
                          Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
                          Spark HashTable Sink Operator
                            condition expressions:
                              0 {_col1}
                              1 {value}
                            keys:
                              0 _col0 (type: string)
                              1 key (type: string)
                          Reduce Output Operator
                            key expressions: key (type: string)
                            sort order: +
                            Map-reduce partition columns: key (type: string)
                            Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
                            value expressions: value (type: string)
                  Local Work:
                    Map Reduce Local Work
      
      Time taken: 0.127 seconds, Fetched: 156 row(s)
      

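      Note that Stage-4 and Stage-5 are identical apart from the DagName suffix, so the same small-table work is planned twice. Also, in Stage-4 the Filter Operator has two parallel children: a Reduce Output Operator (RS) sits alongside the Spark HashTable Sink (HTS) operator, which is strange.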
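The duplicate-stage observation above can be checked mechanically. The following is a minimal sketch, not part of Hive or of the attached patches: it assumes the EXPLAIN output has been saved as plain text, and the helper names `split_stage_plans` and `find_duplicate_stages` are hypothetical. It groups stages whose plan bodies match once the per-stage `DagName` line is ignored.

```python
import re
from collections import defaultdict


def split_stage_plans(explain_text):
    """Map each stage name to the lines of its block under STAGE PLANS."""
    stages = {}
    current = None
    in_plans = False
    for line in explain_text.splitlines():
        stripped = line.strip()
        if stripped == "STAGE PLANS:":
            in_plans = True
            continue
        if not in_plans:
            continue
        # The CLI footer ("Time taken: ...") is not part of any stage.
        if stripped.startswith("Time taken:"):
            break
        match = re.match(r"Stage: (Stage-\d+)$", stripped)
        if match:
            current = match.group(1)
            stages[current] = []
        elif current is not None and stripped:
            # Keep indentation so the operator nesting is compared too.
            stages[current].append(line.rstrip())
    return stages


def find_duplicate_stages(explain_text):
    """Return groups of stage names whose plans match, ignoring DagName."""
    groups = defaultdict(list)
    for name, lines in split_stage_plans(explain_text).items():
        # Assumption: DagName carries a per-stage id and should not count.
        body = tuple(l for l in lines if not l.strip().startswith("DagName:"))
        groups[body].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical, heavily shortened plan used only to exercise the helpers.
sample = """\
STAGE PLANS:
  Stage: Stage-4
    Spark
      DagName: example:6
      Vertices:
        Map 4
  Stage: Stage-2
    Dependency Collection
  Stage: Stage-5
    Spark
      DagName: example:5
      Vertices:
        Map 4
Time taken: 0.1 seconds
"""

print(find_duplicate_stages(sample))
```

On the shortened sample this reports Stage-4 and Stage-5 as one duplicate group. The comparison is purely textual, so it only flags plans that print identically; it says nothing about why the planner emitted them twice.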

      Attachments

        1. HIVE-8908.2-spark.patch
          5 kB
          Chao Sun
        2. HIVE-8908.1-spark.patch
          5 kB
          Chao Sun

            People

              Assignee: Chao Sun (csun)
              Reporter: Chao Sun (csun)
              Votes: 0
              Watchers: 2
