HIVE-16507

Hive Explain User-Level may print out "Vertex dependency in root stage" twice

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0, 2.2.0, 2.3.0
    • Fix Version/s: 3.0.0
    • Component/s: None
    • Labels: None

    Description

      User-level explain plans have a section titled "Vertex dependency in root stage" which, as the name suggests, prints the dependencies between all vertices in the root stage.

      This logic is controlled by DagJsonParser#print, which may print "Vertex dependency in root stage" twice.

      The method first extracts all stages and plans. It then iterates over all the stages and, for every stage that contains edges, prints those edges.
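
The behavior described above can be reproduced with a small model. This is only a sketch, not the actual DagJsonParser code: the class `VertexDependencySketch`, the methods `render` and `countHeaders`, and the map of stage name to edge lines are all hypothetical stand-ins chosen for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the printing loop described in this issue.
class VertexDependencySketch {
    static final String HEADER = "Vertex dependency in root stage";

    // Assumption: each stage is reduced to its name and its list of edge lines.
    static String render(Map<String, List<String>> stages) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, List<String>> stage : stages.entrySet()) {
            List<String> edges = stage.getValue();
            if (!edges.isEmpty()) {
                // The header is emitted for every stage with edges, root or not.
                out.append(HEADER).append('\n');
                for (String edge : edges) {
                    out.append(edge).append('\n');
                }
                out.append('\n');
            }
        }
        return out.toString();
    }

    // Counts how many lines of the rendered text are exactly the header.
    static int countHeaders(String text) {
        int count = 0;
        for (String line : text.split("\n")) {
            if (line.equals(HEADER)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Map<String, List<String>> stages = new LinkedHashMap<>();
        stages.put("Stage-2", Arrays.asList("Reducer 10 <- Map 9 (GROUP)"));
        stages.put("Stage-1", Arrays.asList("Reducer 3 <- Reducer 2 (GROUP)"));
        stages.put("Stage-0", Collections.<String>emptyList());
        String text = render(stages);
        System.out.print(text);
        System.out.println("headers printed: " + countHeaders(text));
    }
}
```

With two stages that carry edges, the sketch emits the header twice, which matches the symptom reported here.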

      To stay consistent with the header "Vertex dependency in root stage", the loop should check whether the stage being processed is the root stage.

      Alternatively, we could print the edges for each stage and change the header from "Vertex dependency in root stage" to "Vertex dependency in [stage-id]".
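
The two options could look roughly like this. Again a sketch under assumptions rather than the patch itself: `VertexDependencyFixSketch`, `renderRootOnly`, `renderPerStage`, and the stage-name-to-edges map are hypothetical, and the root stage is passed in by the caller because this sketch does not model how Hive identifies it.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the two proposed fixes; not the real DagJsonParser API.
class VertexDependencyFixSketch {

    private static void appendEdges(StringBuilder out, List<String> edges) {
        for (String edge : edges) {
            out.append(edge).append('\n');
        }
        out.append('\n');
    }

    // Option 1: keep the header text, but print edges only for the root stage.
    static String renderRootOnly(Map<String, List<String>> stages, String rootStage) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, List<String>> stage : stages.entrySet()) {
            boolean isRoot = stage.getKey().equals(rootStage);
            if (isRoot && !stage.getValue().isEmpty()) {
                out.append("Vertex dependency in root stage\n");
                appendEdges(out, stage.getValue());
            }
        }
        return out.toString();
    }

    // Option 2: print edges for every stage, naming the stage in the header.
    static String renderPerStage(Map<String, List<String>> stages) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, List<String>> stage : stages.entrySet()) {
            if (!stage.getValue().isEmpty()) {
                out.append("Vertex dependency in ").append(stage.getKey()).append('\n');
                appendEdges(out, stage.getValue());
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, List<String>> stages = new LinkedHashMap<>();
        stages.put("Stage-2", Arrays.asList("Reducer 10 <- Map 9 (GROUP)"));
        stages.put("Stage-1", Arrays.asList("Reducer 3 <- Reducer 2 (GROUP)"));
        stages.put("Stage-0", Collections.<String>emptyList());
        // "Stage-1" is passed as the root purely for illustration.
        System.out.print(renderRootOnly(stages, "Stage-1"));
        System.out.print(renderPerStage(stages));
    }
}
```

Option 1 hides the edges of non-root stages, while option 2 keeps all edges visible and makes each header unambiguous, at the cost of changing existing explain output.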

      I'm not sure if it's possible for Hive-on-Tez to create a plan with a non-root stage that contains edges, but it is possible for Hive-on-Spark (support for HoS was added in HIVE-11133).

      Example for HoS:

      set hive.optimize.ppd=true;
      set hive.ppd.remove.duplicatefilters=true;
      set hive.spark.dynamic.partition.pruning=true;
      set hive.optimize.metadataonly=false;
      set hive.optimize.index.filter=true;
      set hive.strict.checks.cartesian.product=false;
      set hive.spark.explain.user=true;
      set hive.spark.dynamic.partition.pruning=true;
      
      EXPLAIN select count(*) from srcpart where srcpart.ds in (select max(srcpart.ds) from srcpart union all select min(srcpart.ds) from srcpart);
      

      This prints:

      Plan optimized by CBO.
      
      Vertex dependency in root stage
      Reducer 10 <- Map 9 (GROUP)
      Reducer 11 <- Reducer 10 (GROUP), Reducer 13 (GROUP)
      Reducer 13 <- Map 12 (GROUP)
      
      Vertex dependency in root stage
      Reducer 2 <- Map 1 (PARTITION-LEVEL SORT), Reducer 6 (PARTITION-LEVEL SORT)
      Reducer 3 <- Reducer 2 (GROUP)
      Reducer 5 <- Map 4 (GROUP)
      Reducer 6 <- Reducer 5 (GROUP), Reducer 8 (GROUP)
      Reducer 8 <- Map 7 (GROUP)
      
      Stage-0
        Fetch Operator
          limit:-1
          Stage-1
            Reducer 3
            File Output Operator [FS_34]
              Group By Operator [GBY_32] (rows=1 width=8)
                Output:["_col0"],aggregations:["count(VALUE._col0)"]
              <-Reducer 2 [GROUP]
                GROUP [RS_31]
                  Group By Operator [GBY_30] (rows=1 width=8)
                    Output:["_col0"],aggregations:["count()"]
                    Join Operator [JOIN_28] (rows=2200 width=10)
                      condition map:[{"":"{\"type\":\"Inner\",\"left\":0,\"right\":1}"}],keys:{"0":"_col0","1":"_col0"}
                    <-Map 1 [PARTITION-LEVEL SORT]
                      PARTITION-LEVEL SORT [RS_26]
                        PartitionCols:_col0
                        Select Operator [SEL_2] (rows=2000 width=10)
                          Output:["_col0"]
                          TableScan [TS_0] (rows=2000 width=10)
                            default@srcpart,srcpart,Tbl:COMPLETE,Col:NONE
                    <-Reducer 6 [PARTITION-LEVEL SORT]
                      PARTITION-LEVEL SORT [RS_27]
                        PartitionCols:_col0
                        Group By Operator [GBY_24] (rows=1 width=184)
                          Output:["_col0"],keys:KEY._col0
                        <-Reducer 5 [GROUP]
                          GROUP [RS_23]
                            PartitionCols:_col0
                            Group By Operator [GBY_22] (rows=2 width=184)
                              Output:["_col0"],keys:_col0
                              Filter Operator [FIL_9] (rows=1 width=184)
                                predicate:_col0 is not null
                                Group By Operator [GBY_7] (rows=1 width=184)
                                  Output:["_col0"],aggregations:["max(VALUE._col0)"]
                                <-Map 4 [GROUP]
                                  GROUP [RS_6]
                                    Group By Operator [GBY_5] (rows=1 width=184)
                                      Output:["_col0"],aggregations:["max(ds)"]
                                      Select Operator [SEL_4] (rows=2000 width=10)
                                        Output:["ds"]
                                        TableScan [TS_3] (rows=2000 width=10)
                                          default@srcpart,srcpart,Tbl:COMPLETE,Col:NONE
                        <-Reducer 8 [GROUP]
                          GROUP [RS_23]
                            PartitionCols:_col0
                            Group By Operator [GBY_22] (rows=2 width=184)
                              Output:["_col0"],keys:_col0
                              Filter Operator [FIL_17] (rows=1 width=184)
                                predicate:_col0 is not null
                                Group By Operator [GBY_15] (rows=1 width=184)
                                  Output:["_col0"],aggregations:["min(VALUE._col0)"]
                                <-Map 7 [GROUP]
                                  GROUP [RS_14]
                                    Group By Operator [GBY_13] (rows=1 width=184)
                                      Output:["_col0"],aggregations:["min(ds)"]
                                      Select Operator [SEL_12] (rows=2000 width=10)
                                        Output:["ds"]
                                        TableScan [TS_11] (rows=2000 width=10)
                                          default@srcpart,srcpart,Tbl:COMPLETE,Col:NONE
              Stage-2
                Reducer 11
      

      So the output contains two sections titled "Vertex dependency in root stage".

      Attachments

        1. HIVE-16507.1.patch
          4 kB
          Sahil Takiar
        2. HIVE-16507.2.patch
          6 kB
          Sahil Takiar

            People

              Assignee: Sahil Takiar (stakiar)
              Reporter: Sahil Takiar (stakiar)