Pig / PIG-113

Make Grunt's explain output more understandable

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.1.0
    • Fix Version/s: 0.1.0
    • Component/s: grunt
    • Labels:
      None
    • Patch Info:
      Patch Available

      Description

      I think it would be better if we could display the execution plan in a more understandable way. One intuitive way to do this is to show the output as a tree, as SQL Server does.

      Possibly we could add 'AS <format>' as an optional argument to the explain command.

      For example:

      Grunt> explain bag1 AS tree ;
      Grunt> explain bag1 AS xml ;
      

      and

      Grunt> explain bag1   
      

      will display the default format.
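
To make the proposed syntax concrete, here is a minimal sketch of how 'explain <alias> [AS <format>]' could be parsed. It is written in Python purely for illustration and is not taken from the patch; the names parse_explain and SUPPORTED_FORMATS are hypothetical, and the format list is an assumption based only on the two formats mentioned above.

```python
import re

# Hypothetical format names: the issue only mentions "tree" and "xml";
# "default" stands in for the existing output when no AS clause is given.
SUPPORTED_FORMATS = {"default", "tree", "xml"}

# Matches: explain <alias> [AS <format>] [;]   (keywords are case-insensitive)
_EXPLAIN_RE = re.compile(
    r"^\s*explain\s+(?P<alias>\w+)(?:\s+as\s+(?P<format>\w+))?\s*;?\s*$",
    re.IGNORECASE,
)


def parse_explain(command):
    """Return (alias, format) for an explain command, or raise ValueError."""
    match = _EXPLAIN_RE.match(command)
    if match is None:
        raise ValueError(f"not an explain command: {command!r}")
    # Fall back to the default format when the AS clause is omitted.
    fmt = (match.group("format") or "default").lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unknown explain format: {fmt!r}")
    return match.group("alias"), fmt
```

With this sketch, parse_explain("explain bag1 AS tree ;") yields the alias bag1 with format tree, while a bare "explain bag1" falls back to the default format.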

      I have included a patch that generates tree output.

      Here is a sample of the existing output format:

      Logical Plan:
      Group root-Sun Feb 17 19:37:07 GMT+10:00 2008-5
      Object id: 9814147
      Inputs: 26335425 
      Schema: (group, (sum, (), (), ()))
      EvalSpecs:
              Generate: has 2 children
                      Project: (0)
                      Star
      Split root-Sun Feb 17 19:37:07 GMT+10:00 2008-2
      Object id: 25199001
      Inputs: 29132923 
      Schema: (sum, (), (), ())
      EvalSpecs:
      Eval root-Sun Feb 17 19:37:07 GMT+10:00 2008-1
      Object id: 29132923
      Inputs: 10774273 
      Schema: (sum, (), (), ())
      EvalSpecs:
              Generate: has 4 children
                      FuncEval: name: org.apache.pig.impl.builtin.ADD args:
                              Generate: has 2 children
                                      Project: (0)
                                      Project: (1)
                      Project: (0)
                      Project: (1)
                      Project: (2)
      Load root-Sun Feb 17 19:37:07 GMT+10:00 2008-0
      Object id: 10774273
      Inputs: 
      Schema: ()
      EvalSpecs:
      -----------------------------------------------
      Physical Plan:
      MAPREDUCE
      Object id: 17671659
      Inputs: 682933706
      Map: 
              Star
      Grouping Funcs: 
              Generate: has 2 children
                      Project: (0)
                      Star
      Input Files: /tmp/temp678140026/tmp1867058340
      MAPREDUCE
      Object id: 17308974
      Inputs: 
      Map: 
              Composite: has 2 children
                      Star
                      Generate: has 4 children
                              FuncEval: name: org.apache.pig.impl.builtin.ADD args:
                                      Generate: has 2 children
                                              Project: (0)
                                              Project: (1)
                              Project: (0)
                              Project: (1)
                              Project: (2)
      Input Files: /tmp/data1.txt
      Output File: /tmp/temp678140026/tmp1613817084
      

      Here is a sample of my tree output, which is more compact and easier to read:

      grunt> explain c1 as tree ;
      Logical Plan:
      |---LOCogroup ( GENERATE {[PROJECT $0],[*]} ) 
            |---LOSplitOutput (  ) 
                  |---LOSplit ( ([PROJECT $0] < ['5']),([PROJECT $0] >= ['5']) ) 
                        |---LOEval ( GENERATE {[org.apache.pig.impl.builtin.ADD(GENERATE {[PROJECT $0],[PROJECT $1]})],[PROJECT $0],[PROJECT $1],[PROJECT $2]} ) 
                              |---LOLoad ( file = /tmp/data1.txt )
      -----------------------------------------------
      Physical Plan:
      |---POMapreduce
          Map : *
          Grouping : Generate(Project(0),*)
          Input File(s) : /tmp/temp678140026/tmp1867058340
            |---POMapreduce
                Map : Composite(*,Generate(FuncEval(org.apache.pig.impl.builtin.ADD(Generate(Project(0),Project(1)))),Project(0),Project(1),Project(2)))
                Input File(s) : /tmp/data1.txt
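
The tree layout above can be produced by a simple recursive walk over the plan. The following is a rough sketch in Python, not the code from the attached patch; PlanNode and render_tree are hypothetical names, and the six-space indent per level is an assumption read off the logical plan sample.

```python
from dataclasses import dataclass, field


@dataclass
class PlanNode:
    """Hypothetical plan operator: a display label plus its input operators."""
    label: str
    inputs: list = field(default_factory=list)


def render_tree(node, depth=0, indent=6):
    """Return one line per operator, each input indented under its consumer."""
    lines = [" " * (indent * depth) + "|---" + node.label]
    for child in node.inputs:
        lines.extend(render_tree(child, depth + 1, indent))
    return lines


# Operator names and the load path are taken from the sample output above;
# the operator arguments are omitted to keep the example short.
plan = PlanNode("LOCogroup", [
    PlanNode("LOSplitOutput", [
        PlanNode("LOSplit", [
            PlanNode("LOEval", [
                PlanNode("LOLoad ( file = /tmp/data1.txt )"),
            ]),
        ]),
    ]),
])

print("\n".join(render_tree(plan)))
```

Returning a list of lines rather than printing directly keeps the renderer easy to test and lets the caller decide where the output goes.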
      

      I'm also thinking about adding XML output, as it might benefit people who are working on displaying execution plans in a GUI.
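
As a rough idea of what XML output might look like, here is a sketch in Python that serializes a nested plan with the standard library's xml.etree.ElementTree. No XML schema is defined in this issue, so the element name "operator", the attribute "name", and the function plan_to_xml are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def plan_to_xml(plan):
    """Convert a (label, [input plans]) tuple into a nested XML element.

    The <operator name="..."> layout is hypothetical, not a Pig format.
    """
    label, inputs = plan
    element = ET.Element("operator", {"name": label})
    for child in inputs:
        element.append(plan_to_xml(child))
    return element


# A small plan shaped like the tail of the logical plan sample.
plan = ("LOEval", [("LOLoad", [])])
xml_text = ET.tostring(plan_to_xml(plan), encoding="unicode")
print(xml_text)
```

A GUI could parse such a document with any XML library and rebuild the operator hierarchy without having to scrape the text output.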

      1. pig_printtree_1.patch
        22 kB
        Pi Song
      2. pig_printtree_2.patch
        17 kB
        Pi Song


          People

          • Assignee: Pi Song
          • Reporter: Pi Song