[PIG-113] Make Grunt's explain output more understandable - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Minor
Resolution: Fixed
Affects Version/s: 0.1.0
Fix Version/s: 0.1.0
Component/s: grunt
Labels:
None

Patch Info:

Patch Available

Description

I think it would be better to display the execution plan in a more understandable way. One intuitive option is to show the output as a tree, as SQL Server does.

Possibly the explain command could take an optional 'AS <format>' argument.

For example:

grunt> explain bag1 AS tree ;
grunt> explain bag1 AS xml ;

and

grunt> explain bag1

would display the default format.
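As an illustration of the proposed syntax only, here is a minimal Python sketch of how such a command line could be split into an alias and an optional format. It is not the Grunt parser and not taken from the patch; `parse_explain`, `KNOWN_FORMATS`, and the name `default` for the fallback format are assumptions made for this example.

```python
import re

# Hypothetical format names: only "tree" and "xml" are mentioned in this issue;
# "default" is an assumed label for the existing output.
KNOWN_FORMATS = {"default", "tree", "xml"}

# explain <alias> [AS <format>] [;]
_EXPLAIN_RE = re.compile(
    r"^\s*explain\s+(?P<alias>\w+)(?:\s+as\s+(?P<format>\w+))?\s*;?\s*$",
    re.IGNORECASE,
)


def parse_explain(command):
    """Return (alias, format) for an explain command, or raise ValueError."""
    match = _EXPLAIN_RE.match(command)
    if match is None:
        raise ValueError(f"not an explain command: {command!r}")
    # Fall back to the default format when no AS clause is given.
    fmt = (match.group("format") or "default").lower()
    if fmt not in KNOWN_FORMATS:
        raise ValueError(f"unknown explain format: {fmt!r}")
    return match.group("alias"), fmt


print(parse_explain("explain bag1 AS tree ;"))
print(parse_explain("explain bag1"))
```

Making the AS clause optional keeps existing `explain <alias>` invocations working unchanged.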

I have attached a patch that generates the tree output.

Here is a sample of the existing output format:

Logical Plan:
Group root-Sun Feb 17 19:37:07 GMT+10:00 2008-5
Object id: 9814147
Inputs: 26335425 
Schema: (group, (sum, (), (), ()))
EvalSpecs:
        Generate: has 2 children
                Project: (0)
                Star
Split root-Sun Feb 17 19:37:07 GMT+10:00 2008-2
Object id: 25199001
Inputs: 29132923 
Schema: (sum, (), (), ())
EvalSpecs:
Eval root-Sun Feb 17 19:37:07 GMT+10:00 2008-1
Object id: 29132923
Inputs: 10774273 
Schema: (sum, (), (), ())
EvalSpecs:
        Generate: has 4 children
                FuncEval: name: org.apache.pig.impl.builtin.ADD args:
                        Generate: has 2 children
                                Project: (0)
                                Project: (1)
                Project: (0)
                Project: (1)
                Project: (2)
Load root-Sun Feb 17 19:37:07 GMT+10:00 2008-0
Object id: 10774273
Inputs: 
Schema: ()
EvalSpecs:
-----------------------------------------------
Physical Plan:
MAPREDUCE
Object id: 17671659
Inputs: 682933706
Map: 
        Star
Grouping Funcs: 
        Generate: has 2 children
                Project: (0)
                Star
Input Files: /tmp/temp678140026/tmp1867058340
MAPREDUCE
Object id: 17308974
Inputs: 
Map: 
        Composite: has 2 children
                Star
                Generate: has 4 children
                        FuncEval: name: org.apache.pig.impl.builtin.ADD args:
                                Generate: has 2 children
                                        Project: (0)
                                        Project: (1)
                        Project: (0)
                        Project: (1)
                        Project: (2)
Input Files: /tmp/data1.txt
Output File: /tmp/temp678140026/tmp1613817084

Here is a sample of my tree output, which is more compact and easier to read:

grunt> explain c1 as tree ;
Logical Plan:
|---LOCogroup ( GENERATE {[PROJECT $0],[*]} ) 
      |---LOSplitOutput (  ) 
            |---LOSplit ( ([PROJECT $0] < ['5']),([PROJECT $0] >= ['5']) ) 
                  |---LOEval ( GENERATE {[org.apache.pig.impl.builtin.ADD(GENERATE {[PROJECT $0],[PROJECT $1]})],[PROJECT $0],[PROJECT $1],[PROJECT $2]} ) 
                        |---LOLoad ( file = /tmp/data1.txt )
-----------------------------------------------
Physical Plan:
|---POMapreduce
    Map : *
    Grouping : Generate(Project(0),*)
    Input File(s) : /tmp/temp678140026/tmp1867058340
      |---POMapreduce
          Map : Composite(*,Generate(FuncEval(org.apache.pig.impl.builtin.ADD(Generate(Project(0),Project(1)))),Project(0),Project(1),Project(2)))
          Input File(s) : /tmp/data1.txt
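
A layout like the logical plan above can be produced by a simple recursive walk over the operators. The following is a minimal Python sketch of that idea, not the code in the attached patch: `PlanNode` and `render_tree` are hypothetical names, the six-space indent per level is read off the sample, and the two-node plan is a shortened version of the one shown.

```python
from dataclasses import dataclass, field


@dataclass
class PlanNode:
    """Hypothetical stand-in for a plan operator; not a Pig class."""
    name: str
    detail: str = ""
    inputs: list = field(default_factory=list)


def render_tree(node, depth=0, indent=6):
    """Return the lines of an indented tree, one operator per line."""
    label = f"{node.name} ( {node.detail} )" if node.detail else node.name
    lines = [" " * (indent * depth) + "|---" + label]
    # Each input is drawn one level deeper than the operator that consumes it.
    for child in node.inputs:
        lines.extend(render_tree(child, depth + 1, indent))
    return lines


# Shortened plan: the operators between the cogroup and the load are omitted.
plan = PlanNode(
    "LOCogroup",
    "GENERATE {[PROJECT $0],[*]}",
    [PlanNode("LOLoad", "file = /tmp/data1.txt")],
)
print("\n".join(render_tree(plan)))
```

Keeping each operator on a single line is what makes this form more compact than the existing multi-line dump.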

I'm also thinking about adding XML output, as it might benefit people who are working on displaying execution plans in a GUI.
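
To make the XML idea concrete, here is a minimal Python sketch that serializes a nested plan with the standard `xml.etree.ElementTree` module. The element name `operator`, the attributes `name` and `detail`, and the tuple layout are all assumptions for illustration; no XML schema is defined in this issue.

```python
import xml.etree.ElementTree as ET


def plan_to_xml(plan):
    """Convert a nested (name, detail, inputs) tuple into an Element."""
    name, detail, inputs = plan
    # "operator", "name" and "detail" are hypothetical names, not a Pig schema.
    element = ET.Element("operator", {"name": name})
    if detail:
        element.set("detail", detail)
    for child in inputs:
        element.append(plan_to_xml(child))
    return element


# Two chained map-reduce operators, loosely modelled on the physical plan sample.
plan = (
    "POMapreduce",
    "",
    [("POMapreduce", "Input File(s) : /tmp/data1.txt", [])],
)
xml_text = ET.tostring(plan_to_xml(plan), encoding="unicode")
print(xml_text)
```

A GUI could then load the plan with any XML parser instead of scraping the text output.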

Attachments

pig_printtree_1.patch, 19/Feb/08 12:34, 22 kB, Pi Song
pig_printtree_2.patch, 01/Mar/08 00:50, 17 kB, Pi Song

People

Assignee: Pi Song
Reporter: Pi Song
Votes: 0
Watchers: 0

Dates

Created: 19/Feb/08 12:33
Updated: 24/Mar/10 22:01
Resolved: 01/Mar/08 04:14