Pig / PIG-2583

Add Grunt command to list the statements in cache

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.10.0
    • Fix Version/s: 0.11
    • Component/s: None
    • Labels:
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Add new grunt command:
      history [-n]: Display the list of statements in cache. -n hides line numbers.

      Description

      It is convenient to list statements in cache:

      grunt> a = load '1.txt';
      grunt> b = foreach a generate $0, $1;
      grunt> list
      a = load '1.txt';
      b = foreach a generate $0, $1;

      1. gruntHistory4.patch
        0.7 kB
        Allan Avendaño
      2. gruntHistory3.patch
        5 kB
        Allan Avendaño
      3. gruntHistory2.patch
        4 kB
        Allan Avendaño
      4. gruntHistory1.patch
        5 kB
        Allan Avendaño
      5. gruntHistory.patch
        4 kB
        Allan Avendaño

        Activity

        Daniel Dai created issue -
        Bill Graham added a comment -

        Great idea. What about history instead of list? It's similar to bash and groovy shell. I couldn't find an irb equivalent.

        Daniel Dai added a comment -

        Sounds good.

        Allan Avendaño added a comment -

        I added an ArrayList that works as a cache of the statements executed so far.

        Allan Avendaño made changes -
        Field Original Value New Value
        Status Open [ 1 ] Patch Available [ 10002 ]
        Affects Version/s 0.10.0 [ 12316246 ]
        Allan Avendaño made changes -
        Attachment gruntHistory.patch [ 12525054 ]
        Daniel Dai added a comment -

        Thanks, I will take a look.

        Daniel Dai made changes -
        Assignee Allan Avendaño [ xalan ]
        Daniel Dai added a comment -

        It is right to put an entry in the command cache only if it is a Pig statement, not a command. However, if an alias gets reused, we should show only the last definition, not all of them. I think we should make use of Graph.scriptCache to do it.
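
To illustrate the first point, here is a minimal Python sketch (not taken from any of the attached patches; Pig itself is written in Java) of a cache that records Pig statements but skips grunt commands. The `is_statement` and `record` helpers are hypothetical, and treating only alias assignments as statements is a simplifying assumption.

```python
import re

# Simplifying assumption: a "Pig statement" is an alias assignment such as
# `a = load '1.txt';`. Real Pig Latin has other statement forms as well.
ASSIGNMENT = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*\S")


def is_statement(line):
    """Return True when the line looks like `alias = ...`."""
    return ASSIGNMENT.match(line) is not None


def record(cache, line):
    """Append the line to the cache only when it looks like a statement."""
    if is_statement(line):
        cache.append(line.strip())
    return cache


cache = []
for entered in [
    "a = load '1.txt';",
    "history",
    "b = foreach a generate $0, $1;",
    "describe b;",
]:
    record(cache, entered)

for statement in cache:
    print(statement)
```

Only the two assignments end up in `cache`; `history` and `describe b;` are skipped because they do not match the assignment pattern.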

        Allan Avendaño added a comment -

        I was working on this, but I found something peculiar when we show only the last definition of an alias, for example with this sequence:

        A = load 'data/abs' using PigStorage('|') as (name:chararray, id1:int, id2:int, id3:int);
        D = group A by id1;
        A = load 'data/abs' using PigStorage('|') as (name:chararray, x:int, y:int, z:int);

        With the modifications in the patch that I attached, the following history is shown (following the order in which the aliases were last modified):

        D = group A by id1;
        A = load 'data/abs' using PigStorage('|') as (name:chararray, x:int, y:int, z:int);

        But D works on the previous schema of A, not on the last definition, so we may be losing that part of the history.
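
The reordering described above can be reproduced with a small Python sketch. It assumes a hypothetical `last_definition_history` helper that keeps one entry per alias and moves a redefined alias to the end of the list; the attached patch may track this differently.

```python
def last_definition_history(statements):
    """Keep only the most recent definition of each alias.

    A redefined alias is moved to the end, so the result is ordered by the
    last modification of each alias.
    """
    latest = {}
    for statement in statements:
        alias = statement.split("=", 1)[0].strip()
        latest.pop(alias, None)  # drop the older definition, if any
        latest[alias] = statement  # re-insert so the alias moves to the end
    return list(latest.values())


statements = [
    "A = load 'data/abs' using PigStorage('|') as (name:chararray, id1:int, id2:int, id3:int);",
    "D = group A by id1;",
    "A = load 'data/abs' using PigStorage('|') as (name:chararray, x:int, y:int, z:int);",
]

result = last_definition_history(statements)
for statement in result:
    print(statement)
```

The first definition of `A` disappears from the output even though `D` was built from it, which is the loss of context the comment points out.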

        Allan Avendaño made changes -
        Attachment gruntHistory1.txt [ 12526053 ]
        Allan Avendaño made changes -
        Attachment gruntHistory1.txt [ 12526053 ]
        Allan Avendaño made changes -
        Attachment gruntHistory1.patch [ 12526054 ]
        Gianmarco De Francisci Morales added a comment -

        Probably it is fine anyway.
        The order in which they are defined (and printed) will tell you that D got defined before A.

        One use case I see for this history command is to build a Pig script interactively step by step.
        The output of history can then be copy-pasted to a file to save the script.
        In this case you would need to repeat the D = ... statement anyway to get the correct order.
        To do this for more than one statement you could simply use the output of history, and copy-paste it in grunt up to (excluding) the last A = ... statement.

        Prashant Kommireddi added a comment -

        Should we make the behavior similar to the standard "history" command in a UNIX shell?

        localhost:~ pkommireddi$ pwd
        /Users/pkommireddi
        localhost:~ pkommireddi$ date
        Tue May  8 15:26:26 PDT 2012
        localhost:~ pkommireddi$ whoami
        pkommireddi
        localhost:~ pkommireddi$ pwd
        /Users/pkommireddi
        
        localhost:~ pkommireddi$ history
          541  pwd
          542  date
          543  whoami
          544  pwd
          545  history
        

        It might be useful for users while developing/debugging to be able to look at all declarations of an alias.

        Daniel Dai added a comment -

        I take back the "last definition" part. If people keep reusing the same alias, we shall keep the complete history:
        A = load ...
        A = foreach A ...
        A = filter A by ...

        It is possible to detect completely dead statements, but I don't want to complicate the implementation.

        However, we shall reuse scriptCache instead of introducing a new data structure.
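
For readers wondering what detecting a dead statement would involve, here is a rough Python sketch. It is not something the patch does. It assumes a definition is "dead" when its alias is redefined before any later statement mentions it. The `dead_statement_indexes` helper is hypothetical, and the word-based matching is deliberately naive (a quoted file name that happens to equal an alias would count as a reference).

```python
import re


def dead_statement_indexes(statements):
    """Return indexes of definitions that are overwritten before being read."""
    dead = []
    pending = {}  # alias -> index of its latest definition not yet referenced
    for index, statement in enumerate(statements):
        alias, _, body = statement.partition("=")
        alias = alias.strip()
        # Any alias named on the right-hand side counts as having been read.
        for name in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", body):
            pending.pop(name, None)
        # If the alias is still pending here, nothing read its old definition.
        if alias in pending:
            dead.append(pending[alias])
        pending[alias] = index
    return dead


overwritten = [
    "A = load 'one.txt';",
    "A = load 'two.txt';",
    "B = foreach A generate $0;",
]
chained = [
    "A = load 'data';",
    "A = foreach A generate $0;",
    "A = filter A by $0 is not null;",
]

print(dead_statement_indexes(overwritten))
print(dead_statement_indexes(chained))
```

In the chained case every redefinition of `A` reads the previous one, so nothing is reported as dead, which matches the reason given above for keeping the complete history.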

        Allan Avendaño added a comment -

        History of operators used so far. It uses Graph.scriptCache.

        Allan Avendaño made changes -
        Attachment gruntHistory2.txt [ 12526163 ]
        Allan Avendaño made changes -
        Attachment gruntHistory2.txt [ 12526163 ]
        Allan Avendaño added a comment -

        History of operators used so far. It uses Graph.scriptCache.

        Allan Avendaño made changes -
        Attachment gruntHistory2.patch [ 12526164 ]
        Daniel Dai added a comment -

        Looks good. One more request, however: we should print line numbers, with an option to turn them off (does history -n sound good?).

        Allan Avendaño added a comment -

        Something like this?

        grunt> history -n
        (1) A = load 'data/abs' using PigStorage('|') as (name:chararray, id1:int, id2:int, id3:int);
        (2) B = group A by id1;
        (3) C = order A by id2;

        grunt> history
        A = load 'data/abs' using PigStorage('|') as (name:chararray, id1:int, id2:int, id3:int);
        B = rank A by id1;
        C = order A by id2;

        Daniel Dai added a comment -

        Fine for me.

        Prashant Kommireddi added a comment -

        Not a huge deal, but again, replicating Unix 'history' behavior (no parentheses) would be ideal in my opinion.

        pkommireddi@pkommireddi-wsl:~$ pwd
        /home/pkommireddi
        pkommireddi@pkommireddi-wsl:~$ date
        Wed May  9 17:01:37 PDT 2012
        pkommireddi@pkommireddi-wsl:~$ pwd
        /home/pkommireddi
        pkommireddi@pkommireddi-wsl:~$ whoami
        pkommireddi
        pkommireddi@pkommireddi-wsl:~$ history
        
         2989  pwd
         2990  date
         2991  pwd
         2992  whoami
        
        Daniel Dai added a comment -

        Yes, that's better, and let's make showing line numbers the default option.

        Allan Avendaño made changes -
        Attachment gruntHistory3.patch [ 12526312 ]
        Allan Avendaño added a comment -

        The latest patch shows an enumerated history by default; the numbering can be omitted with "-n".
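
The agreed behavior (numbers by default, `-n` to hide them, Unix-style numbering without parentheses) can be sketched in Python as follows. The `format_history` and `history` functions are hypothetical, and the exact spacing is an assumption, since the thread does not show the output of the committed patch.

```python
def format_history(statements, show_numbers=True):
    """Render cached statements, optionally prefixed with 1-based numbers."""
    if not show_numbers:
        return list(statements)
    # Right-align the numbers so every statement starts in the same column.
    width = len(str(len(statements)))
    return [
        f"{number:>{width}}  {statement}"
        for number, statement in enumerate(statements, start=1)
    ]


def history(statements, args=()):
    """Mimic `history [-n]`: passing "-n" hides the line numbers."""
    return format_history(statements, show_numbers="-n" not in args)


cached = [
    "a = load '1.txt';",
    "b = foreach a generate $0, $1;",
]

print("\n".join(history(cached)))
print("\n".join(history(cached, ["-n"])))
```

The unnumbered form is the one that can be pasted straight back into grunt or saved as a script.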

        Daniel Dai added a comment -

        +1. Patch committed to trunk. Thanks Allan!

        Daniel Dai made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Hadoop Flags Reviewed [ 10343 ]
        Release Note Add new grunt command:
        history [-n]: Display the list of statements in cache. -n hides line numbers.
        Resolution Fixed [ 1 ]
        Prashant Kommireddi added a comment -

        Thanks Allan for the contribution!

        Allan Avendaño added a comment -

        I would like to make some changes to the format used to show the list of statements. For example, I made it display the statements with indentation, like the history command does on Unix.

        Allan Avendaño made changes -
        Resolution Fixed [ 1 ]
        Status Resolved [ 5 ] Reopened [ 4 ]
        Allan Avendaño added a comment -

        Changes to the display format of the list of statements used so far.

        Allan Avendaño made changes -
        Attachment gruntHistory4.patch [ 12547962 ]
        Julien Le Dem added a comment -

        Please open a new ticket for this and resolve this one

        Julien Le Dem added a comment -

        Allan Avendaño I'm closing this ticket as it has been committed.
        Please open a new ticket to further improve your contribution.
        Thanks again

        Julien Le Dem made changes -
        Status Reopened [ 4 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Bill Graham made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Allan Avendaño
            Reporter:
            Daniel Dai
          • Votes:
            0
            Watchers:
            4
