Pig
  1. Pig
  2. PIG-3307

Refactor physical operators to remove methods parameters that are always null

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.12.0
    • Component/s: None
    • Labels:
      None
    • Patch Info:
      Patch Available
    • Hadoop Flags:
      Reviewed

      Description

      The physical operators are sometimes overly complex. I'm trying to cleanup some unnecessary code.
      in particular there is an array of getNext(T v) where the value v does not seem to have any importance and is just used to pick the correct method.
      I have started a refactoring for a more readable getNext*T*().

      1. PIG-3307_0.patch
        169 kB
        Julien Le Dem
      2. PIG-3307_1.patch
        224 kB
        Julien Le Dem
      3. PIG-3307_2.patch
        430 kB
        Julien Le Dem
      4. PIG-3307_3.patch
        430 kB
        Julien Le Dem

        Activity

        Hide
        Julien Le Dem added a comment -

        PIG-3307_0.patch contains the initial refactoring

        Show
        Julien Le Dem added a comment - PIG-3307 _0.patch contains the initial refactoring
        Hide
        Julien Le Dem added a comment -

        PIG-3307_1.patch introduces some more refactoring

        Show
        Julien Le Dem added a comment - PIG-3307 _1.patch introduces some more refactoring
        Hide
        Julien Le Dem added a comment -

        It looks like we can get rid of the parameter that is only used for method dispatch.
        I will replace all calls to getNext(Tuple t) to getNextTuple() in PhysicalOperator.

        Show
        Julien Le Dem added a comment - It looks like we can get rid of the parameter that is only used for method dispatch. I will replace all calls to getNext(Tuple t) to getNextTuple() in PhysicalOperator.
        Hide
        Daniel Dai added a comment -

        Any performance implication for this change?

        Show
        Daniel Dai added a comment - Any performance implication for this change?
        Hide
        Julien Le Dem added a comment -

        PIG-3307_2.patch removes the unused parameter in getNext(*)

        Show
        Julien Le Dem added a comment - PIG-3307 _2.patch removes the unused parameter in getNext(*)
        Hide
        Julien Le Dem added a comment -

        Daniel Dai This is removing parameters that were not used. I have not tested performance but I think it could only improve performance.
        (see latest patch PIG-3307_2.patch)

        Show
        Julien Le Dem added a comment - Daniel Dai This is removing parameters that were not used. I have not tested performance but I think it could only improve performance. (see latest patch PIG-3307 _2.patch)
        Hide
        Daniel Dai added a comment -

        That's what I am expecting. Would love to see some performance data with this approach.

        Show
        Daniel Dai added a comment - That's what I am expecting. Would love to see some performance data with this approach.
        Hide
        Julien Le Dem added a comment -

        Daniel Daiwhat's the recommended approach? If you have a setup to do perf test that would be helpful.

        Show
        Julien Le Dem added a comment - Daniel Dai what's the recommended approach? If you have a setup to do perf test that would be helpful.
        Hide
        Julien Le Dem added a comment -

        Daniel Dai Also most likely it wont make any difference performance wise.

        Show
        Julien Le Dem added a comment - Daniel Dai Also most likely it wont make any difference performance wise.
        Hide
        Julien Le Dem added a comment -

        It would be nice if someone could take a look before too many changes get in.

        Show
        Julien Le Dem added a comment - It would be nice if someone could take a look before too many changes get in.
        Hide
        Cheolsoo Park added a comment -

        Do you mind uploading the patch to the RB?

        I will also run the unit tests with your patch.

        Show
        Cheolsoo Park added a comment - Do you mind uploading the patch to the RB? I will also run the unit tests with your patch.
        Show
        Julien Le Dem added a comment - https://reviews.apache.org/r/11203/diff/#index_header thanks Cheolsoo Park and Daniel Dai !
        Hide
        Cheolsoo Park added a comment -

        I am giving my +1 because this makes code much cleaner.

        I haven't done any performance benchmarks because I agree with Julien's assessment. But please advise if anyone has concerns.

        Show
        Cheolsoo Park added a comment - I am giving my +1 because this makes code much cleaner. I haven't done any performance benchmarks because I agree with Julien's assessment. But please advise if anyone has concerns.
        Hide
        Julien Le Dem added a comment -

        PIG-3307_3.patch addresses Cheolsoo Park's comments

        Show
        Julien Le Dem added a comment - PIG-3307 _3.patch addresses Cheolsoo Park 's comments
        Hide
        Julien Le Dem added a comment -

        committed to TRUNK
        Committed revision 1484037

        Show
        Julien Le Dem added a comment - committed to TRUNK Committed revision 1484037

          People

          • Assignee:
            Julien Le Dem
            Reporter:
            Julien Le Dem
          • Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development