Pig
  1. Pig
  2. PIG-2663

Expose helpful ScriptState methods

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11
    • Component/s: None
    • Labels:
      None

      Description

      The ScriptState.getAlias(MapReduceOper mro) and ScriptState.getFeature(MapReduceOper mro) methods are useful in implementations of PigProgressNotificationListener. I'd like to make them public.

      1. PIG-2663.1.patch
        0.8 kB
        Bill Graham

        Activity

        Hide
        Bill Graham added a comment -

        Initial patch for review.

        Show
        Bill Graham added a comment - Initial patch for review.
        Hide
        Alan Gates added a comment -

        +1

        Show
        Alan Gates added a comment - +1
        Hide
        Julien Le Dem added a comment -

        What about moving getAlias() and getFeature() to instance methods MapReduceOper ?
        those methods do not really belong to ScriptState.
        related: https://issues.apache.org/jira/browse/PIG-2659

        Show
        Julien Le Dem added a comment - What about moving getAlias() and getFeature() to instance methods MapReduceOper ? those methods do not really belong to ScriptState. related: https://issues.apache.org/jira/browse/PIG-2659
        Hide
        Bill Graham added a comment -

        That seems reasonable. So we'd also then move AliasVisitor and FeatureVisitor to MapReduceOper too.

        What about the local cache of MapReduceOper to Aliases and Features that ScriptState holds on to? That would need to stay in ScriptState since that class has the entire plan in scope. Calling those methods would be more efficient then repeatedly calling MapReduceOper.getAlias() since you don't need to re-walk each time. Maybe that's ok.

        Ideally MapReduceOper could lazy load a reference to it's alias and feature when its methods are first called, but that class isn't immutable so I don't think that would fly.

        Show
        Bill Graham added a comment - That seems reasonable. So we'd also then move AliasVisitor and FeatureVisitor to MapReduceOper too. What about the local cache of MapReduceOper to Aliases and Features that ScriptState holds on to? That would need to stay in ScriptState since that class has the entire plan in scope. Calling those methods would be more efficient then repeatedly calling MapReduceOper.getAlias() since you don't need to re-walk each time. Maybe that's ok. Ideally MapReduceOper could lazy load a reference to it's alias and feature when its methods are first called, but that class isn't immutable so I don't think that would fly.
        Hide
        Bill Graham added a comment -

        @Julien I looked some more into moving those methods to MapReduceOper and getAlias is pretty straightforward but getPigFeature poses more issues.

        Currently MapReduceOper has a private OPER_FEATURE enum that it sets internally and uses to answer a dozen or so isFoo methods (i.e. isSampler isGroupBy). ScriptState.getPigFeature calls those MapReduceOper.isFoo methods to then map to a package-private PIG_FEATURE enum.

        If we moved getPigFeature into MapReduceOper we'd probably want to move PIG_FEATURE with it and we'd need to make it public. We'd then have a private OPER_FEATURE and a public PIG_FEATURE in the same class, along with a getPigFeature method and a bunch of isFoo methods. I worry we'd make the MapReduceOper API and it's implementation more confusing, without a significant refactoring. (Or a counter-argument could be made that it's simplifying because we won't have feature logic in two class in two packages. Once class gets more confusing, but logic gets centralized.)

        Thoughts?

        Show
        Bill Graham added a comment - @Julien I looked some more into moving those methods to MapReduceOper and getAlias is pretty straightforward but getPigFeature poses more issues. Currently MapReduceOper has a private OPER_FEATURE enum that it sets internally and uses to answer a dozen or so isFoo methods (i.e. isSampler isGroupBy). ScriptState.getPigFeature calls those MapReduceOper.isFoo methods to then map to a package-private PIG_FEATURE enum. If we moved getPigFeature into MapReduceOper we'd probably want to move PIG_FEATURE with it and we'd need to make it public. We'd then have a private OPER_FEATURE and a public PIG_FEATURE in the same class, along with a getPigFeature method and a bunch of isFoo methods. I worry we'd make the MapReduceOper API and it's implementation more confusing, without a significant refactoring. (Or a counter-argument could be made that it's simplifying because we won't have feature logic in two class in two packages. Once class gets more confusing, but logic gets centralized.) Thoughts?
        Hide
        Dmitriy V. Ryaboy added a comment -

        Sounds like a bigger refactor than we bargained for here.
        I think initial patch is ok. We can do a separate refactoring pass separately.

        Show
        Dmitriy V. Ryaboy added a comment - Sounds like a bigger refactor than we bargained for here. I think initial patch is ok. We can do a separate refactoring pass separately.
        Hide
        Bill Graham added a comment -

        Committed

        Show
        Bill Graham added a comment - Committed

          People

          • Assignee:
            Bill Graham
            Reporter:
            Bill Graham
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development