Solr
  1. Solr
  2. SOLR-367

RFC: use contrete return types in analysis factory declarations

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3
    • Component/s: None
    • Labels:
      None

      Description

      will attach patch for comments: the main idea is to change most of the Tokenizer and TokenFilterFactories to be explicit about what their "create" method returns instead of just using "TokenStream"

        Issue Links

          Activity

          Hide
          Hoss Man added a comment -

          Committed revision 591158.

          Show
          Hoss Man added a comment - Committed revision 591158.
          Hide
          Yonik Seeley added a comment -

          Seems fine to me.

          Show
          Yonik Seeley added a comment - Seems fine to me.
          Hide
          Hoss Man added a comment -

          my main motivation for this is a little project i'm working on in some spare time to try and build a reflection based tool that will help us autogenerate Factories whenever we upgrade the Lucene jar. step one is identifying which TokenFilter and Tokenizer factories already exist (even though the names might be inconsistent). for that specific use case i could also introduce some new annotations, but it got me thinking about this and i think it's a good idea in general. It makes the API of each factory self documenting about what that factory produces, and makes hte javadocs a lot easier to follow.

          no functionality is changed, and it is 100% backwards compatible.

          i'm seeking comments before commiting however because it does violate the typical philosophy of an API: declare that you return an interface/abstract class, not a concrete implementation. I'm not sure that is as important in the case of these factories. it also (in theory) could cause problems down the road if we start optimizing the factories. ie: if we decide that for certain input options, SynonymFilterFactory isn't going to return a SynonymFilter and instead will return some new AliasFilter then in theory we have painted ourselves into a corner unless this AliasFilter is a subclass of SynonymFilter because people might have written code that directly calls the create method on the factory and expects a specific return type. likewise we might run into similar problems if we promote some of our own filters up into lucene (and the packages change)

          these seem like minor concerns however, given the intended usecase of these factories, and the "self documentation" benefits of the API.

          ...but that's just my opinion, which is why i wanted to put it out there and see what people think.

          Show
          Hoss Man added a comment - my main motivation for this is a little project i'm working on in some spare time to try and build a reflection based tool that will help us autogenerate Factories whenever we upgrade the Lucene jar. step one is identifying which TokenFilter and Tokenizer factories already exist (even though the names might be inconsistent). for that specific use case i could also introduce some new annotations, but it got me thinking about this and i think it's a good idea in general. It makes the API of each factory self documenting about what that factory produces, and makes hte javadocs a lot easier to follow. no functionality is changed, and it is 100% backwards compatible. i'm seeking comments before commiting however because it does violate the typical philosophy of an API: declare that you return an interface/abstract class, not a concrete implementation. I'm not sure that is as important in the case of these factories. it also (in theory) could cause problems down the road if we start optimizing the factories. ie: if we decide that for certain input options, SynonymFilterFactory isn't going to return a SynonymFilter and instead will return some new AliasFilter then in theory we have painted ourselves into a corner unless this AliasFilter is a subclass of SynonymFilter because people might have written code that directly calls the create method on the factory and expects a specific return type. likewise we might run into similar problems if we promote some of our own filters up into lucene (and the packages change) these seem like minor concerns however, given the intended usecase of these factories, and the "self documentation" benefits of the API. ...but that's just my opinion, which is why i wanted to put it out there and see what people think.

            People

            • Assignee:
              Hoss Man
              Reporter:
              Hoss Man
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development