Lucene - Core
  1. Lucene - Core
  2. LUCENE-4896

PostingsHighlighter should use a interface of PassageFormatter instead of a class

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 4.2
    • Fix Version/s: 4.3, Trunk
    • Component/s: modules/highlighter
    • Labels:
    • Environment:

      NA

    • Lucene Fields:
      New

      Description

      In my project I need a custom PassageFormatter to use with PostingsHighlighter. I extended PassageFormatter to override format(...)

      but if I do that, I don't have access to the private variables. So instead of changing the scope to protected, it should be more usefull to use a interface for PassageFormatter.

      like public DefaultPassageFormatter implements PassageFormatter.

      1. LUCENE-4896.patch
        16 kB
        Luca Cavanna
      2. LUCENE-4896.patch
        15 kB
        Luca Cavanna
      3. LUCENE-4896.patch
        15 kB
        Luca Cavanna

        Activity

        Hide
        Uwe Schindler added a comment -

        Closed after release.

        Show
        Uwe Schindler added a comment - Closed after release.
        Hide
        Robert Muir added a comment -

        Thanks again Luca. I committed this with only one change to your patch (i added some javadocs to append, since its now protected).

        I'm backporting now and then i'll resolve the issue.

        Show
        Robert Muir added a comment - Thanks again Luca. I committed this with only one change to your patch (i added some javadocs to append, since its now protected). I'm backporting now and then i'll resolve the issue.
        Hide
        Luca Cavanna added a comment -

        I see what you mean Robert, thanks a lot for your explanation. I would have probably ended up with an interface + abstract class then

        Let's see what I can come up with for LUCENE-4906...

        Show
        Luca Cavanna added a comment - I see what you mean Robert, thanks a lot for your explanation. I would have probably ended up with an interface + abstract class then Let's see what I can come up with for LUCENE-4906 ...
        Hide
        Robert Muir added a comment -

        Patch looks good... I'm going to apply it and review one last time. Thanks Luca!

        Just to try to explain my reasoning again for the abstract class: in general to me if its the class's "primary purpose" then its the correct way from an inheritance perspective. Whereas interfaces (e.g. Closeable) are not. This way we get the possibility of incorporating default behavior and things like this.

        One reason why this could be useful, when we look at issues like LUCENE-4906, we might want to make formatter more powerful to do this: maybe it has a lower-level method that e.g. gets docids and all kinds of other things, or even all the passages for all the top-docs. The default implementation could still call format() for each doc, but that would allow someone to e.g. build a complex response structure for the whole top-docs at once or something like that.

        I'm not arguing that would necessarily be a good API at all, its just a theoretical example that wouldnt work well with interfaces

        Show
        Robert Muir added a comment - Patch looks good... I'm going to apply it and review one last time. Thanks Luca! Just to try to explain my reasoning again for the abstract class: in general to me if its the class's "primary purpose" then its the correct way from an inheritance perspective. Whereas interfaces (e.g. Closeable) are not. This way we get the possibility of incorporating default behavior and things like this. One reason why this could be useful, when we look at issues like LUCENE-4906 , we might want to make formatter more powerful to do this: maybe it has a lower-level method that e.g. gets docids and all kinds of other things, or even all the passages for all the top-docs. The default implementation could still call format() for each doc, but that would allow someone to e.g. build a complex response structure for the whole top-docs at once or something like that. I'm not arguing that would necessarily be a good API at all, its just a theoretical example that wouldnt work well with interfaces
        Hide
        Luca Cavanna added a comment -

        Here it is.
        The lucene.experimental was already there, I left it only in the abstract class.

        Fixed format javadocs too.

        Show
        Luca Cavanna added a comment - Here it is. The lucene.experimental was already there, I left it only in the abstract class. Fixed format javadocs too.
        Hide
        Luca Cavanna added a comment -

        Just read your last comment, going to add another patch

        Show
        Luca Cavanna added a comment - Just read your last comment, going to add another patch
        Hide
        Luca Cavanna added a comment -

        No problem. It just felt weird to write an abstract class that looked to me like an interface.

        Should be better now. I also made append protected.

        Show
        Luca Cavanna added a comment - No problem. It just felt weird to write an abstract class that looked to me like an interface. Should be better now. I also made append protected.
        Hide
        Robert Muir added a comment -

        One other thing: in the DefaultPassageFormatter we should add @Override annotation to format(), and remove the javadocs block completely: it will be inherited from PassageFormatter.

        I'd also add lucene.experimental annotation to the abstract PassageFormatter class javadocs.

        Show
        Robert Muir added a comment - One other thing: in the DefaultPassageFormatter we should add @Override annotation to format(), and remove the javadocs block completely: it will be inherited from PassageFormatter. I'd also add lucene.experimental annotation to the abstract PassageFormatter class javadocs.
        Hide
        Robert Muir added a comment -

        As i mentioned earlier i think PassageFormatter should be an abstract class (not an interface) with just that one method currently:

        public String format(Passage passages[], String content);
        

        But it might need a more complex API later, e.g. to support LUCENE-4906. So i think an abstract class is the right way to go (in addition to the reasons i mentioned in my first comment)

        opening up the DefaultPassageFormatter looks good in the patch. i think making append protected is fine. In general though its just useful if someone wants to tweak the default implementation. if their impl is radically different they should just be extending PassageFormatter (the abstract one).

        Show
        Robert Muir added a comment - As i mentioned earlier i think PassageFormatter should be an abstract class (not an interface) with just that one method currently: public String format(Passage passages[], String content); But it might need a more complex API later, e.g. to support LUCENE-4906 . So i think an abstract class is the right way to go (in addition to the reasons i mentioned in my first comment) opening up the DefaultPassageFormatter looks good in the patch. i think making append protected is fine. In general though its just useful if someone wants to tweak the default implementation. if their impl is radically different they should just be extending PassageFormatter (the abstract one).
        Hide
        Luca Cavanna added a comment -

        Patch against trunk.

        In this first iteration I made PassageFormatter an interface and created a DefaultPassageFormatter, made also protected its fields.

        Maybe we should make protected the append method too. I can imagine people might want to modify the snippets before adding them to the string builder. The current impl can only escape html, but maybe some people would like to remove html tags at all. Maybe even better to make a proper protected method for that?

        I thought about using an abstract class but wasn't really sure what to make abstract and what to put in it. What do you think Robert?

        Show
        Luca Cavanna added a comment - Patch against trunk. In this first iteration I made PassageFormatter an interface and created a DefaultPassageFormatter, made also protected its fields. Maybe we should make protected the append method too. I can imagine people might want to modify the snippets before adding them to the string builder. The current impl can only escape html, but maybe some people would like to remove html tags at all. Maybe even better to make a proper protected method for that? I thought about using an abstract class but wasn't really sure what to make abstract and what to put in it. What do you think Robert?
        Hide
        Robert Muir added a comment -

        Sebastien Dionne Do you want to create a patch for this issue?

        Show
        Robert Muir added a comment - Sebastien Dionne Do you want to create a patch for this issue?
        Hide
        Robert Muir added a comment -

        I agree there are a few bugs here:

        1. we should split PassageFormatter (abstract) from its default implementation.
        2. the default implementation should expose its params as protected, so its still extensible.

        However I don't think an interface is best for this one: formatting is the key thing this class will do (as opposed to e.g. closeable).
        So I think it should be an abstract class, even if today its api is only one method, i expect this API might unfortunately grow larger

        Show
        Robert Muir added a comment - I agree there are a few bugs here: we should split PassageFormatter (abstract) from its default implementation. the default implementation should expose its params as protected, so its still extensible. However I don't think an interface is best for this one: formatting is the key thing this class will do (as opposed to e.g. closeable). So I think it should be an abstract class, even if today its api is only one method, i expect this API might unfortunately grow larger

          People

          • Assignee:
            Unassigned
            Reporter:
            Sebastien Dionne
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development