Lucene - Core
  1. Lucene - Core
  2. LUCENE-4440

FilterCodec should take a delegate Codec in its ctor

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 4.0-BETA
    • Fix Version/s: 4.0, 4.1, 6.0
    • Component/s: core/codecs
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      FilterCodec has a delegate() method through which an extension can return its delegate Codec. This method is called on every Codec method. Adrien, on LUCENE-4391, failed to pass a Codec in the ctor, since he couldn't called Codec.forName().

      Instead, we should just pass e.g. new Lucene40Codec(). I'll post a patch shortly.

      1. LUCENE-4440.patch
        11 kB
        Uwe Schindler
      2. LUCENE-4440.patch
        7 kB
        Shai Erera

        Issue Links

          Activity

          Hide
          Shai Erera added a comment -

          A very trivial patch. Removes delegate() and takes a Codec in FilterCodec ctor. The changes to the 3 classes that extend it are simple.

          Show
          Shai Erera added a comment - A very trivial patch. Removes delegate() and takes a Codec in FilterCodec ctor. The changes to the 3 classes that extend it are simple.
          Hide
          Shai Erera added a comment -

          if possible, I'd like to commit it to 4.0 too, just so that whoever extends FilterCodec in 4.0 doesn't need to change his code when he upgrades to 4.1. And the change is really trivial.

          Show
          Shai Erera added a comment - if possible, I'd like to commit it to 4.0 too, just so that whoever extends FilterCodec in 4.0 doesn't need to change his code when he upgrades to 4.1. And the change is really trivial.
          Hide
          Robert Muir added a comment -

          This is also similar to what i suggested on LUCENE-4391, see Adrien's comment.

          If we do this, we should probably add a NOTE/WARNING somewhere that tells you
          you should not call Codec.forName in your no-arg constructor.

          e.g. dont call super("mycodec", Codec.forName(xxxx))

          Show
          Robert Muir added a comment - This is also similar to what i suggested on LUCENE-4391 , see Adrien's comment. If we do this, we should probably add a NOTE/WARNING somewhere that tells you you should not call Codec.forName in your no-arg constructor. e.g. dont call super("mycodec", Codec.forName(xxxx))
          Hide
          Adrien Grand added a comment -

          If we do this, we should probably add a NOTE/WARNING somewhere that tells you you should not call Codec.forName in your no-arg constructor.

          +1

          Show
          Adrien Grand added a comment - If we do this, we should probably add a NOTE/WARNING somewhere that tells you you should not call Codec.forName in your no-arg constructor. +1
          Hide
          Adrien Grand added a comment -

          Maybe we should also test loader against null in Codec.forName to throw a better exception than a NPE whenever initialization issues happen (or even better fix this issue, but I am not sure it is possible...).

          Show
          Adrien Grand added a comment - Maybe we should also test loader against null in Codec.forName to throw a better exception than a NPE whenever initialization issues happen (or even better fix this issue, but I am not sure it is possible...).
          Hide
          Shai Erera added a comment -

          I can add to Codec.forName() something like <b>NOTE:</b> do not call this method from a Codec's ctor !. What do you think?

          Show
          Shai Erera added a comment - I can add to Codec.forName() something like <b>NOTE:</b> do not call this method from a Codec's ctor ! . What do you think?
          Hide
          Uwe Schindler added a comment -

          This is the same chicken-and-egg problem the ICU4J project had with one Japanese Locale. The problem here is, that Codec.forName() cannot be used before SPI loading is finished. The problem here is that SPI loads all codec names and creates instances from them. But at the time of creating those instances, the lookup does not yet work.

          I agree we should add a null check in SPILoader's lookup method (not Codec.forName) to throw IllegalStateException("Cannot requests codec until service provider interface has finished loading them").

          Show
          Uwe Schindler added a comment - This is the same chicken-and-egg problem the ICU4J project had with one Japanese Locale. The problem here is, that Codec.forName() cannot be used before SPI loading is finished. The problem here is that SPI loads all codec names and creates instances from them. But at the time of creating those instances, the lookup does not yet work. I agree we should add a null check in SPILoader's lookup method (not Codec.forName) to throw IllegalStateException("Cannot requests codec until service provider interface has finished loading them").
          Hide
          Robert Muir added a comment -

          From the documentation perspective, the problem is not really FIlterCodec's problem. Its more general really.

          Show
          Robert Muir added a comment - From the documentation perspective, the problem is not really FIlterCodec's problem. Its more general really.
          Hide
          Shai Erera added a comment -

          And separately, i.e. separate issue, not for 4.0, I think we should make Lucene40Codec completely final. With FilterCodec, whoever wants to extend Lucene40Codec should just extend FilterCodec, like AppendingCodec, and use PerFieldPF. But this needs a separate discussion. I'll open a separate issue.

          Show
          Shai Erera added a comment - And separately, i.e. separate issue, not for 4.0, I think we should make Lucene40Codec completely final. With FilterCodec, whoever wants to extend Lucene40Codec should just extend FilterCodec, like AppendingCodec, and use PerFieldPF. But this needs a separate discussion. I'll open a separate issue.
          Hide
          Robert Muir added a comment -

          I would prefer something like this patch over the abstract method we currently have, especially if we have some way to throw an exception in the
          SPI loader if you do a chicken-egg.

          Can we make a separate patch to throw an exception if you do this? I definitely would support this null check for 4.0, because currently
          you will get a confusing exception.

          Show
          Robert Muir added a comment - I would prefer something like this patch over the abstract method we currently have, especially if we have some way to throw an exception in the SPI loader if you do a chicken-egg. Can we make a separate patch to throw an exception if you do this? I definitely would support this null check for 4.0, because currently you will get a confusing exception.
          Hide
          Shai Erera added a comment -

          I agree we should add a null check in SPILoader's lookup method (not Codec.forName) to throw IllegalStateException("Cannot requests codec until service provider interface has finished loading them").

          This won't help. The NPE is thrown from Codec.forName because loader is null ! It confused me, since I thought static members should be initialized before any method is accessed, but perhaps this doesn't hold for static methods?

          From the documentation perspective, the problem is not really FIlterCodec's problem. Its more general really.

          Yes, that's why I suggested to add this comment to Codec.forName(). Oh, unless you mean that we should add that comment not as part of this issue?

          Show
          Shai Erera added a comment - I agree we should add a null check in SPILoader's lookup method (not Codec.forName) to throw IllegalStateException("Cannot requests codec until service provider interface has finished loading them"). This won't help. The NPE is thrown from Codec.forName because loader is null ! It confused me, since I thought static members should be initialized before any method is accessed, but perhaps this doesn't hold for static methods? From the documentation perspective, the problem is not really FIlterCodec's problem. Its more general really. Yes, that's why I suggested to add this comment to Codec.forName(). Oh, unless you mean that we should add that comment not as part of this issue?
          Hide
          Uwe Schindler added a comment -

          Maybe we should also test loader against null in Codec.forName to throw a better exception than a NPE whenever initialization issues happen (or even better fix this issue, but I am not sure it is possible...).

          loader should be always != null. The problem is that NamedSPILoader.lookup() may NPE/fail. I can at least add a test that verifies what happens when you do call Codec.forName() inside a Codec ctor.

          Show
          Uwe Schindler added a comment - Maybe we should also test loader against null in Codec.forName to throw a better exception than a NPE whenever initialization issues happen (or even better fix this issue, but I am not sure it is possible...). loader should be always != null. The problem is that NamedSPILoader.lookup() may NPE/fail. I can at least add a test that verifies what happens when you do call Codec.forName() inside a Codec ctor.
          Hide
          Uwe Schindler added a comment -

          Ah you are right, it might be == null. Beacuse at the time of init (until NamedSPILoader's ctor has finished), the assignment is not yet done. So we must null-check in Codec class itsself to throw Exception.

          Show
          Uwe Schindler added a comment - Ah you are right, it might be == null. Beacuse at the time of init (until NamedSPILoader's ctor has finished), the assignment is not yet done. So we must null-check in Codec class itsself to throw Exception.
          Hide
          Uwe Schindler added a comment -

          Inside NamedSPILoader loader all is fine, because the List is initialized to be empty initially and later replaced by a unmodifiable reloaded one.

          Show
          Uwe Schindler added a comment - Inside NamedSPILoader loader all is fine, because the List is initialized to be empty initially and later replaced by a unmodifiable reloaded one.
          Hide
          Shai Erera added a comment -

          Isn't there a way to ensure that NamedSPILoader would finish before forName() is accessed? If not, we need to add such check to every forName we have. Here's a proposal message:

              if (loader == null) {
                throw new IllegalStateException("you are probably calling this method from inside a Codec's ctor); you shouldn't!");
              }
          
          Show
          Shai Erera added a comment - Isn't there a way to ensure that NamedSPILoader would finish before forName() is accessed? If not, we need to add such check to every forName we have. Here's a proposal message: if (loader == null ) { throw new IllegalStateException( "you are probably calling this method from inside a Codec's ctor); you shouldn't!" ); }
          Hide
          Uwe Schindler added a comment -

          Please add this also to listCodecs maybe

          Show
          Uwe Schindler added a comment - Please add this also to listCodecs maybe
          Hide
          Uwe Schindler added a comment -

          Isn't there a way to ensure that NamedSPILoader would finish before forName() is accessed?

          Unfortunately not, this is a chicken-and-egg problem: If you access Codec.class in any way (e.g. already at the time when you subclass it!), it is classloaded. After classloaded, it is initialized, so its static <clinit> method is called. This method runs new NamedSPILoader(Codec.class) [the first line in the class source code], and this one loads all codecs which are listed in META-INF/services. As e.g. AppendingCodec is listed there, it is loaded by NamedSPILoader and its ctor is called. As we are still in the stacktrace inside NamedSPILoader.<init>, this call did not yet return to caller (Codec.<clinit>), so the assignment of the static final loader field is not yet finished. Unfortunately, AppendingCodec.<init> calls Codec.forName() and hits the NPE, because the field is not yet initialized. You have to say: You cannot call any method of Codec at that time!

          Show
          Uwe Schindler added a comment - Isn't there a way to ensure that NamedSPILoader would finish before forName() is accessed? Unfortunately not, this is a chicken-and-egg problem: If you access Codec.class in any way (e.g. already at the time when you subclass it!), it is classloaded. After classloaded, it is initialized, so its static <clinit> method is called. This method runs new NamedSPILoader(Codec.class) [the first line in the class source code] , and this one loads all codecs which are listed in META-INF/services. As e.g. AppendingCodec is listed there, it is loaded by NamedSPILoader and its ctor is called. As we are still in the stacktrace inside NamedSPILoader.<init>, this call did not yet return to caller (Codec.<clinit>), so the assignment of the static final loader field is not yet finished. Unfortunately, AppendingCodec.<init> calls Codec.forName() and hits the NPE, because the field is not yet initialized. You have to say: You cannot call any method of Codec at that time!
          Hide
          Uwe Schindler added a comment -

          Patch adding the Exceptions to Codec and PostingsFormat.

          The Analysis SPI factory loading mechanism does not have the chicken-egg problem:

          • It returns new instances and has no forName() method.
          • Static class ctors from Subclasses are not executed while SPI loading, ass SPIClassIterator does not initialize classes while loading. The first time a factory is touched is when a new instance is created and that happens not while SPI is initializing

          Please check the messages, I want to commit this ASAP, because Robert wants to spin a new RC.

          Show
          Uwe Schindler added a comment - Patch adding the Exceptions to Codec and PostingsFormat. The Analysis SPI factory loading mechanism does not have the chicken-egg problem: It returns new instances and has no forName() method. Static class ctors from Subclasses are not executed while SPI loading, ass SPIClassIterator does not initialize classes while loading. The first time a factory is touched is when a new instance is created and that happens not while SPI is initializing Please check the messages, I want to commit this ASAP, because Robert wants to spin a new RC.
          Hide
          Robert Muir added a comment -

          thanks for adding these checks: I think this is much more clear than NPE!

          Show
          Robert Muir added a comment - thanks for adding these checks: I think this is much more clear than NPE!
          Hide
          Uwe Schindler added a comment -

          Committed trunk revision: 1391140
          Committed 4.x branch revision: 1391143
          Committed 4.0 branch revision: 1391144

          Show
          Uwe Schindler added a comment - Committed trunk revision: 1391140 Committed 4.x branch revision: 1391143 Committed 4.0 branch revision: 1391144
          Hide
          Shai Erera added a comment -

          Thanks Uwe for adding the messages and committing.

          Show
          Shai Erera added a comment - Thanks Uwe for adding the messages and committing.
          Hide
          Commit Tag Bot added a comment -

          [branch_4x commit] Uwe Schindler
          http://svn.apache.org/viewvc?view=revision&revision=1391143

          Merged revision(s) 1391140 from lucene/dev/trunk:
          LUCENE-4440: Refactor FilterCodec to get the delegate passed into ctor. Fix Codec's and PostingsFormat's forName lookup methods to throw IllegalStateException if you call those methods from the constructor of e.g. FilterCodec.

          Show
          Commit Tag Bot added a comment - [branch_4x commit] Uwe Schindler http://svn.apache.org/viewvc?view=revision&revision=1391143 Merged revision(s) 1391140 from lucene/dev/trunk: LUCENE-4440 : Refactor FilterCodec to get the delegate passed into ctor. Fix Codec's and PostingsFormat's forName lookup methods to throw IllegalStateException if you call those methods from the constructor of e.g. FilterCodec.

            People

            • Assignee:
              Uwe Schindler
              Reporter:
              Shai Erera
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development