Lucene - Core
  1. Lucene - Core
  2. LUCENE-3539

IndexFormatTooOld/NewExc should try to include fileName + directory when possible

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.5, 4.0-ALPHA
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      (Spinoff from http://markmail.org/thread/t6s7nn3ve765nojc )

      When we throw a too old/new exc we should try to include the full path to the offending file, if possible.

      1. LUCENE-3539.patch
        45 kB
        Michael McCandless
      2. LUCENE-3539.patch
        45 kB
        Michael McCandless
      3. LUCENE-3539.patch
        28 kB
        Michael McCandless

        Activity

        Hide
        Michael McCandless added a comment -

        Patch. I now pass DataInput down to IndexFormatTooNew/OldExc, and
        .toString() it, and impl'd .toString in the all the IndexInput impls I
        could find.

        Show
        Michael McCandless added a comment - Patch. I now pass DataInput down to IndexFormatTooNew/OldExc, and .toString() it, and impl'd .toString in the all the IndexInput impls I could find.
        Hide
        Robert Muir added a comment -

        I dont think i like all the duplication/extra tracking in every indexinput impl...

        In my opinion its not worth it!

        can this be done in a cleaner way?

        Show
        Robert Muir added a comment - I dont think i like all the duplication/extra tracking in every indexinput impl... In my opinion its not worth it! can this be done in a cleaner way?
        Hide
        Uwe Schindler added a comment -

        I agree with Robert, adding this to IndexInput is stupid.

        I think, only adding the file name as done before should be fine. There are only few places where we pass null as file name to the exception (whoch may be fixed). Passing the whole directory name is in my opinion useless. Its up to the implementation using lucene to keep track of its directory name (when it opens a IndexReader it already knows its dir name).

        I would close this as won't fix and maybe only fix the remaining places that misses the file name (e.g. SegmentTermsEnumReader).

        Show
        Uwe Schindler added a comment - I agree with Robert, adding this to IndexInput is stupid. I think, only adding the file name as done before should be fine. There are only few places where we pass null as file name to the exception (whoch may be fixed). Passing the whole directory name is in my opinion useless. Its up to the implementation using lucene to keep track of its directory name (when it opens a IndexReader it already knows its dir name). I would close this as won't fix and maybe only fix the remaining places that misses the file name (e.g. SegmentTermsEnumReader).
        Hide
        Robert Muir added a comment -

        you might be able to work me down to a partial path rather than a full path...

        like if IndexInput takes String name in its ctor (the same one passed to Directory.openInput NOT the full path, keeps String as a private variable), and implements toString itself.

        then we wouldnt have to track additional variables in each indexinput impl, only change openinput and the ctors to pass this information.

        But i'm still not sure how useful this is.

        It really seems like an implementation detail that we check the stored fields to determine if an indexformat is too old. who cares what the file name is?

        Show
        Robert Muir added a comment - you might be able to work me down to a partial path rather than a full path... like if IndexInput takes String name in its ctor (the same one passed to Directory.openInput NOT the full path, keeps String as a private variable), and implements toString itself. then we wouldnt have to track additional variables in each indexinput impl, only change openinput and the ctors to pass this information. But i'm still not sure how useful this is. It really seems like an implementation detail that we check the stored fields to determine if an indexformat is too old. who cares what the file name is?
        Hide
        Robert Muir added a comment -

        Some more thoughts:

        • maybe IndexInput should have a ctor like IndexInput(String resourceDescription)
        • then, in all error messages (e.g. read past EOF) we could include this, for better error descriptions
        • when directories call openInput they could pass whatever they want (e.g. filename), and CFS could include the "real" file within cfs
        • we could declare this as just an opaque string describing the indexinput (yeah and II could use it in its toString() as well)
        • if we do this, i think it should be non-null (e.g. in 3.x we add no-arg ctor that forwards to "no additional details" or whatever)

        If we did it consistently like this, i think it could improve error messages and debugging and be a pretty clean change.

        Show
        Robert Muir added a comment - Some more thoughts: maybe IndexInput should have a ctor like IndexInput(String resourceDescription) then, in all error messages (e.g. read past EOF) we could include this, for better error descriptions when directories call openInput they could pass whatever they want (e.g. filename), and CFS could include the "real" file within cfs we could declare this as just an opaque string describing the indexinput (yeah and II could use it in its toString() as well) if we do this, i think it should be non-null (e.g. in 3.x we add no-arg ctor that forwards to "no additional details" or whatever) If we did it consistently like this, i think it could improve error messages and debugging and be a pretty clean change.
        Hide
        Michael McCandless added a comment -

        +1, I like that solution, Robert. I'll rework the patch...

        Show
        Michael McCandless added a comment - +1, I like that solution, Robert. I'll rework the patch...
        Hide
        Michael McCandless added a comment -

        New patch folding in Robert's idea....

        I added final String resourceDescription to II, returned from
        toString, made it required arg to the ctor, and fix all II subclasses
        to pass something reasonable.

        When our II impls originate an exception (eg from EOF), I also include
        II.toString(); if a method they call throws IOE (eg file.read(...)
        inside SimpleFSII), then I catch & rethrow w/ II.toString() included.

        I also include the sub-file name when inside a sliced II (CFS/CFX);
        I added a required arg (sliceDescription) to the sliceInput method
        for this.

        Show
        Michael McCandless added a comment - New patch folding in Robert's idea.... I added final String resourceDescription to II, returned from toString, made it required arg to the ctor, and fix all II subclasses to pass something reasonable. When our II impls originate an exception (eg from EOF), I also include II.toString(); if a method they call throws IOE (eg file.read(...) inside SimpleFSII), then I catch & rethrow w/ II.toString() included. I also include the sub-file name when inside a sliced II (CFS/CFX); I added a required arg (sliceDescription) to the sliceInput method for this.
        Hide
        Robert Muir added a comment -

        Patch looks good, only one minor idea for improvement:

        When we throw "read past EOF" I think we should throw java.io.EOFException (extends IOException) rather than a generic one?

        Show
        Robert Muir added a comment - Patch looks good, only one minor idea for improvement: When we throw "read past EOF" I think we should throw java.io.EOFException (extends IOException) rather than a generic one?
        Hide
        Michael McCandless added a comment -

        OK, new patch; I changed to EOFE, and I just use in.toString() instead of special casing DI vs II.

        I think it's ready!

        Show
        Michael McCandless added a comment - OK, new patch; I changed to EOFE, and I just use in.toString() instead of special casing DI vs II. I think it's ready!
        Hide
        Simon Willnauer added a comment -

        mike can we close LUCENE-3138 too?

        Show
        Simon Willnauer added a comment - mike can we close LUCENE-3138 too?
        Hide
        Michael McCandless added a comment -

        mike can we close LUCENE-3138 too?

        Hmm I don't think so. It's unrelated to my fixes here? Here I just "improved" the error message you see when you get an IOE, CorruptIndexE, IndexFormatTooXE out of Lucene.

        Show
        Michael McCandless added a comment - mike can we close LUCENE-3138 too? Hmm I don't think so. It's unrelated to my fixes here? Here I just "improved" the error message you see when you get an IOE, CorruptIndexE, IndexFormatTooXE out of Lucene.
        Hide
        Uwe Schindler added a comment -

        Bulk close after release of 3.5

        Show
        Uwe Schindler added a comment - Bulk close after release of 3.5

          People

          • Assignee:
            Michael McCandless
            Reporter:
            Michael McCandless
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development