Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.3, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      It would be nice to have the version of lucene that wrote segments_N, so that we can use this to determine which major version an index was written with (for upgrading across major versions). I think this could be squeezed in just after the segments_N header.

      1. LUCENE-5954.patch
        17 kB
        Michael McCandless
      2. LUCENE-5954.patch
        13 kB
        Michael McCandless

        Activity

        Hide
        Michael McCandless added a comment -

        This could also be used to throw IndexFormatTooNewException immediately on trying to open an index, whenever the version we see is > Version.LATEST. Uwe Schindler raised this on LUCENE-5952 but I didn't want to make that change there...

        Show
        Michael McCandless added a comment - This could also be used to throw IndexFormatTooNewException immediately on trying to open an index, whenever the version we see is > Version.LATEST. Uwe Schindler raised this on LUCENE-5952 but I didn't want to make that change there...
        Hide
        Michael McCandless added a comment -

        This seems simple...

        Show
        Michael McCandless added a comment - This seems simple...
        Hide
        Robert Muir added a comment -

        I don't want to do what the subject says
        Or well, i want to creep the scope.

        We need the minimum segment version to clean up the IndexFormatTooOldChecks. The version of the commit itself is somewhat unrelated, but its sorta bogus its not exposed to the user. I think currently our IndexUpgrader will fail to properly upgrade an empty commit for example. There isn't even a simple way to check

        IMO any version(s) should be ordered before the segments, so we don't try to e.g. load up old codecs that are too old/not there. This will clean up the current logic of readCodec()

        Show
        Robert Muir added a comment - I don't want to do what the subject says Or well, i want to creep the scope. We need the minimum segment version to clean up the IndexFormatTooOldChecks. The version of the commit itself is somewhat unrelated, but its sorta bogus its not exposed to the user. I think currently our IndexUpgrader will fail to properly upgrade an empty commit for example. There isn't even a simple way to check IMO any version(s) should be ordered before the segments, so we don't try to e.g. load up old codecs that are too old/not there. This will clean up the current logic of readCodec()
        Hide
        Michael McCandless added a comment -

        Initial patch, likely with bugs!

        I added both version segments_N was written with, and "version of oldest segment".

        Show
        Michael McCandless added a comment - Initial patch, likely with bugs! I added both version segments_N was written with, and "version of oldest segment".
        Hide
        Ryan Ernst added a comment -

        Thanks Mike! This looks great.

        A couple questions:

        • Do we really need to add compareTo? Couldn't we use the existing onOrAfter? It seems weird to have two ways of comparing versions.
        • Is there somewhere we could have a more direct test than deletion policy tests? I took a quick look but couldn't find anything unit testing the segment infos reading/writing...
        Show
        Ryan Ernst added a comment - Thanks Mike! This looks great. A couple questions: Do we really need to add compareTo? Couldn't we use the existing onOrAfter? It seems weird to have two ways of comparing versions. Is there somewhere we could have a more direct test than deletion policy tests? I took a quick look but couldn't find anything unit testing the segment infos reading/writing...
        Hide
        Robert Muir added a comment -
        if (format >= VERSION_53) {
          // TODO: in the future (7.0?  sigh) we can use this to throw IndexFormatTooOldException ... or just rely on the
          // minSegmentLuceneVersion check instead:
          infos.luceneVersion = Version.fromBits(input.readVInt(), input.readVInt(), input.readVInt());
        } else {
          // else leave null
        }
        

        I guess I was hoping we could take it further. We dont technically need to change file formats to implement this, it could be computed from the segments on read in the 4.0-5.2 case? Its just the min() that it finds there. Or does this become too hairy?

        Show
        Robert Muir added a comment - if (format >= VERSION_53) { // TODO: in the future (7.0? sigh) we can use this to throw IndexFormatTooOldException ... or just rely on the // minSegmentLuceneVersion check instead: infos.luceneVersion = Version.fromBits(input.readVInt(), input.readVInt(), input.readVInt()); } else { // else leave null } I guess I was hoping we could take it further. We dont technically need to change file formats to implement this, it could be computed from the segments on read in the 4.0-5.2 case? Its just the min() that it finds there. Or does this become too hairy?
        Hide
        Michael McCandless added a comment -

        It seems weird to have two ways of comparing versions.

        Good point, I'll remove the compareTo and just use the existing onOrAfter.

        Is there somewhere we could have a more direct test than deletion policy tests?

        I'll try to make a more direct test (TestSegmentInfos)...

        We dont technically need to change file formats to implement this, it could be computed from the segments on read in the 4.0-5.2 case?

        That's what I do in the patch ... if SIS was written by >= 5.3, I pull the min version that we wrote into the header (and throw IndexTooOldExc if it's too old), else I re-compute it on load.

        Show
        Michael McCandless added a comment - It seems weird to have two ways of comparing versions. Good point, I'll remove the compareTo and just use the existing onOrAfter. Is there somewhere we could have a more direct test than deletion policy tests? I'll try to make a more direct test (TestSegmentInfos)... We dont technically need to change file formats to implement this, it could be computed from the segments on read in the 4.0-5.2 case? That's what I do in the patch ... if SIS was written by >= 5.3, I pull the min version that we wrote into the header (and throw IndexTooOldExc if it's too old), else I re-compute it on load.
        Hide
        Michael McCandless added a comment -

        New patch folding feedback in ... I think it's ready.

        Show
        Michael McCandless added a comment - New patch folding feedback in ... I think it's ready.
        Hide
        Michael McCandless added a comment -

        Oh, on Rob's feedback: I'll change that confusing // else leave null comment to // else compute the min version down below in the for loop.

        Show
        Michael McCandless added a comment - Oh, on Rob's feedback: I'll change that confusing // else leave null comment to // else compute the min version down below in the for loop .
        Hide
        Ryan Ernst added a comment -

        +1 to the new patch.

        I think the inline comment about // use safe maps on the VERSION_53 constant can be removed? Seems like a copy/paste issue from the previous comment?

        Show
        Ryan Ernst added a comment - +1 to the new patch. I think the inline comment about // use safe maps on the VERSION_53 constant can be removed? Seems like a copy/paste issue from the previous comment?
        Hide
        Michael McCandless added a comment -

        Woops, yes, I'll remove that comment and commit ... thanks Ryan Ernst!

        Show
        Michael McCandless added a comment - Woops, yes, I'll remove that comment and commit ... thanks Ryan Ernst !
        Hide
        ASF subversion and git services added a comment -

        Commit 1684489 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1684489 ]

        LUCENE-5954: write oldest segment version, and segments_N version, in the segments file

        Show
        ASF subversion and git services added a comment - Commit 1684489 from Michael McCandless in branch 'dev/trunk' [ https://svn.apache.org/r1684489 ] LUCENE-5954 : write oldest segment version, and segments_N version, in the segments file
        Hide
        ASF subversion and git services added a comment -

        Commit 1684514 from Michael McCandless in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1684514 ]

        LUCENE-5954: write oldest segment version, and segments_N version, in the segments file

        Show
        ASF subversion and git services added a comment - Commit 1684514 from Michael McCandless in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1684514 ] LUCENE-5954 : write oldest segment version, and segments_N version, in the segments file
        Hide
        Shalin Shekhar Mangar added a comment -

        Bulk close for 5.3.0 release

        Show
        Shalin Shekhar Mangar added a comment - Bulk close for 5.3.0 release

          People

          • Assignee:
            Michael McCandless
            Reporter:
            Ryan Ernst
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development