  1. Lucene - Core
  2. LUCENE-7703

Record the version that was used at index creation time

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.0
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

      Description

      SegmentInfos already records the version that was used to write a commit and the version that was used to write the oldest segment in the index. In addition to those, I think it could be useful to record the Lucene version that was used to create the index. I think it could help with:

      • Debugging: some behavior changes between Lucene versions; for instance, we will reject broken offsets in term vectors as of 7.0. Knowing which version created the index tells us what assumptions we can safely make about it.
      • Backward compatibility: the codec API helped simplify backward compatibility of the index files a lot. However, for everything that is done on top of the codec API, like analysis or the computation of length norm factors, backward compatibility needs to be handled on top of Lucene. Maybe we could simplify this?
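
      To make the proposal concrete, here is a minimal sketch of the three versions sitting side by side in commit-level metadata. It is a toy model, not Lucene code: the class `CommitVersions`, its fields, and its methods are hypothetical names chosen for illustration, and versions are plain strings. The point it shows is that the creation version is set once and carried forward unchanged, while the other two may move with every commit.

```java
/** Toy model of commit-level metadata; class and field names are hypothetical, not Lucene's. */
final class CommitVersions {
    final String createdVersion;    // version that first created the index; never changes
    final String minSegmentVersion; // version that wrote the oldest live segment
    final String commitVersion;     // version that wrote this commit

    private CommitVersions(String created, String minSegment, String commit) {
        this.createdVersion = created;
        this.minSegmentVersion = minSegment;
        this.commitVersion = commit;
    }

    /** A brand-new index: all three versions are the writer's version. */
    static CommitVersions newIndex(String writerVersion) {
        return new CommitVersions(writerVersion, writerVersion, writerVersion);
    }

    /** A later commit, possibly by a newer writer, keeps the creation version as-is. */
    CommitVersions nextCommit(String writerVersion, String oldestSegmentVersion) {
        return new CommitVersions(createdVersion, oldestSegmentVersion, writerVersion);
    }
}
```

      For example, `CommitVersions.newIndex("6.4.0").nextCommit("7.0.0", "6.4.0")` yields metadata whose commit version is 7.0.0 while the creation version is still 6.4.0, which is the information that code layered on top of the codec API could branch on.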
      1. LUCENE-7703.patch
        24 kB
        Adrien Grand

        Activity

        mikemccand Michael McCandless added a comment -

        +1, I think this is important info for the index.

        jpountz Adrien Grand added a comment -

        Thanks Mike for confirming it would be useful. I gave it a try; see the attached patch. It should be fine for regular usage of IndexWriter with calls to add/updateDocument. However, it is not totally clear to me how we should deal with addIndexes. For the variant that takes a list of codec readers, I don't think there is much we can do since the version is not exposed (and it would not make much sense anyway?). For the variant that takes a list of directories, we could either reject the call if versions differ (this is what the patch does), or be lenient, but leniency has the major drawback that any assumptions we might make based on the created version could break. Any opinions?
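
        The strict option for the directory-based addIndexes can be sketched as below. This is an illustration only, not the code from the attached patch: `IndexMeta` and `AddIndexesCheck` are hypothetical stand-ins, and comparing only a major version number is an assumption made to keep the example small.

```java
import java.util.List;

/** Toy stand-in for per-index metadata; not a Lucene class. */
final class IndexMeta {
    final String name;
    final int createdVersionMajor; // assumption: only the major version is compared

    IndexMeta(String name, int createdVersionMajor) {
        this.name = name;
        this.createdVersionMajor = createdVersionMajor;
    }
}

final class AddIndexesCheck {
    /** Strict policy: reject any incoming index whose creation version differs from the target's. */
    static void ensureSameCreatedVersion(IndexMeta target, List<IndexMeta> incoming) {
        for (IndexMeta other : incoming) {
            if (other.createdVersionMajor != target.createdVersionMajor) {
                throw new IllegalArgumentException(
                    "Cannot add index " + other.name
                        + ": created with major version " + other.createdVersionMajor
                        + " but target " + target.name
                        + " was created with major version " + target.createdVersionMajor);
            }
        }
    }
}
```

        Failing before anything is copied keeps the target's creation version trustworthy. The lenient alternative would accept the call, at the cost that the recorded creation version no longer describes every segment in the index.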

        mikemccand Michael McCandless added a comment -

        +1 to the patch.

        jira-bot ASF subversion and git services added a comment -

        Commit d9c0f2599d934766549b2566d7c0dd159c3af5c8 in lucene-solr's branch refs/heads/master from Adrien Grand
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d9c0f25 ]

        LUCENE-7703: Record the index creation version.


          People

          • Assignee: Unassigned
          • Reporter: jpountz Adrien Grand