Lucene - Core
  1. Lucene - Core
  2. LUCENE-6998

We should detect truncation for all index files

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.5, 6.0, 5.4.2
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Robert Muir noticed that Lucene60PointReader does not detect truncation of its data file on reader init ... so I added a test to catch this, confirmed it caught the bug, and fixed the bug, and fixed a couple other things uncovered by the test.

      I also improved the other TestAllFiles* tests to use LineFileDocs so they index points, and it caught another bug in Lucene60PointFormat (using the same codec name in two files).

      There is more to do here, e.g. we also need a test that catches places where we fail to check the index header on init, which was also missing for Lucene60PointReader.

      1. LUCENE-6998.patch
        30 kB
        Michael McCandless
      2. LUCENE-6998.patch
        19 kB
        Michael McCandless
      3. LUCENE-6998.patch
        19 kB
        Michael McCandless

        Activity

        Hide
        Michael McCandless added a comment -

        Patch, I think it's close.

        Show
        Michael McCandless added a comment - Patch, I think it's close.
        Hide
        Michael McCandless added a comment -

        Woops, first patch missed the new test ...

        Show
        Michael McCandless added a comment - Woops, first patch missed the new test ...
        Hide
        Robert Muir added a comment -

        +1, especially for this test. These crazy ctors were "manually stared at" before, this is way better.

        after this issue we can look at converting some of the other tests (e.g. TestIndexWriterExceptions2) to also use Points.

        when pushing can you change the comment for the new test (TestAllFilesDetectTruncation) ?

        Show
        Robert Muir added a comment - +1, especially for this test. These crazy ctors were "manually stared at" before, this is way better. after this issue we can look at converting some of the other tests (e.g. TestIndexWriterExceptions2) to also use Points. when pushing can you change the comment for the new test (TestAllFilesDetectTruncation) ?
        Hide
        Michael McCandless added a comment -

        Another iteration, I think it's ready.

        I added two more test cases, one of which catches the missing checkIndexHeader from Lucene60PointReader, the other one verifies that swapping in the same file name from another index into your index is always detected as corruption.

        Show
        Michael McCandless added a comment - Another iteration, I think it's ready. I added two more test cases, one of which catches the missing checkIndexHeader from Lucene60PointReader , the other one verifies that swapping in the same file name from another index into your index is always detected as corruption.
        Hide
        ASF subversion and git services added a comment -

        Commit af3a19574e6242e1b29f2fb8861bbcc977f23435 in lucene-solr's branch refs/heads/master from Mike McCandless
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=af3a195 ]

        LUCENE-6998: fix a couple places to better detect truncated index files; improve corruption testing

        Show
        ASF subversion and git services added a comment - Commit af3a19574e6242e1b29f2fb8861bbcc977f23435 in lucene-solr's branch refs/heads/master from Mike McCandless [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=af3a195 ] LUCENE-6998 : fix a couple places to better detect truncated index files; improve corruption testing
        Hide
        ASF subversion and git services added a comment -

        Commit 5e5a9cad9aa86555648099c262febe7a516ba48a in lucene-solr's branch refs/heads/branch_5x from Mike McCandless
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5e5a9ca ]

        LUCENE-6998: fix a couple places to better detect truncated index files; improve corruption testing

        Conflicts:
        lucene/core/src/java/org/apache/lucene/codecs/lucene60/Lucene60PointFormat.java
        lucene/core/src/java/org/apache/lucene/codecs/lucene60/Lucene60PointReader.java
        lucene/core/src/java/org/apache/lucene/codecs/lucene60/Lucene60PointWriter.java

        Show
        ASF subversion and git services added a comment - Commit 5e5a9cad9aa86555648099c262febe7a516ba48a in lucene-solr's branch refs/heads/branch_5x from Mike McCandless [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5e5a9ca ] LUCENE-6998 : fix a couple places to better detect truncated index files; improve corruption testing Conflicts: lucene/core/src/java/org/apache/lucene/codecs/lucene60/Lucene60PointFormat.java lucene/core/src/java/org/apache/lucene/codecs/lucene60/Lucene60PointReader.java lucene/core/src/java/org/apache/lucene/codecs/lucene60/Lucene60PointWriter.java
        Hide
        Steve Rowe added a comment -

        Conflicts:
        lucene/core/src/java/org/apache/lucene/codecs/lucene60/Lucene60PointFormat.java
        lucene/core/src/java/org/apache/lucene/codecs/lucene60/Lucene60PointReader.java
        lucene/core/src/java/org/apache/lucene/codecs/lucene60/Lucene60PointWriter.java

        Should we worry about these? (I haven't looked at the details.)

        Show
        Steve Rowe added a comment - Conflicts: lucene/core/src/java/org/apache/lucene/codecs/lucene60/Lucene60PointFormat.java lucene/core/src/java/org/apache/lucene/codecs/lucene60/Lucene60PointReader.java lucene/core/src/java/org/apache/lucene/codecs/lucene60/Lucene60PointWriter.java Should we worry about these? (I haven't looked at the details.)
        Hide
        Michael McCandless added a comment -

        Should we worry about these?

        This is git being wonderfully honest (the same reason why I love the "merge bubbles").

        On backport, git told me there were conflicts, since these files were changed in master but don't exist in 5.x ("points" is a 6.0 only feature). So I removed them, and left this default commit message generated by git to preserve its honesty ....

        So I don't think we need to worry

        Show
        Michael McCandless added a comment - Should we worry about these? This is git being wonderfully honest (the same reason why I love the "merge bubbles"). On backport, git told me there were conflicts, since these files were changed in master but don't exist in 5.x ("points" is a 6.0 only feature). So I removed them, and left this default commit message generated by git to preserve its honesty .... So I don't think we need to worry
        Hide
        Steve Rowe added a comment -

        +1, thanks Mike

        Show
        Steve Rowe added a comment - +1, thanks Mike
        Hide
        Michael McCandless added a comment -

        Reopen for backport to 5.4.2.

        Show
        Michael McCandless added a comment - Reopen for backport to 5.4.2.
        Hide
        ASF subversion and git services added a comment -

        Commit df30bc6c5b4855fcd95c3660fdd2991d0e9c58bf in lucene-solr's branch refs/heads/branch_5_4 from Mike McCandless
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=df30bc6 ]

        LUCENE-6998: fix a couple places to better detect truncated index files; improve corruption testing

        Conflicts:
        lucene/core/src/java/org/apache/lucene/codecs/lucene60/Lucene60PointFormat.java
        lucene/core/src/java/org/apache/lucene/codecs/lucene60/Lucene60PointReader.java
        lucene/core/src/java/org/apache/lucene/codecs/lucene60/Lucene60PointWriter.java

        Conflicts:
        lucene/CHANGES.txt

        Show
        ASF subversion and git services added a comment - Commit df30bc6c5b4855fcd95c3660fdd2991d0e9c58bf in lucene-solr's branch refs/heads/branch_5_4 from Mike McCandless [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=df30bc6 ] LUCENE-6998 : fix a couple places to better detect truncated index files; improve corruption testing Conflicts: lucene/core/src/java/org/apache/lucene/codecs/lucene60/Lucene60PointFormat.java lucene/core/src/java/org/apache/lucene/codecs/lucene60/Lucene60PointReader.java lucene/core/src/java/org/apache/lucene/codecs/lucene60/Lucene60PointWriter.java Conflicts: lucene/CHANGES.txt

          People

          • Assignee:
            Michael McCandless
            Reporter:
            Michael McCandless
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development