Lucene - Core
  1. Lucene - Core
  2. LUCENE-3255

Corrupted segment file not detected and wipes index contents

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 2.9.4, 3.2
    • Fix Version/s: 3.4
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Lucene will happily wipe an existing index if presented with a latest generation segments_n file of all zeros. File format documentation says segments_N files should start with a format of -9 but SegmentInfos.read accepts >=0 as valid for backward compatibility reasons.

      1. AllZerosSegmentFile
        0.0 kB
        Mark Harwood
      2. BadSegmentsFileTest.java
        3 kB
        Mark Harwood
      3. CorruptionCheckerForPreLucene3.java
        0.9 kB
        Mark Harwood
      4. LUCENE-3255_testcase.patch
        3 kB
        Simon Willnauer
      5. LUCENE-3255.patch
        5 kB
        Michael McCandless

        Activity

        Hide
        Greg Tarr added a comment -

        Thanks for this Mark. A speedy resolution would be extremely helpful for confidence in our lucene-based implementation.

        Show
        Greg Tarr added a comment - Thanks for this Mark. A speedy resolution would be extremely helpful for confidence in our lucene-based implementation.
        Hide
        Simon Willnauer added a comment -

        here is a more self contained testcase showing the problem - applies on 3.x and trunk

        Show
        Simon Willnauer added a comment - here is a more self contained testcase showing the problem - applies on 3.x and trunk
        Hide
        Michael McCandless added a comment -

        Nice catch! Indeed, because of back compat, we read a leading 0 as being an ancient format, and then interpret the next 0 to mean index has no segments.

        However, that ancient format predates 1.9, so the fix for 3.x is easy (remove support for this ancient format).

        Not sure what to do if we really need to fix this in pre-3.x releases...

        Show
        Michael McCandless added a comment - Nice catch! Indeed, because of back compat, we read a leading 0 as being an ancient format, and then interpret the next 0 to mean index has no segments. However, that ancient format predates 1.9, so the fix for 3.x is easy (remove support for this ancient format). Not sure what to do if we really need to fix this in pre-3.x releases...
        Hide
        Michael McCandless added a comment -

        4.0 is not affected because we had already removed this back compat code.

        Show
        Michael McCandless added a comment - 4.0 is not affected because we had already removed this back compat code.
        Hide
        Simon Willnauer added a comment -

        Not sure what to do if we really need to fix this in pre-3.x releases...

        there is not much todo really here. I don't see a good way to fix this there. :/

        Show
        Simon Willnauer added a comment - Not sure what to do if we really need to fix this in pre-3.x releases... there is not much todo really here. I don't see a good way to fix this there. :/
        Hide
        Michael McCandless added a comment -

        Patch; I'll commit shortly...

        Show
        Michael McCandless added a comment - Patch; I'll commit shortly...
        Hide
        Mark Harwood added a comment -

        Thanks for the quick turnaround Mike/Simon.

        Greg is on 2.9.x and so the suggestion I have is that he adds some checking code in the app to remove the latest segments_n file if it looks to have anything other than -9 as a format value given he knows that his Lucene indexes should always be of that version. Maybe that could be a utility class that can be posted here on this issue for others who might share this issue. I'm guessing it's a quirk of the file system to leave that all-zeros file in place prior to a flush of some kind?

        Show
        Mark Harwood added a comment - Thanks for the quick turnaround Mike/Simon. Greg is on 2.9.x and so the suggestion I have is that he adds some checking code in the app to remove the latest segments_n file if it looks to have anything other than -9 as a format value given he knows that his Lucene indexes should always be of that version. Maybe that could be a utility class that can be posted here on this issue for others who might share this issue. I'm guessing it's a quirk of the file system to leave that all-zeros file in place prior to a flush of some kind?
        Hide
        Michael McCandless added a comment -

        That sounds like a good fix for 2.9.x.

        +1 for posting a utility here.

        I'm guessing it's a quirk of the file system to leave that all-zeros file in place prior to a flush of some kind?

        Actually I think various filesystems could conceivably do this (write all 0s to a file), eg on OS/hardware crash, if the file was written by not yet sync'd.

        Show
        Michael McCandless added a comment - That sounds like a good fix for 2.9.x. +1 for posting a utility here. I'm guessing it's a quirk of the file system to leave that all-zeros file in place prior to a flush of some kind? Actually I think various filesystems could conceivably do this (write all 0s to a file), eg on OS/hardware crash, if the file was written by not yet sync'd.
        Hide
        Simon Willnauer added a comment -

        I wonder if we could check the latest segments_N file against a previous one and if the version of that file is older that the previous one we drop it since we don't provide forward compatibility. we could even spin a 2.9 bugfix release for that though.

        Show
        Simon Willnauer added a comment - I wonder if we could check the latest segments_N file against a previous one and if the version of that file is older that the previous one we drop it since we don't provide forward compatibility. we could even spin a 2.9 bugfix release for that though.
        Hide
        Michael McCandless added a comment -

        Ahh, good idea Simon! That should work.

        Another simple thing we could do is, throw an exc if we did not consume all bytes from the segments file. In this case, the segments file was 20 bytes long, but the double 0 ints only consumes 8 bytes (two 0 ints).

        Show
        Michael McCandless added a comment - Ahh, good idea Simon! That should work. Another simple thing we could do is, throw an exc if we did not consume all bytes from the segments file. In this case, the segments file was 20 bytes long, but the double 0 ints only consumes 8 bytes (two 0 ints).
        Hide
        Michael McCandless added a comment -

        Actually, we can do something even simpler here: in the 1.9.x days Lucene never wrote a generation (_N) segments file. It always wrote just "segments", so, if we see first int is a 0, and the file has a generation in it, then it's corrupt.

        Show
        Michael McCandless added a comment - Actually, we can do something even simpler here: in the 1.9.x days Lucene never wrote a generation (_N) segments file. It always wrote just "segments", so, if we see first int is a 0, and the file has a generation in it, then it's corrupt.

          People

          • Assignee:
            Michael McCandless
            Reporter:
            Mark Harwood
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development