Lucene - Core / LUCENE-4456

IndexWriter makes unrefed files, and MockDir cannot detect it

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0, 5.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Because MockDir calls crash() before it checks for unreferenced files, deletes are no longer allowed by the time the check runs.

      This means the unreferenced files check is useless!
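
      The ordering problem described above can be illustrated with a toy model. This is only a sketch, not Lucene code: `ToyDirectory`, `foundUnreferenced`, `leakDetected`, and the file names are hypothetical, and it assumes the check works by attempting to delete unreferenced files and then comparing the directory listing before and after.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: none of these classes are Lucene APIs.
class UnrefCheckSketch {

    /** Hypothetical stand-in for a mock directory. */
    static class ToyDirectory {
        final Set<String> files = new TreeSet<>();
        private boolean deletesAllowed = true;

        /** Assumption: after a simulated crash, later deletes are ignored. */
        void crash() {
            deletesAllowed = false;
        }

        void deleteFile(String name) {
            if (deletesAllowed) {
                files.remove(name);
            }
        }
    }

    /**
     * Tries to delete every file that is not referenced, then reports
     * whether the listing changed, i.e. whether a leftover file was found.
     */
    static boolean foundUnreferenced(ToyDirectory dir, Set<String> referenced) {
        Set<String> before = new TreeSet<>(dir.files);
        for (String name : before) {
            if (!referenced.contains(name)) {
                dir.deleteFile(name);
            }
        }
        return !before.equals(dir.files);
    }

    /** Runs the close-time check with crash() either before or after it. */
    static boolean leakDetected(boolean crashFirst) {
        ToyDirectory dir = new ToyDirectory();
        // "_1.tmp" plays the role of a file the writer left behind.
        dir.files.addAll(Arrays.asList("segments_1", "_0.cfs", "_1.tmp"));
        Set<String> referenced = new TreeSet<>(Arrays.asList("segments_1", "_0.cfs"));

        if (crashFirst) {
            dir.crash();
        }
        boolean detected = foundUnreferenced(dir, referenced);
        if (!crashFirst) {
            dir.crash();
        }
        return detected;
    }

    public static void main(String[] args) {
        System.out.println("crash first: leak detected = " + leakDetected(true));
        System.out.println("check first: leak detected = " + leakDetected(false));
    }
}
```

      In this model, crashing first makes every delete a no-op, so the before and after listings always match and the leftover file goes unnoticed. Running the check before the crash reports it.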

      1. LUCENE-4456.patch
        4 kB
        Michael McCandless
      2. LUCENE-4456.patch
        47 kB
        Robert Muir
      3. LUCENE-4456_mdw_patch.txt
        0.9 kB
        Robert Muir

        Activity

        Robert Muir added a comment -

        here's the patch to fix MDW.

        now more tests other than TestRollingUpdates fail

        Michael McCandless added a comment -

        Patch w/ the fix for just the original TestRollingUpdates failure ... but lots of tests are still failing ... I'll dig.

        Uwe Schindler added a comment -

        Is this to be fixed in 4.0? If yes make blocker and set fix version!

        Robert Muir added a comment -

        Possibly: I don't want things rushed in or shoved in quickly though.
        We need to take our time here and be careful.
        It's not necessary to fix all bugs before releasing; there are hundreds of bugs in JIRA.

        Robert Muir added a comment -

        patch for trunk, showing differences from the branch.

        will let my jenkins chew on this a bit more.

        Robert Muir added a comment -

        12 straight hours on my jenkins: no problems. Let's get this in trunk.

        Michael McCandless added a comment -

        +1

        Michael McCandless added a comment -

        I committed to trunk ... let's bake this a bit there, and then I'll backport to 4.x and 4.0...

        Robert Muir added a comment -

        This is in 4.x now. Just to try to give some summary of this insanity...

        1. Uwe's original bug (LUCENE-4455) caused Mike to add a sizeInBytes test to TestRollingUpdates.
        2. This test failed in jenkins overnight, because of some unreferenced files left over from IndexWriter. So it was clear MockDirectoryWrapper's unrefed files check has not really been working for some time! We now have a "test-the-tester" test for this so we know it's working in the future!
        3. When we turned that on, all kinds of test failures happened: some exception/crash cases and whatnot, some real bugs like deleting non-index files, some fake test failures because tests made multiple commits, and other fake test failures because IW couldn't delete files since MockDirectoryWrapper acts like Windows (and some reader had them open).
        4. We fixed MockDirectoryWrapper more so that it handles the openDeletedFiles Windows case, and fixed it to handle indexes with multiple commits. That left more fake test failures because of hard-to-reproduce act-like-Windows corner cases like segments_N and segments.gen (it's hard to reproduce the write at exactly the same time as these are slurped).
        5. To try to make those more reproducible, I figured MockDir could sometimes "hang onto" its IndexInputs a little bit longer so this is more likely to happen in tests, by very rarely doing a sleep in its close(). But a side effect of this is that testThreadInterruptDeadLock would sometimes interrupt this sleeping in close, causing an exception to be thrown in an IndexInput's close(). So this test (not intended for this purpose, but a nice side effect) finds bugs where exception handling of close() for IndexInputs is wrong, because of its unclosed files check.

        And lots of pulling out hair and stuff in between.

        Anyway it probably looks a lot worse than it is, because most of these failures are silly bugs/test bugs/trying to get the test framework right.

        I have two Jenkins pointed at branch_4x now:

        http://sierranevada.servebeer.com/job/branch4x-beaster/ (Linux)
        http://sierranevada.servebeer.com:8080/job/slow-io-beasting/ (Windows)

        Combined these do 70 test-core builds per hour. So every day is like a month's worth of test failures, which makes things look really unstable, but it's intentional: the idea is to hunt everything down at once instead of onesy-twosey for a long time.

        Let's give it a little time and see if it can flush anything out and then backport this stuff to 4.0, beast a bit more, and respin.

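        Item 5 of the summary above concerns code that mishandles an exception thrown from close(). The kind of bug this exposes can be sketched with a toy model. This is an illustration under assumptions, not the committed fix: `ToyInput`, `closeNaively`, `closeAll`, and `unclosedAfter` are hypothetical names, and `ToyInput` only stands in for a real IndexInput.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Toy model only: ToyInput is a hypothetical stand-in, not a Lucene IndexInput.
class CloseAllSketch {

    static class ToyInput implements Closeable {
        private final boolean failOnClose;
        boolean closed;

        ToyInput(boolean failOnClose) {
            this.failOnClose = failOnClose;
        }

        @Override
        public void close() throws IOException {
            closed = true; // mark as closed, then optionally report a failure
            if (failOnClose) {
                throw new IOException("simulated failure in close()");
            }
        }
    }

    /** Naive version: the first exception skips every remaining close(). */
    static void closeNaively(List<? extends Closeable> inputs) throws IOException {
        for (Closeable c : inputs) {
            c.close();
        }
    }

    /** Closes everything, then rethrows the first failure with the rest suppressed. */
    static void closeAll(List<? extends Closeable> inputs) throws IOException {
        IOException first = null;
        for (Closeable c : inputs) {
            try {
                c.close();
            } catch (IOException e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }

    /** Counts how many inputs remain unclosed after running one strategy. */
    static int unclosedAfter(boolean naive) {
        List<ToyInput> inputs = Arrays.asList(new ToyInput(true), new ToyInput(false));
        try {
            if (naive) {
                closeNaively(inputs);
            } else {
                closeAll(inputs);
            }
        } catch (IOException expected) {
            // the simulated failure propagates under both strategies
        }
        int unclosed = 0;
        for (ToyInput in : inputs) {
            if (!in.closed) {
                unclosed++;
            }
        }
        return unclosed;
    }

    public static void main(String[] args) {
        System.out.println("naive:    unclosed = " + unclosedAfter(true));
        System.out.println("closeAll: unclosed = " + unclosedAfter(false));
    }
}
```

        In this model the naive loop leaves the second input open once the first close() throws, which is the kind of leak an unclosed files check would flag. The second strategy closes both and still surfaces the original exception.
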
        Commit Tag Bot added a comment -

        [branch_4x commit] Robert Muir
        http://svn.apache.org/viewvc?view=revision&revision=1394561

        LUCENE-4456: clean up more exception handling

        Commit Tag Bot added a comment -

        [branch_4x commit] Michael McCandless
        http://svn.apache.org/viewvc?view=revision&revision=1394466

        LUCENE-4456: close merged readers first, then checkpoint

        Commit Tag Bot added a comment -

        [branch_4x commit] Robert Muir
        http://svn.apache.org/viewvc?view=revision&revision=1394324

        LUCENE-4456: add another sleeper to test exception handling

        Commit Tag Bot added a comment -

        [branch_4x commit] Robert Muir
        http://svn.apache.org/viewvc?view=revision&revision=1394309

        LUCENE-4456: more fixes that are only exposed by additional random sleeps

        Commit Tag Bot added a comment -

        [branch_4x commit] Michael McCandless
        http://svn.apache.org/viewvc?view=revision&revision=1393802

        LUCENE-4456: backport to 4.x

        Uwe Schindler added a comment -

        Closed after release.


          People

          • Assignee:
            Michael McCandless
            Reporter:
            Robert Muir
          • Votes:
            0
            Watchers:
            2
