Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.4, 3.0.3, 3.1, 4.0-ALPHA
    • Component/s: core/store
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      MMapDirectory uses chunking with MultiMMapIndexInput.

      Because Java's ByteBuffer uses an int to address the
      values, it's necessary to access a file >
      Integer.MAX_VALUE in size using multiple byte buffers.

      But I noticed from the Clover report that the entire MultiMMapIndexInput class is completely untested: no surprise, since all tests make tiny indexes.
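For readers unfamiliar with the chunking idea, here is a minimal sketch of addressing a long position across several int-indexed buffers. It is an illustration only, not Lucene's MultiMMapIndexInput: the class name `ChunkedBytes` and its methods are hypothetical, and it wraps a heap byte array instead of memory-mapping a file so that it runs anywhere.

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration of chunked addressing; not Lucene code. */
public class ChunkedBytes {
    private final ByteBuffer[] buffers;
    private final int chunkSize;
    private final long length;

    public ChunkedBytes(byte[] data, int chunkSize) {
        this.chunkSize = chunkSize;
        this.length = data.length;
        // Ceiling division: number of chunks needed to cover all bytes.
        int n = (int) ((length + chunkSize - 1) / chunkSize);
        buffers = new ByteBuffer[n];
        for (int i = 0; i < n; i++) {
            int start = i * chunkSize;
            int len = Math.min(chunkSize, data.length - start);
            // slice() makes index 0 of each buffer the start of its chunk.
            buffers[i] = ByteBuffer.wrap(data, start, len).slice();
        }
    }

    /** Reads the byte at an absolute (long) position. */
    public byte readByte(long pos) {
        if (pos < 0 || pos >= length) {
            throw new IndexOutOfBoundsException("pos=" + pos);
        }
        int bufIndex = (int) (pos / chunkSize); // which chunk
        int offset = (int) (pos % chunkSize);   // where inside the chunk
        return buffers[bufIndex].get(offset);
    }

    public static void main(String[] args) {
        byte[] data = new byte[10];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        ChunkedBytes in = new ChunkedBytes(data, 4);
        System.out.println(in.readByte(9));
    }
}
```

A small chunk size (4 here) exercises the chunk-boundary logic without needing a multi-gigabyte file, which is the same reason a test would randomize chunk sizes.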

      1. LUCENE-2627_test.patch
        4 kB
        Robert Muir
      2. LUCENE-2627.patch
        4 kB
        Robert Muir

        Activity

        Robert Muir created issue -
        Robert Muir added a comment -

        attached is a random test case (wired to a value where it fails quickly for standard codec):

        ant test-core -Dtestcase=TestMultiMMap -Dtests.codec=Standard

        junit-sequential:
            [junit] Testsuite: org.apache.lucene.store.TestMultiMMap
            [junit] Testcase: testRandomChunkSizes(org.apache.lucene.store.TestMultiMMap):      Caused an ERROR
            [junit] 233
            [junit] java.lang.ArrayIndexOutOfBoundsException: 233
            [junit]     at org.apache.lucene.store.MMapDirectory$MultiMMapIndexInput.seek(MMapDirectory.java:371)
            [junit]     at org.apache.lucene.store.MMapDirectory$MultiMMapIndexInput.clone(MMapDirectory.java:394)
            [junit]     at org.apache.lucene.index.codecs.standard.StandardTermsDictReader$FieldReader$SegmentTermsEnum.<init>(StandardTermsDictReader.java:288)
            [junit]     at org.apache.lucene.index.codecs.standard.StandardTermsDictReader$FieldReader.iterator(StandardTermsDictReader.java:270)
            [junit]     at org.apache.lucene.index.codecs.standard.StandardTermsDictReader$TermFieldsEnum.terms(StandardTermsDictReader.java:240)
            [junit]     at org.apache.lucene.index.MultiFieldsEnum.terms(MultiFieldsEnum.java:103)
            [junit]     at org.apache.lucene.index.codecs.FieldsConsumer.merge(FieldsConsumer.java:49)
            [junit]     at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:657)
            [junit]     at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:154)
        
        Robert Muir made changes -
        Field Original Value New Value
        Attachment LUCENE-2627_test.patch [ 12453254 ]
        Robert Muir added a comment -

        off-by-one: here's a patch with the test-case (I removed the seed)

        Robert Muir made changes -
        Attachment LUCENE-2627.patch [ 12453263 ]
        Robert Muir added a comment -

        For this to fail, your file size has to be an exact multiple of the chunk size (default=Integer.MAX_VALUE), so it's not a big deal, but I think we should fix it.
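The boundary condition can be sketched with plain arithmetic. This is an assumption-laden illustration, not the code from the attached patch: it assumes the buffer count is computed by rounding the file length up to whole chunks, and that the failing seek targets a position equal to the file length (which the seek-from-clone frames in the stack trace suggest). The class and method names are hypothetical.

```java
/** Hypothetical arithmetic sketch of the off-by-one; not Lucene code. */
public class ChunkBoundary {

    /** Buffers needed to hold length bytes, rounding up to whole chunks. */
    static int bufferCount(long length, int chunkSize) {
        int n = (int) (length / chunkSize);
        if ((long) n * chunkSize < length) {
            n++; // partial last chunk
        }
        return n;
    }

    /** Buffer index that a seek to pos would select. */
    static int bufferIndex(long pos, int chunkSize) {
        return (int) (pos / chunkSize);
    }

    /** True if seeking to end-of-file would index past the last buffer. */
    static boolean seekToEndOverflows(long length, int chunkSize) {
        return bufferIndex(length, chunkSize) >= bufferCount(length, chunkSize);
    }

    public static void main(String[] args) {
        int chunkSize = 4;
        // 10 bytes: 3 buffers, seek(10) selects index 2, in range.
        System.out.println(seekToEndOverflows(10, chunkSize)); // false
        // 12 bytes: 3 buffers, seek(12) selects index 3, out of range.
        System.out.println(seekToEndOverflows(12, chunkSize)); // true
    }
}
```

Under these assumptions the end-of-file seek only falls off the buffer array when the length divides evenly by the chunk size, which would explain why large files of arbitrary size never trigger it.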

        Uwe Schindler added a comment -

        Thanks Robert for investigating! I was wondering why I have never seen this error with my 10 Gig CFS file and MMapDir - it just happens on exact multiples of 2^31.

        But we should fix this for 2.9 and 3.0, too - it's easy. Maybe we have another release (we also have the NRQ bug).

        Robert Muir made changes -
        Assignee Robert Muir [ rcmuir ]
        Robert Muir added a comment -

        "But we should fix this for 2.9 and 3.0, too - it's easy. Maybe we have another release (we also have the NRQ bug)"

        OK, I'll test each branch and backport as needed.

        Additionally all tests pass with -Dtests.directory=MMapDirectory, so I plan to commit shortly.

        Uwe Schindler added a comment -

        To avoid breaking Hudson on 32-bit JVMs, we should enable the MMap close hack when testing against that dir; otherwise we may run out of address space.

        Robert Muir added a comment -

        good idea, I will have the test enable the unmap hack, if supported

        Robert Muir added a comment -

        Committed revisions:

        trunk: 990281
        3.x: 990286
        3.0: 990293
        2.9: 990295

        Robert Muir made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Fix Version/s 2.9.4 [ 12315148 ]
        Fix Version/s 3.0.3 [ 12315147 ]
        Fix Version/s 3.1 [ 12314822 ]
        Fix Version/s 4.0 [ 12314025 ]
        Resolution Fixed [ 1 ]
        Uwe Schindler made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Mark Thomas made changes -
        Workflow jira [ 12519125 ] Default workflow, editable Closed status [ 12564016 ]
        Mark Thomas made changes -
        Workflow Default workflow, editable Closed status [ 12564016 ] jira [ 12585486 ]
        Transition            Time In Source Status   Execution Times   Last Executer   Last Execution Date
        Open -> Resolved      4h 51m                  1                 Robert Muir     27/Aug/10 23:54
        Resolved -> Closed    95d 15h 55m             1                 Uwe Schindler   01/Dec/10 14:49

          People

          • Assignee:
            Robert Muir
            Reporter:
            Robert Muir
          • Votes:
            0
            Watchers:
            1

            Dates

            • Created:
              Updated:
              Resolved:
