Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.6, 4.0-ALPHA
    • Fix Version/s: 3.6, 4.0-ALPHA
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

      Description

      It's pretty common for positionIncrement to overflow; this happens really easily
      if people write analyzers that don't call clearAttributes().

      It used to be the case that if this happened (and perhaps still is in 3.x, I didn't check),
      IW would throw an exception.

      But I couldn't find the code checking this. I wrote a test, and it makes a corrupt index...
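
      To illustrate the arithmetic behind the overflow, here is a minimal sketch in plain Java. It is not Lucene code: the class and method names are hypothetical, and the two increments of Integer.MAX_VALUE are assumed values chosen for illustration (the attached test patch is not reproduced here). It only shows that summing position increments in an int, with no check, silently wraps to a negative position.

```java
public class PositionOverflowDemo {

    // Hypothetical helper: accumulates position increments with plain int
    // arithmetic and no overflow check, starting from position 0.
    public static int lastPosition(int[] increments) {
        int position = 0;
        for (int increment : increments) {
            position += increment; // wraps around silently on overflow
        }
        return position;
    }

    public static void main(String[] args) {
        // Assumed values: two tokens that each report the largest possible increment.
        int[] increments = {Integer.MAX_VALUE, Integer.MAX_VALUE};
        System.out.println(lastPosition(increments)); // prints -2
    }
}
```

      A negative position like this is the kind of value that a postings check would later report as out of bounds.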

      1. LUCENE-3874_test.patch
        1 kB
        Robert Muir
      2. LUCENE-3874.patch
        2 kB
        Robert Muir

        Activity

        Robert Muir added a comment -

        Simple test that overflows posinc.

        Output is:

        junit-sequential:
            [junit] Testsuite: org.apache.lucene.index.TestIndexWriterExceptions
            [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.239 sec
            [junit] 
            [junit] ------------- Standard Output ---------------
            [junit] CheckIndex failed
            [junit] Segments file=segments_1 numSegments=1 version=4.0 format=FORMAT_4_0 [Lucene 4.0]
            [junit]   1 of 1: name=_0 docCount=1
            [junit]     codec=SimpleText
            [junit]     compound=false
            [junit]     hasProx=true
            [junit]     numFiles=4
            [junit]     size (MB)=0
            [junit]     diagnostics = {os.version=3.0.0-14-generic, os=Linux, lucene.version=4.0-SNAPSHOT, source=flush, os.arch=amd64, java.version=1.6.0_24, java.vendor=Sun Microsystems Inc.}
            [junit]     has deletions [delGen=-1]
            [junit]     test: open reader.........OK
            [junit]     test: fields..............OK [1 fields]
            [junit]     test: field norms.........OK [1 fields]
            [junit]     test: terms, freq, prox...ERROR: java.lang.RuntimeException: term [66 6f 6f]: doc 0: pos -2 is out of bounds
            [junit] java.lang.RuntimeException: term [66 6f 6f]: doc 0: pos -2 is out of bounds
            [junit] 	at org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:860)
        ...
        
        Robert Muir added a comment -

        3.x too: just s/TextField/Field to port the test

        Robert Muir added a comment -

        First cut at a patch: it throws IllegalArgumentException and aborts the doc (ensuring fieldState never sees the overflow, since I don't trust what happens to it after this!)
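
        The idea in this comment can be sketched roughly as follows. This is a simplified stand-in, not the contents of LUCENE-3874.patch: the class name, method name, and messages are hypothetical. The point it shows is that the overflow is detected before the running position is updated, so the caller's state never holds the wrapped value.

```java
public class PositionCheckDemo {

    // Hypothetical helper: returns the new position, or throws before any
    // state is updated if adding the increment would overflow an int.
    public static int advance(int position, int increment) {
        if (position < 0) {
            throw new IllegalArgumentException("position must be >= 0, got " + position);
        }
        if (increment < 0) {
            throw new IllegalArgumentException("position increment must be >= 0, got " + increment);
        }
        int next = position + increment;
        // Both operands are non-negative, so a negative sum can only mean overflow.
        if (next < 0) {
            throw new IllegalArgumentException(
                "position overflow: " + position + " + " + increment);
        }
        return next;
    }

    public static void main(String[] args) {
        int position = advance(0, Integer.MAX_VALUE); // fine, still fits in an int
        try {
            position = advance(position, Integer.MAX_VALUE); // would wrap
        } catch (IllegalArgumentException e) {
            // position still holds the last valid value here
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```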

        Michael McCandless added a comment -

        +1

        Crazy we don't catch this already...

        Uwe Schindler added a comment -

        +8

        Simon Willnauer added a comment -

        see my comment on LUCENE-3876 - this triggers reproducible failures on trunk.

        Robert Muir added a comment -

        The solution here is 100% correct; we don't need to reopen it because its test found a separate, unrelated bug.

        Anything that limits your values to less than Integer.MAX_VALUE needs its own checks that they fit, throwing UOE because it's choosing not to support totally legitimate values from the analyzer.
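
        As a rough sketch of that point, consider a component that can only store a narrower range than int. Everything here is hypothetical: the class name, the 24-bit limit, and the messages are invented for illustration and do not describe any real Lucene codec. The component does its own range check and throws UnsupportedOperationException for values that are legal positions but that it chooses not to support.

```java
public class NarrowPositionEncoder {

    // Hypothetical limit: this imaginary format stores positions in 24 bits.
    public static final int MAX_SUPPORTED_POSITION = (1 << 24) - 1;

    // Returns the position unchanged if this format can represent it.
    public static int encode(int position) {
        if (position < 0) {
            // Negative positions are never valid input.
            throw new IllegalArgumentException("position must be >= 0, got " + position);
        }
        if (position > MAX_SUPPORTED_POSITION) {
            // A legal position in general, but outside what this format supports.
            throw new UnsupportedOperationException(
                "position " + position + " exceeds supported maximum " + MAX_SUPPORTED_POSITION);
        }
        return position;
    }
}
```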


          People

          • Assignee: Unassigned
          • Reporter: Robert Muir
          • Votes: 0
          • Watchers: 0
