Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.9, Trunk
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      This one has fallen behind...
      It picks TABLE/GCD even when it won't actually save space or help, writes with BlockPackedWriter even when it won't save space, etc.

      Instead of comparing PackedInts.bitsRequired, factor in acceptableOverheadRatio too to determine "will save space". Check if blocking will save space along the same lines (otherwise use regular packed ints).

      Fix a similar bug in the Lucene49 codec along these same lines (comparing PackedInts.bitsRequired instead of what would actually be written).
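
      The idea of comparing what would actually be written, rather than the raw bitsRequired, can be sketched as follows. This is a minimal illustration and not the code from the patch: the class PackedSizeSketch, its methods, and the rule of rounding up to 8/16/32/64 bits when the overhead ratio allows it are all assumptions made for the example.

```java
/** Hypothetical sketch; names and rounding rule are assumptions, not Lucene code. */
final class PackedSizeSketch {

    /** Bits needed to represent an unsigned value (at least 1). */
    static int bitsRequired(long maxValue) {
        return Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
    }

    /**
     * Simplified model of the width a packed writer would really use:
     * if an aligned width (8, 16, 32, 64) fits within the allowed
     * overhead, assume the writer rounds up to it for speed.
     */
    static int effectiveBits(int bitsRequired, float acceptableOverheadRatio) {
        int maxBits = bitsRequired + (int) (acceptableOverheadRatio * bitsRequired);
        for (int aligned : new int[] {8, 16, 32, 64}) {
            if (bitsRequired <= aligned && maxBits >= aligned) {
                return aligned;
            }
        }
        return bitsRequired;
    }

    /** True only if the candidate encoding is still smaller after rounding. */
    static boolean savesSpace(int candidateBits, int baselineBits,
                              float acceptableOverheadRatio) {
        return effectiveBits(candidateBits, acceptableOverheadRatio)
                < effectiveBits(baselineBits, acceptableOverheadRatio);
    }
}
```

      Under these assumptions, a 6-bit candidate against an 8-bit baseline looks like a saving when only bitsRequired is compared, but with an overhead ratio of 0.5 both would be written at 8 bits, so savesSpace(6, 8, 0.5f) is false while savesSpace(6, 8, 0.0f) is true.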

      1. LUCENE-5751.patch
        14 kB
        Robert Muir
      2. LUCENE-5751.patch
        14 kB
        Robert Muir

        Activity

        ASF subversion and git services added a comment -

        Commit 1601936 from Robert Muir in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1601936 ]

        LUCENE-5751: Bring MemoryDocValues up to speed

        ASF subversion and git services added a comment -

        Commit 1601929 from Robert Muir in branch 'dev/trunk'
        [ https://svn.apache.org/r1601929 ]

        LUCENE-5751: Bring MemoryDocValues up to speed

        Adrien Grand added a comment -

        +1

        Michael McCandless added a comment -

        +1

        Robert Muir added a comment -

        Good point: I updated the patch. In general it shouldn't be too sensitive, since it only looks for large differences, but I agree it's better to just use a float avg!

        Adrien Grand added a comment -

        Hmm, should avgBPV be stored as a float? Otherwise I think it can favor blocking too much, e.g. if 8.9 becomes 8 in your heuristic. Apart from that, it looks good to me!
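
        The truncation concern can be illustrated with a small sketch. It is not taken from the patch: the class AvgBitsSketch, the per-block bit widths, and the 15% margin used to decide that blocking "looks worthwhile" are hypothetical values chosen only to show how an integer average can tip the decision.

```java
/** Hypothetical sketch of integer vs. float averaging; not the patch's heuristic. */
final class AvgBitsSketch {

    /** Average bits per value with integer division (fraction is discarded). */
    static int truncatedAverage(int[] bitsPerBlock) {
        long sum = 0;
        for (int bits : bitsPerBlock) {
            sum += bits;
        }
        return (int) (sum / bitsPerBlock.length);
    }

    /** Average bits per value kept as a float. */
    static float floatAverage(int[] bitsPerBlock) {
        long sum = 0;
        for (int bits : bitsPerBlock) {
            sum += bits;
        }
        return (float) sum / bitsPerBlock.length;
    }

    /**
     * Assumed decision rule: blocking is worthwhile only if the average
     * per-block width beats the global width by at least the given margin.
     */
    static boolean blockingLooksWorthwhile(float avgBits, int globalBits, float margin) {
        return avgBits < globalBits * (1f - margin);
    }
}
```

        With nine blocks at 9 bits and one at 8 bits, the integer average is 8 while the float average is about 8.9. Against a global width of 10 bits and a margin of 0.15, the truncated value would select blocking and the float value would not.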

        Robert Muir added a comment -

        Patch: I see significant performance improvements with this codec, sometimes > 50% for numerics/strings.


          People

          • Assignee:
            Unassigned
            Reporter:
            Robert Muir
          • Votes:
            0
            Watchers:
            4
