Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1
    • Fix Version/s: 3.4, 4.0-ALPHA
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      It would be useful to have an option to discard positional information but still keep the term frequency. Currently, setOmitTermFreqAndPositions discards both. Position-dependent queries wouldn't work in that case, but all other queries would still work fine and be scored correctly.
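
      To make the request concrete, here is a minimal, self-contained sketch of a postings entry that keeps the term frequency while dropping positions. It is not Lucene code: `PostingSketch`, `Posting`, `withPositions`, and `freqOnly` are hypothetical names chosen for illustration. The point is only that frequency-based scoring needs the count, whereas position-dependent matching needs the positions themselves.

      ```java
      import java.util.Arrays;

      // Hypothetical illustration only; these are not Lucene classes.
      public class PostingSketch {
          /** One (term, document) entry in a toy postings list. */
          static final class Posting {
              final int docId;
              final int freq;         // number of occurrences of the term in the document
              final int[] positions;  // null when positions were discarded

              private Posting(int docId, int freq, int[] positions) {
                  this.docId = docId;
                  this.freq = freq;
                  this.positions = positions;
              }

              /** Keeps both the frequency and every position. */
              static Posting withPositions(int docId, int[] positions) {
                  return new Posting(docId, positions.length, positions.clone());
              }

              /** Keeps only the frequency, which is the option requested here. */
              static Posting freqOnly(int docId, int[] positions) {
                  return new Posting(docId, positions.length, null);
              }

              boolean hasPositions() {
                  return positions != null;
              }
          }

          public static void main(String[] args) {
              int[] occurrences = {3, 17, 42};
              Posting full = Posting.withPositions(7, occurrences);
              Posting slim = Posting.freqOnly(7, occurrences);

              System.out.println(full.freq + " " + Arrays.toString(full.positions));
              System.out.println(slim.freq + " positions kept: " + slim.hasPositions());
          }
      }
      ```

      In this sketch a scorer that reads only `freq` behaves identically for both entries, while anything that needs `positions` has to check `hasPositions()` first.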

      1. LUCENE-2048.patch
        109 kB
        Robert Muir
      2. LUCENE-2048.patch
        152 kB
        Robert Muir

          Activity

          Michael McCandless added a comment -

          Looks great! +1 to commit.

          Robert Muir added a comment -

          OK, here's an updated patch. I think it's ready to commit!

          Robert Muir added a comment -

          I created a throwaway branch, branches/omitp, to hopefully sucker Mike into helping me with some random test failures (pulsing is always involved!).

          In general, the pulsing cutover was tricky for me.

          Michael McCandless added a comment -

          Awesome to finally have progress here!

          I agree we should cut over to an enum instead of 2 coupled booleans... I think IndexOptions is a good name.

          Robert Muir added a comment -

          By the way, I'll look at tests tomorrow (I know the thing has bare minimum tests). As for the enum, I don't care at all what the naming is; I just needed something.

          I simply refuse to use 2 booleans here with checks/assertions throughout the code ensuring they are in sync. I think that's really the wrong way to go.
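
          The design argument can be sketched roughly as follows. Two booleans give four combinations, and, assuming (as the "coupled" wording in this thread implies) that keeping positions without frequencies is not a valid state, one of those combinations has to be guarded against wherever the flags are read. An ordered enum makes that state unrepresentable. The names below (`IndexDetailSketch`, `PostingsDetail`, `fromFlags`, and the constants) are placeholders for illustration and are not taken from the patch.

          ```java
          // Hypothetical names for illustration; not taken from the patch.
          public class IndexDetailSketch {
              /** Ordered from least to most information stored per posting. */
              enum PostingsDetail {
                  DOCS,
                  DOCS_AND_FREQS,
                  DOCS_AND_FREQS_AND_POSITIONS;

                  boolean hasFreqs() {
                      return compareTo(DOCS_AND_FREQS) >= 0;
                  }

                  boolean hasPositions() {
                      return this == DOCS_AND_FREQS_AND_POSITIONS;
                  }
              }

              /**
               * Maps the two-boolean representation onto the enum and rejects the
               * one combination the enum cannot express (assumed invalid here).
               */
              static PostingsDetail fromFlags(boolean omitFreqs, boolean omitPositions) {
                  if (omitFreqs && !omitPositions) {
                      throw new IllegalArgumentException(
                          "positions cannot be kept without frequencies");
                  }
                  if (omitFreqs) {
                      return PostingsDetail.DOCS;
                  }
                  return omitPositions
                      ? PostingsDetail.DOCS_AND_FREQS
                      : PostingsDetail.DOCS_AND_FREQS_AND_POSITIONS;
              }

              public static void main(String[] args) {
                  for (PostingsDetail d : PostingsDetail.values()) {
                      System.out.println(d + " freqs=" + d.hasFreqs()
                          + " positions=" + d.hasPositions());
                  }
              }
          }
          ```

          With the enum, the sync check lives in one conversion method instead of being repeated as assertions wherever the flags are consulted.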

          Robert Muir added a comment -

          This is pretty close, I think, but it needs some rounding out: e.g. improve checkIndex to check freqs/stats when positions are omitted, beast the tests, and search the code to see if there is any more "hasProx abuse" where code assumes hasProx means no freqs, etc.


            People

            • Assignee:
              Robert Muir
              Reporter:
              Andrzej Bialecki
            • Votes:
              1
              Watchers:
              1
