Lucene - Core / LUCENE-4225

New FixedPostingsFormat for less overhead than SepPostingsFormat

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      I've worked out the start of a new postings format that should have
      less overhead for fixed-int[] encoders (For, PFor), using ideas from
      the old bulk branch and new ideas from Robert.

      It's only a start: there's no payloads support yet, and I haven't run
      Lucene's tests with it, except for one new test I added that tries to
      be a thorough PostingsFormat tester (to make it easier to create new
      postings formats). It does pass luceneutil's performance test, so
      it's at least able to run those queries correctly...

      Like Lucene40, it uses two files (though once we add payloads it may
      be three). The .doc file interleaves doc delta and freq blocks, and the
      .pos file has position delta blocks. Unlike Sep, blocks are NOT shared
      across terms; instead, it uses block encoding if there are enough ints
      to encode, else the same vInt format as Lucene40. This means low-freq
      terms (fewer than 128 docs, the current default block size) are always
      vInts, and high-freq terms will have some number of blocks followed by
      a vInt final block.
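
      To make the block/vInt split concrete, here is a minimal sketch, not
      code from the patch. The class and method names (BlockLayoutSketch,
      fullBlocks, vIntTail, writeVInt) are hypothetical, and it assumes that a
      term with exactly 128 postings gets one full block and an empty tail.
      The vInt routine uses the usual 7-bits-per-byte layout with the high
      bit as a continuation flag.

```java
import java.io.ByteArrayOutputStream;

public class BlockLayoutSketch {
    // Default block size stated in the description (assumed fixed here).
    public static final int BLOCK_SIZE = 128;

    // Number of full fixed-size blocks a term with docFreq postings would get.
    public static int fullBlocks(int docFreq) {
        return docFreq / BLOCK_SIZE;
    }

    // Number of trailing postings that fall back to vInt encoding.
    public static int vIntTail(int docFreq) {
        return docFreq % BLOCK_SIZE;
    }

    // Variable-length int: 7 data bits per byte, low-order bits first,
    // high bit set on every byte except the last.
    public static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    public static void main(String[] args) {
        for (int docFreq : new int[] {5, 127, 128, 300}) {
            System.out.println("docFreq=" + docFreq
                + " fullBlocks=" + fullBlocks(docFreq)
                + " vIntTail=" + vIntTail(docFreq));
        }
    }
}
```

      Under these assumptions a term with 300 postings would be laid out as
      two blocks of 128 followed by 44 vInt-encoded entries, while a term
      with 127 postings would be written entirely as vInts.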

      Skip points are only recorded at block starts.
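
      As an illustration of that constraint, the sketch below lists the
      posting offsets at which a skip point could land for a given term. It
      is not taken from the patch: SkipPointSketch and blockStarts are
      hypothetical names, and it assumes that only full 128-int blocks count
      as block starts (the vInt tail is excluded). Whether every block start
      actually gets a skip entry is not stated in the description.

```java
import java.util.ArrayList;
import java.util.List;

public class SkipPointSketch {
    // Default block size stated in the description (assumed fixed here).
    public static final int BLOCK_SIZE = 128;

    // Posting offsets where a full block begins; under the assumption
    // above, these are the only offsets a skip point may reference.
    public static List<Integer> blockStarts(int docFreq) {
        List<Integer> starts = new ArrayList<>();
        int fullBlocks = docFreq / BLOCK_SIZE;
        for (int b = 0; b < fullBlocks; b++) {
            starts.add(b * BLOCK_SIZE);
        }
        return starts;
    }

    public static void main(String[] args) {
        System.out.println(blockStarts(100));
        System.out.println(blockStarts(300));
    }
}
```

      One consequence of this layout is that a term with fewer than 128
      postings has no candidate skip positions at all, which fits the idea of
      keeping overhead low for low-freq terms.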

      1. LUCENE-4225.patch
        126 kB
        Michael McCandless
      2. LUCENE-4225.patch
        91 kB
        Michael McCandless
      3. LUCENE-4225.patch
        90 kB
        Michael McCandless
      4. LUCENE-4225.patch
        95 kB
        Michael McCandless
      5. LUCENE-4225-on-rev-1362013.patch
        93 kB
        Han Jiang

            People

            • Assignee:
              Michael McCandless
              Reporter:
              Michael McCandless
            • Votes:
              0
              Watchers:
              4
