Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Even if it is not optimal, I think it would help to create a Lucene70DocValuesFormat now by copying the current Lucene54DocValuesFormat and making a few minor changes. The first is making the sparse case use a true iterator API as described in LUCENE-7457 (which should also make it into Lucene54DocValuesFormat so that merging from an old codec is efficient). The others are raising the threshold that enables sparse encoding and using nextSetBit operations when iterating bit sets. The nextSetBit change cannot be done easily in Lucene54DocValuesFormat because we would need to add a few trailing bytes to make sure we can read a long at any valid index.
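The trailing-bytes remark can be illustrated with a minimal sketch. It assumes a bit set stored as little-endian bytes and is not the Lucene implementation: the class `PaddedBitSetSketch` and its methods are hypothetical names made up for this example. The point it shows is that a word-at-a-time `nextSetBit` reads 8 bytes starting at any data byte, so the buffer needs 7 bytes of zero padding after the last data byte.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; not taken from the Lucene code base.
class PaddedBitSetSketch {
    private final ByteBuffer bytes; // little-endian, zero-padded at the end
    private final int numBits;

    PaddedBitSetSketch(int numBits) {
        this.numBits = numBits;
        int dataBytes = (numBits + 7) >>> 3;
        // 7 trailing bytes make getLong(offset) valid for every data byte offset.
        this.bytes = ByteBuffer.allocate(dataBytes + 7).order(ByteOrder.LITTLE_ENDIAN);
    }

    void set(int index) {
        int b = index >>> 3;
        bytes.put(b, (byte) (bytes.get(b) | (1 << (index & 7))));
    }

    /** Returns the first set bit at or after {@code from}, or -1 if there is none. */
    int nextSetBit(int from) {
        int i = from;
        while (i < numBits) {
            // Read 64 bits starting at the byte that holds bit i, then drop the bits below i.
            long word = bytes.getLong(i >>> 3) >>> (i & 7);
            if (word != 0) {
                int candidate = i + Long.numberOfTrailingZeros(word);
                return candidate < numBits ? candidate : -1;
            }
            // The word covered 64 - (i & 7) bits; skip past all of them at once.
            i += 64 - (i & 7);
        }
        return -1;
    }

    public static void main(String[] args) {
        PaddedBitSetSketch bits = new PaddedBitSetSketch(100);
        bits.set(3);
        bits.set(64);
        bits.set(99);
        for (int doc = bits.nextSetBit(0); doc >= 0; doc = bits.nextSetBit(doc + 1)) {
            System.out.println(doc);
        }
    }
}
```

Without the padding, the read for a bit in the last data byte would run past the end of the buffer, which is why an existing on-disk format cannot adopt this kind of read without changing its layout.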

      1. LUCENE-7463.patch
        172 kB
        Adrien Grand

          Activity

          jpountz Adrien Grand added a comment -

          Here is the commit https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=32446e9 (it did not link here since I made a mistake in the commit message).

          mikemccand Michael McCandless added a comment -

          +1

          jpountz Adrien Grand added a comment -

          Here is a patch that adds a Lucene70Codec and a Lucene70DocValuesFormat. The latter is mostly the same as Lucene54DocValuesFormat, with two differences: it uses nextSetBit operations on the bitset representing live docs in order to iterate faster (instead of testing each bit sequentially), and it raises the threshold for sparse encoding from 1% to 10%. The goal is not really to make it the final 7.0 codec but rather to have a baseline that we can compare future iterations against.
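The two differences described above can be sketched roughly as follows. This is an illustration under stated assumptions, not code from the patch: it uses `java.util.BitSet` as a stand-in for Lucene's own bit sets, it assumes the threshold is compared against the fraction of documents that have a value, and the class `SparseThresholdSketch` and its method names are hypothetical.

```java
import java.util.BitSet;

// Hypothetical illustration only; names and the exact threshold test are assumptions.
class SparseThresholdSketch {
    static final double OLD_THRESHOLD = 0.01; // 1%
    static final double NEW_THRESHOLD = 0.10; // 10%

    /** Assumed rule: use sparse encoding when fewer than threshold * maxDoc docs have a value. */
    static boolean useSparse(int docsWithValue, int maxDoc, double threshold) {
        return maxDoc > 0 && (double) docsWithValue / maxDoc < threshold;
    }

    /** Tests every bit in turn: cost grows with maxDoc, however few bits are set. */
    static int countSequential(BitSet bits, int maxDoc) {
        int count = 0;
        for (int doc = 0; doc < maxDoc; doc++) {
            if (bits.get(doc)) {
                count++;
            }
        }
        return count;
    }

    /** Jumps from one set bit to the next, skipping runs of unset bits. */
    static int countWithNextSetBit(BitSet bits) {
        int count = 0;
        for (int doc = bits.nextSetBit(0); doc >= 0; doc = bits.nextSetBit(doc + 1)) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int maxDoc = 1000;
        BitSet bits = new BitSet(maxDoc);
        for (int doc = 0; doc < maxDoc; doc += 20) {
            bits.set(doc); // 50 of 1000 docs, i.e. 5%
        }
        System.out.println(useSparse(bits.cardinality(), maxDoc, OLD_THRESHOLD));
        System.out.println(useSparse(bits.cardinality(), maxDoc, NEW_THRESHOLD));
        System.out.println(countSequential(bits, maxDoc) == countWithNextSetBit(bits));
    }
}
```

Under this assumed rule, a field present in 5% of documents would fall outside the sparse case at a 1% threshold and inside it at 10%.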


            People

            • Assignee:
              Unassigned
              Reporter:
              jpountz Adrien Grand
            • Votes:
              0
              Watchers:
              3
