Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 4.0-ALPHA
    • Fix Version/s: 3.5, 4.0-ALPHA
    • Component/s: core/index
    • Labels: None
    • Lucene Fields: New

    Description

      There are various places in Lucene that could take advantage of an
      efficient packed unsigned int/long impl. E.g., the terms dict index in
      the standard codec in LUCENE-1458 could substantially reduce its RAM
      usage. FieldCache.StringIndex could as well. And I think "load into
      RAM" codecs like the one in TestExternalCodecs could use this too.

      I'm picturing something very basic like:

      interface PackedUnsignedLongs  {
        long get(long index);
        void set(long index, long value);
      }
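To make the idea concrete, here is a minimal sketch of how such an interface could be backed by a long[] with a fixed number of bits per value. The class name PackedLongsSketch, the constructor arguments, and the block layout are assumptions made for illustration only; this is not the code from any of the attached patches.

```java
// The interface as proposed in this issue.
interface PackedUnsignedLongs {
    long get(long index);
    void set(long index, long value);
}

// Hypothetical illustration: values are stored back to back in 64-bit blocks.
final class PackedLongsSketch implements PackedUnsignedLongs {
    private final long[] blocks;
    private final int bitsPerValue;
    private final long mask;       // low bitsPerValue bits set
    private final int valueCount;

    PackedLongsSketch(int valueCount, int bitsPerValue) {
        if (valueCount < 0 || bitsPerValue < 1 || bitsPerValue > 64) {
            throw new IllegalArgumentException("bad valueCount or bitsPerValue");
        }
        this.valueCount = valueCount;
        this.bitsPerValue = bitsPerValue;
        this.mask = bitsPerValue == 64 ? -1L : (1L << bitsPerValue) - 1;
        long totalBits = (long) valueCount * bitsPerValue;
        this.blocks = new long[(int) ((totalBits + 63) >>> 6)];
    }

    @Override
    public long get(long index) {
        checkIndex(index);
        long bitPos = index * bitsPerValue;
        int block = (int) (bitPos >>> 6);   // which 64-bit block
        int shift = (int) (bitPos & 63);    // bit offset inside that block
        long value = blocks[block] >>> shift;
        int bitsInFirst = 64 - shift;
        if (bitsInFirst < bitsPerValue) {
            // The value straddles two blocks: pull the high bits from the next one.
            value |= blocks[block + 1] << bitsInFirst;
        }
        return value & mask;
    }

    @Override
    public void set(long index, long value) {
        checkIndex(index);
        long v = value & mask;              // silently truncates oversized values
        long bitPos = index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        blocks[block] = (blocks[block] & ~(mask << shift)) | (v << shift);
        int bitsInFirst = 64 - shift;
        if (bitsInFirst < bitsPerValue) {
            long highMask = mask >>> bitsInFirst;
            blocks[block + 1] = (blocks[block + 1] & ~highMask) | (v >>> bitsInFirst);
        }
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= valueCount) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }
}
```

In this sketch, values that cross a block boundary cost one extra array read or write; an implementation tuned for speed might instead restrict bitsPerValue to sizes that divide 64 evenly, trading some space for simpler access.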
      

      Plus maybe an iterator for getting and maybe also for setting. If it
      helps, most of the usages of this inside Lucene will be "write once",
      so e.g. set could make that an assumption/requirement.
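One way the "write once" assumption could simplify things is sketched below: values are appended in order into zero-initialized blocks, so the writer only needs to OR bits in and never has to clear old ones, and reading back is a plain forward iterator. The names WriteOncePackedSketch, add, and iterator are hypothetical, and the sketch limits bitsPerValue to 1..63 to keep the range check simple.

```java
import java.util.NoSuchElementException;
import java.util.PrimitiveIterator;

// Hypothetical append-only packed store; not taken from the attached patches.
final class WriteOncePackedSketch {
    private final long[] blocks;
    private final int bitsPerValue;
    private final int valueCount;
    private int written;            // number of values appended so far

    WriteOncePackedSketch(int valueCount, int bitsPerValue) {
        if (valueCount < 0 || bitsPerValue < 1 || bitsPerValue > 63) {
            throw new IllegalArgumentException("bad valueCount or bitsPerValue");
        }
        this.valueCount = valueCount;
        this.bitsPerValue = bitsPerValue;
        long totalBits = (long) valueCount * bitsPerValue;
        this.blocks = new long[(int) ((totalBits + 63) >>> 6)];
    }

    // Appends the next value. Blocks start out as zero, so OR-ing is enough.
    void add(long value) {
        if (written == valueCount) {
            throw new IllegalStateException("all " + valueCount + " values already written");
        }
        if ((value >>> bitsPerValue) != 0) {
            throw new IllegalArgumentException("value does not fit in " + bitsPerValue + " bits");
        }
        long bitPos = (long) written * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        blocks[block] |= value << shift;
        if (64 - shift < bitsPerValue) {
            blocks[block + 1] |= value >>> (64 - shift);   // spill into the next block
        }
        written++;
    }

    // Iterates over the values appended so far, in insertion order.
    PrimitiveIterator.OfLong iterator() {
        final long mask = (1L << bitsPerValue) - 1;
        return new PrimitiveIterator.OfLong() {
            private int next;

            @Override
            public boolean hasNext() {
                return next < written;
            }

            @Override
            public long nextLong() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                long bitPos = (long) next * bitsPerValue;
                next++;
                int block = (int) (bitPos >>> 6);
                int shift = (int) (bitPos & 63);
                long v = blocks[block] >>> shift;
                if (64 - shift < bitsPerValue) {
                    v |= blocks[block + 1] << (64 - shift);
                }
                return v & mask;
            }
        };
    }
}
```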

      And a factory somewhere:

        PackedUnsignedLongs create(int count, long maxValue);
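A factory along these lines would presumably derive the number of bits per value from maxValue. The sketch below shows that calculation and the resulting storage estimate; PackedSizing, bitsRequired, and packedBytes are hypothetical names, and the sketch assumes maxValue is passed as a non-negative signed long (so at most 63 bits), which is a simplification of "unsigned".

```java
// Hypothetical sizing helpers a factory could use; illustration only.
final class PackedSizing {
    private PackedSizing() {}

    // Smallest number of bits able to represent every value in 0..maxValue.
    static int bitsRequired(long maxValue) {
        if (maxValue < 0) {
            throw new IllegalArgumentException("maxValue must be >= 0 in this sketch");
        }
        // numberOfLeadingZeros(0) is 64, so clamp to at least one bit.
        return Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
    }

    // Bytes needed when values are packed back to back into 64-bit blocks.
    static long packedBytes(int count, long maxValue) {
        long totalBits = (long) count * bitsRequired(maxValue);
        return ((totalBits + 63) / 64) * 8;
    }
}
```

For example, 1000 values with maxValue 1023 need 10 bits each, so the packed form takes 1256 bytes where a plain long[1000] takes 8000 bytes of payload.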
      

      I think we should simply autogen the code (we can start from the
      autogen code in LUCENE-1410), or, if there is a good existing impl
      that has a compatible license, that'd be great.

      I don't have time near-term to do this... so if anyone has the itch,
      please jump!

      Attachments

        1. LUCENE-1990_PerformanceMeasurements20100104.zip
          18 kB
          Toke Eskildsen
        2. LUCENE-1990.patch
          251 kB
          Michael McCandless
        3. LUCENE-1990-te20100122.patch
          281 kB
          Toke Eskildsen
        4. LUCENE-1990-te20100210.patch
          71 kB
          Toke Eskildsen
        5. LUCENE-1990-te20100212.patch
          99 kB
          Toke Eskildsen
        6. LUCENE-1990-te20100223.patch
          103 kB
          Toke Eskildsen
        7. LUCENE-1990-te20100226.patch
          130 kB
          Toke Eskildsen
        8. LUCENE-1990-te20100226b.patch
          134 kB
          Toke Eskildsen
        9. performance-te20100226.txt
          33 kB
          Toke Eskildsen
        10. LUCENE-1990-te20100226c.patch
          356 kB
          Toke Eskildsen
        11. generated_performance-te20100226.txt
          23 kB
          Toke Eskildsen
        12. perf-mkm-20100227.txt
          7 kB
          Michael McCandless
        13. performance-20100301.txt
          45 kB
          Toke Eskildsen
        14. LUCENE-1990-te20100301.patch
          364 kB
          Toke Eskildsen
        15. LUCENE-1990.patch
          73 kB
          Michael McCandless
        16. LUCENE-1990.patch
          4 kB
          Michael McCandless


            People

              Assignee: Unassigned
              Reporter: Michael McCandless (mikemccand)