Lucene - Core / LUCENE-6841

LZ4 compression using too much CPU time

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 5.3.1
    • Fix Version/s: None
    • Component/s: core/codecs
    • Labels: None
    • Environment: Linux, Java 8
    • Lucene Fields: New

    Description

      I am using Lucene for search indexing. The index stores a large number of small fields and some larger plain-text fields, and I search using both exact matches and analyzed queries.

      LZ4 (specifically the decompress method) is using almost exactly 50% of the application's CPU time.

      It seems to me that LZ4 is inappropriate for my use case. I note that I can choose BEST_SPEED or BEST_COMPRESSION, but both modes still compress stored fields.

      Would it be palatable to add a NO_COMPRESSION option, or some way to pick and choose which fields get compressed? Perhaps a minimum length of a field could be specified before it's compressed? I'm not sure if that's possible.

      If this approach, or something similar, is palatable, I would be happy to contribute a patch (or to consume and test a patch).
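      To make the minimum-length idea concrete, here is a minimal sketch of what I have in mind. It is not Lucene code and does not use any Lucene API: the class name ThresholdFieldCodec, the minLength parameter, and the one-byte RAW/COMPRESSED flag are all hypothetical, and java.util.zip DEFLATE stands in for LZ4 only because the JDK ships no LZ4 implementation. Values shorter than the threshold are stored as-is, so reading them back costs a copy rather than a decompression.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch: compress a stored value only when it is long enough.
// DEFLATE is a stand-in for LZ4; none of this is an existing Lucene API.
final class ThresholdFieldCodec {
    private static final byte RAW = 0;        // payload stored uncompressed
    private static final byte COMPRESSED = 1; // payload is DEFLATE data

    private final int minLength; // values shorter than this are never compressed

    ThresholdFieldCodec(int minLength) {
        this.minLength = minLength;
    }

    byte[] encode(byte[] value) {
        if (value.length < minLength) {
            return withFlag(RAW, value);
        }
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            deflater.setInput(value);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            byte[] packed = out.toByteArray();
            // Fall back to raw storage when compression does not save space.
            return packed.length < value.length
                    ? withFlag(COMPRESSED, packed)
                    : withFlag(RAW, value);
        } finally {
            deflater.end();
        }
    }

    byte[] decode(byte[] stored) {
        byte[] payload = Arrays.copyOfRange(stored, 1, stored.length);
        if (stored[0] == RAW) {
            return payload; // no decompression work for short values
        }
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(payload);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or corrupt payload");
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt payload", e);
        } finally {
            inflater.end();
        }
    }

    // Prefix the payload with a one-byte marker so decode knows which path to take.
    private static byte[] withFlag(byte flag, byte[] payload) {
        byte[] result = new byte[payload.length + 1];
        result[0] = flag;
        System.arraycopy(payload, 0, result, 1, payload.length);
        return result;
    }
}
```

      With a threshold of, say, 64 bytes, the many small fields would skip decompression entirely while the larger plain-text fields would still be compressed. Whether a per-value flag like this fits the existing stored fields format is something I can't judge; the sketch only illustrates the behaviour I am asking about.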

People

    • Assignee: Unassigned
    • Reporter: Karl von Randow (karlvr)
    • Votes: 1
    • Watchers: 6