• Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9, 3.1, 4.0-ALPHA
    • Component/s: modules/benchmark
    • Labels:
    • Lucene Fields:
      New, Patch Available


      Supporting bzip compression in the benchmark package would remove the need to extract bzip files (such as enwiki) before indexing them. The plan is to add a config parameter, bzip.compression=true/false, and have the relevant tasks either decompress the input file or compress the output file using the bzip streams.
      This will add a dependency on ant.jar, which contains two classes, similar to GZIPOutputStream and GZIPInputStream, that compress and decompress files using the bzip algorithm.
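
      The idea of switching on a config parameter can be sketched roughly as follows. This is only an illustration, not the patch itself: the class name CompressionConfigSketch and its helper methods are hypothetical, and the JDK's GZIPOutputStream and GZIPInputStream stand in for the bzip stream classes so that the example runs without any extra jar. A bzip implementation would presumably be wrapped at the same two points.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch: wraps raw streams according to the proposed
// bzip.compression parameter. GZIP streams are a stand-in for bzip streams.
public class CompressionConfigSketch {
    public static final String KEY = "bzip.compression";

    // True only when the config explicitly sets the parameter to "true".
    public static boolean enabled(Properties config) {
        return Boolean.parseBoolean(config.getProperty(KEY, "false"));
    }

    // Output side (e.g. a task that writes a line file).
    public static OutputStream wrapOutput(OutputStream raw, Properties config) throws IOException {
        return enabled(config) ? new GZIPOutputStream(raw) : raw;
    }

    // Input side (e.g. a doc maker that reads a line file).
    public static InputStream wrapInput(InputStream raw, Properties config) throws IOException {
        return enabled(config) ? new GZIPInputStream(raw) : raw;
    }

    // Writes text through the (possibly compressing) stream and returns the stored bytes.
    public static byte[] write(String text, Properties config) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(wrapOutput(sink, config), StandardCharsets.UTF_8)) {
            w.write(text);
        }
        return sink.toByteArray();
    }

    // Reads the stored bytes back through the (possibly decompressing) stream.
    public static String read(byte[] data, Properties config) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(
                wrapInput(new ByteArrayInputStream(data), config), StandardCharsets.UTF_8)) {
            char[] buf = new char[1024];
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Properties config = new Properties();
        config.setProperty(KEY, "true");
        String line = "title\tdate\tbody text\n";
        byte[] stored = write(line, config);
        // Round trip: the text read back should equal the text written.
        System.out.println(read(stored, config).equals(line));
    }
}
```

      Because the calling code only sees an InputStream or OutputStream, the tasks themselves would not need to know whether compression is on.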

      bzip is known to compress better than gzip (~20% better compression), although compression and decompression are a bit slower.

      I will post a patch that adds this parameter and implements it in LineDocMaker, EnwikiDocMaker and the WriteLineDoc task. The capability could perhaps also be added to DocMaker or one of the other superclasses, so that all subclasses inherit it.

      1. commons-compress-dev20090413.jar
        137 kB
        Uwe Schindler
      2. commons-compress-dev20090413.jar
        137 kB
        Uwe Schindler
      3. LUCENE-1591.patch
        2 kB
        Mark Miller
      4. LUCENE-1591.patch
        47 kB
        Shai Erera
      5. LUCENE-1591.patch
        47 kB
        Shai Erera
      6. LUCENE-1591.patch
        45 kB
        Shai Erera
      7. LUCENE-1591.patch
        35 kB
        Shai Erera
      8. LUCENE-1591.patch
        21 kB
        Shai Erera
      9. LUCENE-1591.patch
        20 kB
        Shai Erera
      10. LUCENE-1591.patch
        15 kB
        Shai Erera



          • Assignee:
            Mark Miller
            Shai Erera
          • Votes:
            0


            • Created: