Lucene - Core / LUCENE-675

Lucene benchmark: objective performance test for Lucene

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed

    Description

      We need an objective way to measure the performance of Lucene, both indexing and querying, on a known corpus. This issue is intended to collect comments and patches implementing a suite of such benchmarking tests.

      Regarding the corpus: one widely used and freely available corpus is the 20 Newsgroups collection, available from http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/news20.tar.gz or http://people.csail.mit.edu/u/j/jrennie/public_html/20Newsgroups/20news-18828.tar.gz. I propose using this corpus as the basis for the benchmarks. The benchmarking suite could retrieve it automatically from these known locations and cache it locally.
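
The retrieve-and-cache step proposed above could look roughly like the following. This is only a minimal sketch, not code from any patch attached to this issue: the class name `CorpusCache`, the method `fetch`, and its parameters are hypothetical, and it assumes the archive can be read with a plain `URL.openStream()` call. It tries each known location in order and skips the download entirely when a cached copy already exists.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

/** Hypothetical helper; not part of Lucene or of the patches on this issue. */
public class CorpusCache {

    /**
     * Returns a local copy of the corpus archive. If the file is already in
     * cacheDir it is returned as-is; otherwise each source is tried in order
     * until one can be downloaded.
     */
    public static Path fetch(List<URI> sources, Path cacheDir, String fileName)
            throws IOException {
        Files.createDirectories(cacheDir);
        Path target = cacheDir.resolve(fileName);
        if (Files.isRegularFile(target)) {
            return target; // cache hit: nothing is downloaded
        }
        IOException last = null;
        for (URI source : sources) {
            // Download into a temporary file first, so an interrupted
            // transfer is never mistaken for a complete cached archive.
            Path tmp = Files.createTempFile(cacheDir, fileName, ".part");
            try (InputStream in = source.toURL().openStream()) {
                Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
                Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
                return target;
            } catch (IOException e) {
                last = e; // remember the failure and try the next location
            } finally {
                Files.deleteIfExists(tmp); // no-op after a successful move
            }
        }
        throw new IOException("No corpus source could be retrieved", last);
    }
}
```

A benchmark run would call `CorpusCache.fetch` with the two URLs listed above, a cache directory of its choosing, and a file name such as `news20.tar.gz`. Only the first run would touch the network; later runs would reuse the cached archive, which keeps repeated measurements independent of download time.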

      Attachments

        1. tiny.properties (1.0 kB, Doron Cohen)
        2. tiny.alg (3 kB, Doron Cohen)
        3. timedata.zip (7 kB, Doron Cohen)
        4. taskBenchmark.zip (65 kB, Doron Cohen)
        5. LuceneBenchmark.java (31 kB, Andrzej Bialecki)
        6. extract_reuters.plx (4 kB, Marvin Humphrey)
        7. byTask.jre1.4.patch.txt (219 kB, Doron Cohen)
        8. byTask.2.patch.txt (218 kB, Doron Cohen)
        9. benchmark.patch (71 kB, Grant Ingersoll)
        10. benchmark.byTask.patch (334 kB, Doron Cohen)

          People

            Assignee: Grant Ingersoll (gsingers)
            Reporter: Andrzej Bialecki (ab)
            Votes: 3
            Watchers: 0
