LUCENE-675

Lucene benchmark: objective performance test for Lucene

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed

    Description

      We need an objective way to measure the performance of Lucene, both indexing and querying, on a known corpus. This issue is intended to collect comments and patches implementing a suite of such benchmarking tests.

      Regarding the corpus: one of the widely used and freely available corpora is the original Reuters collection, available from http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/news20.tar.gz or http://people.csail.mit.edu/u/j/jrennie/public_html/20Newsgroups/20news-18828.tar.gz. I propose to use this corpus as a base for benchmarks. The benchmarking suite could automatically retrieve it from known locations, and cache it locally.
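
      A minimal sketch of the retrieve-and-cache step proposed above, assuming Java 7 or later (java.nio.file). The class name CorpusCache and its methods fileNameOf and fetch are hypothetical, chosen for illustration; they are not taken from any of the attached patches.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: downloads a corpus archive once and reuses the local copy. */
final class CorpusCache {

    private CorpusCache() {}

    /** Derives the local file name from the last path segment of the source URI. */
    static String fileNameOf(URI source) {
        String path = source.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /**
     * Returns the cached copy of the archive, downloading it first
     * if it is not yet present in cacheDir.
     */
    static Path fetch(URI source, Path cacheDir) {
        Path target = cacheDir.resolve(fileNameOf(source));
        if (Files.isRegularFile(target)) {
            return target; // cache hit: no network access needed
        }
        try {
            Files.createDirectories(cacheDir);
            // Download to a temporary file and move it into place afterwards, so an
            // interrupted download never leaves a truncated archive under the final name.
            Path partial = Files.createTempFile(cacheDir, "download-", ".part");
            try (InputStream in = source.toURL().openStream()) {
                Files.copy(in, partial, StandardCopyOption.REPLACE_EXISTING);
            }
            return Files.move(partial, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

      A benchmark run could then call CorpusCache.fetch(URI.create(corpusUrl), cacheDir) with either of the URLs above. Only the first run would touch the network. Trying the second URL when the first fails is left out of this sketch.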

      Attachments

        1. benchmark.byTask.patch (334 kB, Doron Cohen)
        2. benchmark.patch (71 kB, Grant Ingersoll)
        3. byTask.2.patch.txt (218 kB, Doron Cohen)
        4. byTask.jre1.4.patch.txt (219 kB, Doron Cohen)
        5. extract_reuters.plx (4 kB, Marvin Humphrey)
        6. LuceneBenchmark.java (31 kB, Andrzej Bialecki)
        7. taskBenchmark.zip (65 kB, Doron Cohen)
        8. timedata.zip (7 kB, Doron Cohen)
        9. tiny.alg (3 kB, Doron Cohen)
        10. tiny.properties (1.0 kB, Doron Cohen)

          People

            Assignee: Grant Ingersoll
            Reporter: Andrzej Bialecki
            Votes: 3
            Watchers: 0
