Lucene - Core / LUCENE-2393

Utility to output total term frequency and df from a Lucene index

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.1, 4.0-ALPHA
    • Component/s: modules/other
    • Labels: None
    • Lucene Fields: New, Patch Available

      Description

      This is a pair of command-line utilities that report how often a term occurs in a Lucene index. The first takes a field name, a term, and an index directory, and outputs the term's document frequency (df) and its total number of occurrences in the index (i.e., the sum of the term's tf over all documents). The second reads the index to find the top N most frequent terms by document frequency, then outputs those terms along with the document frequency and total number of occurrences of each. Both utilities are useful for estimating the size of a term's entry in the *.prx files and the consequent disk I/O demands.
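To make the two statistics concrete, here is a minimal sketch of how document frequency and total term frequency relate, and how a top-N-by-df listing could be derived from them. It is an assumption-laden toy: `TermStatsSketch` and its methods are hypothetical names, the index is a plain in-memory map for a single field, and none of this is the code in the attached patches or the Lucene API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of one indexed field; NOT the Lucene API or the patch code.
class TermStatsSketch {
    // term -> (docId -> tf, the number of times the term occurs in that document)
    private final Map<String, Map<Integer, Integer>> postings = new HashMap<>();

    // Index one document by counting each whitespace-separated, lowercased token.
    public void addDocument(int docId, String text) {
        for (String token : text.toLowerCase().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // leading whitespace yields an empty first token
            }
            postings.computeIfAbsent(token, t -> new HashMap<>())
                    .merge(docId, 1, Integer::sum);
        }
    }

    // Document frequency: the number of documents that contain the term.
    public int docFreq(String term) {
        Map<Integer, Integer> perDoc = postings.get(term);
        return perDoc == null ? 0 : perDoc.size();
    }

    // Total term frequency: the sum of tf over every document containing the term.
    public long totalTermFreq(String term) {
        Map<Integer, Integer> perDoc = postings.get(term);
        if (perDoc == null) {
            return 0;
        }
        long total = 0;
        for (int tf : perDoc.values()) {
            total += tf;
        }
        return total;
    }

    // Top N terms ordered by descending df; ties are broken alphabetically.
    public List<String> topTermsByDocFreq(int n) {
        List<String> terms = new ArrayList<>(postings.keySet());
        terms.sort((a, b) -> {
            int byDf = Integer.compare(docFreq(b), docFreq(a));
            return byDf != 0 ? byDf : a.compareTo(b);
        });
        return new ArrayList<>(terms.subList(0, Math.min(n, terms.size())));
    }

    public static void main(String[] args) {
        TermStatsSketch index = new TermStatsSketch();
        index.addDocument(0, "the quick brown fox");
        index.addDocument(1, "the lazy dog and the fox");
        index.addDocument(2, "the the the end");

        // Report df and total occurrences for each of the top 2 terms by df.
        for (String term : index.topTermsByDocFreq(2)) {
            System.out.println(term + "\tdf=" + index.docFreq(term)
                    + "\ttotalTF=" + index.totalTermFreq(term));
        }
    }
}
```

In this toy data, "the" appears in all three documents (df = 3) but occurs six times in total, which illustrates why df alone can understate how much positional data a frequent term carries.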

      1. ASF.LICENSE.NOT.GRANTED--LUCENE-2393.patch
        12 kB
        Tom Burton-West
      2. ASF.LICENSE.NOT.GRANTED--LUCENE-2393.patch
        11 kB
        Tom Burton-West
      3. ASF.LICENSE.NOT.GRANTED--LUCENE-2393.patch
        4 kB
        Tom Burton-West
      4. LUCENE-2393.patch
        22 kB
        Tom Burton-West
      5. LUCENE-2393.patch
        17 kB
        Michael McCandless
      6. LUCENE-2393.patch
        21 kB
        Tom Burton-West
      7. LUCENE-2393.patch
        12 kB
        Tom Burton-West
      8. LUCENE-2393-3x.patch
        23 kB
        Michael McCandless
      9. LUCENE-2393-3xbranch.patch
        22 kB
        Tom Burton-West

          People

          • Assignee: Michael McCandless
          • Reporter: Tom Burton-West
          • Votes: 0
          • Watchers: 2
