Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Simple API to return number of unique terms (across all fields). Spinoff from here:

      http://www.lucidimagination.com/search/document/536b22e017be3e27/term_limit

      1. LUCENE-1586.patch
        3 kB
        Michael McCandless

        Activity

        Michael McCandless added a comment -

        Attached patch. I plan to commit in a day or two...

        Uwe Schindler added a comment -

        Hi Mike,
        why not just use getSequentialSubReaders() in the default implementation and recursively sum up all term counts? getSequentialSubReaders() is part of the IndexReader API, so it is also available in the abstract class. SegmentReader can override the method and return its "real" count.
        If getSequentialSubReaders() returns null, throw the UnsupportedOperationException (UOE).

        Uwe Schindler added a comment -

        Sorry,
        that cannot work: segments can share the same terms, so the sum of the per-segment counts is greater than the real unique term count whenever any term appears in more than one segment.
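
The overcounting problem described in the two comments above can be illustrated with a minimal sketch in plain Java. This is not Lucene code: each segment is modeled as a set of term strings, and the class and method names (UniqueTermCountSketch, sumOfPerSegmentCounts, uniqueAcrossSegments) are hypothetical, chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model: a "segment" is just the set of terms it contains.
public class UniqueTermCountSketch {

    // Naive approach: add up each segment's own unique term count.
    // A term present in several segments is counted once per segment.
    public static long sumOfPerSegmentCounts(List<Set<String>> segments) {
        long total = 0;
        for (Set<String> segment : segments) {
            total += segment.size();
        }
        return total;
    }

    // Exact count: merge the terms of all segments so that a shared
    // term contributes only once.
    public static long uniqueAcrossSegments(List<Set<String>> segments) {
        Set<String> merged = new HashSet<>();
        for (Set<String> segment : segments) {
            merged.addAll(segment);
        }
        return merged.size();
    }

    public static void main(String[] args) {
        List<Set<String>> segments = new ArrayList<>();
        segments.add(new HashSet<>(Arrays.asList("apache", "lucene", "index")));
        segments.add(new HashSet<>(Arrays.asList("lucene", "search")));

        // "lucene" is in both segments, so the naive sum overcounts it.
        System.out.println(sumOfPerSegmentCounts(segments)); // prints 5
        System.out.println(uniqueAcrossSegments(segments));  // prints 4
    }
}
```

The exact count requires merging the terms of every segment, which is far more expensive than reading a per-segment counter. That is presumably why the per-segment "real" count is cheap to expose while a combined count is not.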

        Michael McCandless added a comment -

        Thanks Derek!


          People

          • Assignee:
            Michael McCandless
            Reporter:
            Michael McCandless
          • Votes:
            0
            Watchers:
            0
