  Lucene - Core / LUCENE-6814

PatternTokenizer indefinitely holds heap equal to max field it has ever tokenized

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.4, 6.0
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

    Description

      Caught by Alex Chow in this Elasticsearch issue: https://github.com/elastic/elasticsearch/issues/13721

      Today, PatternTokenizer reuses a single StringBuilder, but it never frees that builder's heap after tokenizing is done, so the buffer stays as large as the largest field it has ever tokenized. We can either stop reusing it, or call .trimToSize on it when we are done ...
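
      The retention described above can be reproduced with a plain StringBuilder. The following is a minimal sketch, not the attached patch and not Lucene's actual code: the class BuilderHeapSketch and its methods are hypothetical names chosen for illustration, and the exact capacity left after trimToSize depends on the JVM, since the Java documentation only says the buffer may be resized.

```java
// Sketch only: illustrates StringBuilder capacity retention, not Lucene's actual code.
class BuilderHeapSketch {

    // Hypothetical stand-in for buffering one field's text into a shared builder.
    static void bufferField(StringBuilder sb, int fieldLength) {
        sb.setLength(0); // clears the content but keeps the backing array
        for (int i = 0; i < fieldLength; i++) {
            sb.append('x');
        }
    }

    // Capacity left behind when the builder is reused without trimming:
    // one large field followed by a small one.
    static int capacityAfterReuse(int largestField) {
        StringBuilder sb = new StringBuilder();
        bufferField(sb, largestField);
        bufferField(sb, 10);
        return sb.capacity();
    }

    // Capacity after clearing the builder and asking it to release unused storage.
    static int capacityAfterTrim(int largestField) {
        StringBuilder sb = new StringBuilder();
        bufferField(sb, largestField);
        sb.setLength(0);
        sb.trimToSize();
        return sb.capacity();
    }

    public static void main(String[] args) {
        int largest = 1_000_000;
        System.out.println("reused, untrimmed capacity: " + capacityAfterReuse(largest));
        System.out.println("capacity after trimToSize:  " + capacityAfterTrim(largest));
    }
}
```

      In this sketch the untrimmed builder keeps a backing array sized for the largest field even while holding only ten characters, which is the behavior the issue title describes. Dropping the reuse entirely (allocating a fresh StringBuilder per field) is the other option the description mentions; it trades the retained heap for extra allocation.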

      Attachments

        1. LUCENE-6814.patch
          1 kB
          Michael McCandless
        2. LUCENE-6814.patch
          3 kB
          Michael McCandless

          People

            Assignee: Michael McCandless (mikemccand)
            Reporter: Michael McCandless (mikemccand)
            Votes: 0
            Watchers: 3
