  1. Lucene - Core
  2. LUCENE-10013

Document contains at least one immense term in field (whose UTF8 encoding is longer than the max length 32766)

Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: 4.10.4
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

    Description

      Hi Team,

      We are currently using Lucene 4.10.4 and are getting the following error:

      "Document contains at least one immense term in field (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[-41, -103, -41, -87, -41, -103, -41, -111, -41, -108, 32, 56, 56, 45, -41, -108, -41, -111, -41, -107, -41, -89, -41, -88, 44, 32, 40, 32, 49, 51]...', original message: bytes can be at most 32766 in length; got 35169".

      We understand from the Lucene JIRA ticket LUCENE-5472 ("Long terms should generate a RuntimeException, not just infoStream") that this issue was resolved in 4.8 and 6.0.

      Please confirm whether this fix is included in 4.10.4.
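
For anyone hitting the same message, here is a minimal sketch of how the numbers in the error can be checked outside Lucene. It uses only the JDK. The class and method names (`ImmenseTermCheck`, `utf8Length`, `isImmense`, `decodePrefix`) are hypothetical helpers, not Lucene API, and the 32766 limit is simply the value quoted in the error text. The sketch decodes the reported byte prefix back into text and measures the UTF-8 length of a value before it is handed to the indexer. Because non-ASCII characters take more than one byte each in UTF-8, a value can exceed the limit even when its character count is below 32766.

```java
import java.nio.charset.StandardCharsets;

final class ImmenseTermCheck {
    // Assumption: the per-term byte limit quoted in the error message.
    static final int MAX_TERM_BYTES = 32766;

    // Number of bytes the value occupies once encoded as UTF-8.
    static int utf8Length(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the value, indexed as one single term, would exceed the limit.
    static boolean isImmense(String value) {
        return utf8Length(value) > MAX_TERM_BYTES;
    }

    // Turns the signed byte prefix printed in the error back into text.
    static String decodePrefix(byte[] prefix) {
        return new String(prefix, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Prefix copied from the error message in this issue.
        byte[] prefix = {
            -41, -103, -41, -87, -41, -103, -41, -111, -41, -108, 32, 56, 56, 45,
            -41, -108, -41, -111, -41, -107, -41, -89, -41, -88, 44, 32, 40, 32, 49, 51
        };
        System.out.println("Term starts with: " + decodePrefix(prefix));

        // Hypothetical field value; replace with the real content being indexed.
        String fieldValue = "example value";
        System.out.println("UTF-8 bytes: " + utf8Length(fieldValue)
                + ", too long for one term: " + isImmense(fieldValue));
    }
}
```

A check like `isImmense` could run before a document is added, so that oversized values are truncated, split, or routed to a tokenized field. Which of those is appropriate depends on how the field is searched.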

          People

            Assignee: Unassigned
            Reporter: Arvind Kumar Sahu (arvindkr)
            Votes: 0
            Watchers: 2