Tika / TIKA-2812

NPE when parsing text with write limit set on IBM JDK



    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.20
    • Fix Version/s: None
    • Component/s: core
    • Labels:
    • Environment:

      IBM JDK 8


      We updated Tika from version 1.14 to the recently released 1.20 and now get a NullPointerException when parsing text with a write limit set (we use WriteOutContentHandler) on IBM JDK 8.

      Test class TikaTest.java and test file test.txt are attached.

      The issue occurs on IBM JDK 8 (output-ibm-jdk-tika-1.20.txt), but not on Oracle JDK 8 (output-oracle-jdk-tika-1.20.txt) or OpenJDK 8 (output-open-jdk-tika-1.20.txt).

      With Tika 1.14 we did not have this issue (output-ibm-jdk-tika-1.14.txt).

      The fix for TIKA-2668 (https://github.com/apache/tika/commit/89a588e4d8d2aa44a9d3c965d514c18c7d3c134d#diff-5a28529cf32968d35a5036172cd8f74fL41) removed the following line from the constructor of the TaggedSAXException class:

      initCause(original); // SAXException has it's own chaining mechanism!

      Restoring that line solves our issue on IBM JDK 8, but breaks things on JDK 11 (output-oracle-jdk-11-tika-1.20.txt).

      Is there any chance that the TaggedSAXException class can be made compatible with both JDK 8 and JDK 11 (Oracle/OpenJDK as well as IBM)?
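
      One possible shape for such a constructor is sketched below. It is only an illustration, not the actual Tika code: TaggedLikeException and linkCause are hypothetical names, and the sketch assumes that the runtimes differ only in whether SAXException already exposes the wrapped exception through getCause(). The helper calls initCause() only when the cause is not yet visible, and tolerates the IllegalStateException that Throwable.initCause() throws when the cause has already been set.

```java
import org.xml.sax.SAXException;

public class CauseCompatSketch {

    /** Hypothetical stand-in for TaggedSAXException; not the real Tika class. */
    static class TaggedLikeException extends SAXException {
        TaggedLikeException(SAXException original) {
            super(original.getMessage(), original);
            // Link the cause only if the superclass has not already done so.
            linkCause(this, original);
        }
    }

    /** Hypothetical helper: sets the cause defensively, never overwriting it. */
    static void linkCause(Throwable wrapper, Throwable original) {
        if (wrapper.getCause() == original) {
            return; // cause already visible through getCause(); nothing to do
        }
        try {
            wrapper.initCause(original);
        } catch (IllegalStateException e) {
            // The runtime already initialized the cause; leave it untouched.
        }
    }

    public static void main(String[] args) {
        SAXException original = new SAXException("write limit reached");
        Throwable wrapped = new TaggedLikeException(original);
        // Code that walks the getCause() chain should now find the original.
        System.out.println(wrapped.getCause() == original);
    }
}
```

      Whether this would be sufficient on IBM JDK 8 has not been verified here; it would need to be checked against the attached TikaTest.java.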

      Thank you in advance!

      Kind regards
      Sergiy Shyrkov


        1. output-ibm-jdk-tika-1.20.txt
          0.9 kB
          Sergiy Shyrkov
        2. output-ibm-jdk-tika-1.14.txt
          0.3 kB
          Sergiy Shyrkov
        3. test.txt
          0.0 kB
          Sergiy Shyrkov
        4. output-oracle-jdk-tika-1.20.txt
          0.3 kB
          Sergiy Shyrkov
        5. output-open-jdk-tika-1.20.txt
          0.3 kB
          Sergiy Shyrkov
        6. TikaTest.java
          2 kB
          Sergiy Shyrkov
        7. output-oracle-jdk-11-tika-1.20.txt
          4 kB
          Sergiy Shyrkov



            • Assignee:
              Tim Allison
            • Reporter:
              Sergiy Shyrkov