Uploaded image for project: 'Tika'
  1. Tika
  2. TIKA-621

RTF parsing fails with Java 7 early access on 64bit platforms

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 0.8, 0.9
    • 0.10
    • parser
    • None
    • $ java -version
      java version "1.7.0-ea"
      Java(TM) SE Runtime Environment (build 1.7.0-ea-b134)
      Java HotSpot(TM) 64-Bit Server VM (build 21.0-b04, mixed mode)

      (Seen using this version of Java on both Windows 2008 and CentOS 5)

    Description

      I've run across an RTF documents which tika is failing to convert on 64bit platforms (Windows and Linux) using the Java 7 early access version. The same document is successfully converted on 32bit Windows and Linux, and using Java 6.

      java -jar tika-app-0.9.jar -t full.rtf 
      Exception in thread "main" org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.rtf.RTFParser@1fa78298
      	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:199)
      	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
      	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
      	at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:107)
      	at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:302)
      	at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:91)
      Caused by: java.lang.NullPointerException
      	at javax.swing.text.GapContent.compare(Unknown Source)
      	at javax.swing.text.GapContent.findSortIndex(Unknown Source)
      	at javax.swing.text.GapContent.createPosition(Unknown Source)
      	at javax.swing.text.AbstractDocument.createPosition(Unknown Source)
      	at javax.swing.text.AbstractDocument$LeafElement.<init>(Unknown Source)
      	at javax.swing.text.AbstractDocument.createLeafElement(Unknown Source)
      	at javax.swing.text.DefaultStyledDocument$ElementBuffer.insertElement(Unknown Source)
      	at javax.swing.text.DefaultStyledDocument$ElementBuffer.insertUpdate(Unknown Source)
      	at javax.swing.text.DefaultStyledDocument$ElementBuffer.insert(Unknown Source)
      	at javax.swing.text.DefaultStyledDocument.insertUpdate(Unknown Source)
      	at javax.swing.text.AbstractDocument.handleInsertString(Unknown Source)
      	at javax.swing.text.AbstractDocument.insertString(Unknown Source)
      	at org.apache.tika.parser.rtf.RTFParser$CustomStyledDocument.insertString(RTFParser.java:376)
      	at javax.swing.text.rtf.RTFReader$DocumentDestination.deliverText(Unknown Source)
      	at javax.swing.text.rtf.RTFReader$TextHandlingDestination.handleText(Unknown Source)
      	at javax.swing.text.rtf.RTFReader.handleText(Unknown Source)
      	at javax.swing.text.rtf.RTFParser.write(Unknown Source)
      	at javax.swing.text.rtf.AbstractFilter.write(Unknown Source)
      	at javax.swing.text.rtf.AbstractFilter.readFromStream(Unknown Source)
      	at javax.swing.text.rtf.RTFEditorKit.read(Unknown Source)
      	at org.apache.tika.parser.rtf.RTFParser.parse(RTFParser.java:112)
      	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
      	... 5 more
      

      The document in question is available at http://public.funnelback.com/full.rtf

      Attachments

        Activity

          People

            jukkaz Jukka Zitting
            mattsheppard Matt Sheppard
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: