Tika / TIKA-2128

IllegalArgumentException on a valid Word file

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.13
    • Fix Version/s: None
    • Component/s: parser
    • Labels: None
    • Environment: Windows 7 x64, JVM 1.8.0_101

    Description

      The following valid Word file:

      https://dl.dropboxusercontent.com/u/92341073/VTEU_ICPD_Bacteremia_Concept_Submitted_23Jul08.doc

      when parsed by Tika, throws the following error:

      java.lang.IllegalArgumentException: This paragraph is not the first one in the table
      at org.apache.poi.hwpf.usermodel.Range.getTable(Range.java:925)
      at org.apache.tika.parser.microsoft.WordExtractor.handleParagraph(WordExtractor.java:241)
      at org.apache.tika.parser.microsoft.WordExtractor.handleHeaderFooter(WordExtractor.java:227)
      at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:162)
      at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:146)
      at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:117)
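
The exception in the trace is an unchecked IllegalArgumentException, so a caller that only handles the checked exceptions declared by the parse call will not catch it. A minimal sketch of a caller-side guard follows, assuming the caller would rather skip a failing file than abort. The names `SafeExtract` and `tryExtract` are hypothetical and not part of Tika or POI; the lambdas stand in for the real parse call.

```java
import java.util.Optional;
import java.util.concurrent.Callable;

public class SafeExtract {

    /**
     * Runs one extraction step (hypothetical stand-in for a parse call).
     * Returns the extracted text, or an empty Optional if the step throws
     * any exception, checked or unchecked.
     */
    public static Optional<String> tryExtract(Callable<String> extraction) {
        try {
            return Optional.ofNullable(extraction.call());
        } catch (Exception e) {
            // Log and continue so that one bad document does not stop a batch.
            System.err.println("Extraction failed: " + e);
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // A step that succeeds.
        Optional<String> ok = tryExtract(() -> "body text");

        // A step that fails the same way as the trace above (simulated).
        Optional<String> failed = tryExtract(() -> {
            throw new IllegalArgumentException(
                    "This paragraph is not the first one in the table");
        });

        System.out.println("ok present: " + ok.isPresent());
        System.out.println("failed present: " + failed.isPresent());
    }
}
```

This only hides the symptom on the caller's side; the text of the affected document is still lost until the parser itself handles the table paragraph.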

            People

              Assignee: Unassigned
              Reporter: Seva Alekseyev (sevaa)
