TIKA-707: IllegalArgumentException Parsing MS Word 97 - 2003

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10
    • Fix Version/s: 0.10
    • Component/s: parser
    • Labels: None

    Description

      Parsing the following MS Word 97 - 2003 document fails with an IllegalArgumentException:

      http://www.ac-nancy-metz.fr/enseign/physique/nouvcoll/4-matiere/Exemple%20s%C3%A9ance%20TIC%20et%20Prisme.doc

      Caused by: java.lang.IllegalArgumentException: charStart (3102) > charEnd (3091)
      at org.apache.poi.hwpf.model.BytePropertyNode.<init>(BytePropertyNode.java:61)
      at org.apache.poi.hwpf.model.CHPX.<init>(CHPX.java:53)
      at org.apache.poi.hwpf.model.CHPFormattedDiskPage.<init>(CHPFormattedDiskPage.java:91)
      at org.apache.poi.hwpf.model.CHPBinTable.<init>(CHPBinTable.java:101)
      at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:280)
      at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:67)
      at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:196)
      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
      ... 41 more
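
      The message in the trace indicates that a character range was rejected because its start offset exceeded its end offset. The sketch below is not POI's source. It is a minimal, hypothetical illustration (the class name CharRangeSketch is an assumption) of a constructor-time range check that would produce a message of the same form, using the offsets reported above.

```java
// Hypothetical sketch: NOT the actual org.apache.poi.hwpf.model.BytePropertyNode code.
// It only illustrates a range invariant that yields a message like the one in the trace.
final class CharRangeSketch {
    final int charStart;
    final int charEnd;

    CharRangeSketch(int charStart, int charEnd) {
        // Reject inverted ranges up front, before any object state is set.
        if (charStart > charEnd) {
            throw new IllegalArgumentException(
                "charStart (" + charStart + ") > charEnd (" + charEnd + ")");
        }
        this.charStart = charStart;
        this.charEnd = charEnd;
    }

    // Number of characters covered by the range.
    int length() {
        return charEnd - charStart;
    }

    public static void main(String[] args) {
        try {
            // Offsets taken from the stack trace in this report.
            new CharRangeSketch(3102, 3091);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

      Because a check of this kind runs inside a constructor, a single inverted range in the file is enough to abort construction of the whole document model, which would be consistent with the exception surfacing through HWPFDocument and then through Tika's WordExtractor.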

          People

            Assignee: Unassigned
            Reporter: Pablo Queixalos (pqueixalos)
