Tika
  1. Tika
  2. TIKA-586

Parsing a ms access file (*.mdb) throws an error

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 0.8
    • Fix Version/s: 0.9
    • Component/s: mime
    • Labels:
      None

      Description

      I know that parsing a ms access file (*.mdb) is not supported since there is not parser for it, but I think it should not throw an exception.
      Currently when parsing a mdb file it is being recognized as a true font file. The TrueTypeParser throws an parser specific error when encountering a mdb file.

      Stacktrace:
      Exception in thread "main" org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.font.TrueTypeParser@6906daba
      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:203)
      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
      at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
      at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:94)
      at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:273)
      at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:80)
      Caused by: java.io.IOException: Unexpected end of TTF stream reached
      at org.apache.fontbox.ttf.TTFDataStream.read(TTFDataStream.java:217)
      at org.apache.fontbox.ttf.TTFDataStream.readString(TTFDataStream.java:69)
      at org.apache.fontbox.ttf.TTFDataStream.readString(TTFDataStream.java:57)
      at org.apache.fontbox.ttf.AbstractTTFParser.readTableDirectory(AbstractTTFParser.java:214)
      at org.apache.fontbox.ttf.AbstractTTFParser.parseTTF(AbstractTTFParser.java:85)
      at org.apache.fontbox.ttf.TTFParser.parseTTF(TTFParser.java:26)
      at org.apache.fontbox.ttf.AbstractTTFParser.parseTTF(AbstractTTFParser.java:66)
      at org.apache.fontbox.ttf.TTFParser.parseTTF(TTFParser.java:26)
      at org.apache.tika.parser.font.TrueTypeParser.parse(TrueTypeParser.java:63)
      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
      ... 5 more

      1. accessdb.mdb
        108 kB
        Martijn van Groningen
      2. TIKA-586.patch
        1 kB
        Martijn van Groningen
      3. true-font.ttf
        96 kB
        Martijn van Groningen

        Issue Links

          Activity

          No work has yet been logged on this issue.

            People

            • Assignee:
              Unassigned
              Reporter:
              Martijn van Groningen
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development