Tika / TIKA-1184

Infinite halt on parsing old files (e.g. mp3, ms-dos drivers, ...)

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: None
    • Component/s: cli, parser
    • Labels: None

    Description

      Tika hangs when identifying several types of files. The following example is an MP3 file with corrupt metadata. Other file types with the same problem include MS-DOS device drivers (*.sys).
      I am not a Java programmer, but my guess is that Tika tries to seek() within a file to a target position that is greater than the file size.
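
The guess above can be illustrated with a small sketch. It is not Tika's code and does not claim to be the cause of this issue; `SafeSkip` and `skipFully` are hypothetical names. It shows how a skip loop can spin forever when a corrupt header asks for more bytes than the file holds, because `InputStream.skip()` may return 0 at end of stream, and how a guarded loop fails fast instead.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper, not part of Tika: skip forward without spinning at end of file. */
final class SafeSkip {

    /**
     * Skips exactly n bytes or throws EOFException.
     *
     * A naive loop such as
     *     while (remaining > 0) remaining -= in.skip(remaining);
     * never terminates once skip() keeps returning 0 at end of stream.
     */
    static void skipFully(InputStream in, long n) throws IOException {
        long remaining = n;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
            } else if (in.read() == -1) {
                // skip() made no progress and read() confirms end of stream
                throw new EOFException(
                        "Stream ended with " + remaining + " bytes left to skip");
            } else {
                // read() consumed one byte, so count it as progress
                remaining--;
            }
        }
    }
}
```

Calling `SafeSkip.skipFully(stream, 1_000_000L)` on a 16-byte stream throws `EOFException` instead of looping.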

      > java -jar tika-app-1.4.jar -m /u01/fk/xd/2/c/16866bc96e6a316d8cbdbd7ca2ce1e
      [hangs forever without error message]

      ffmpeg gives some warnings about duration errors...
      > ffmpeg -i /u01/fk/xd/2/c/16866bc96e6a316d8cbdbd7ca2ce1e
      [mp3 @ 0x633240] max_analyze_duration 5000000 reached at 5015510
      [mp3 @ 0x633240] Estimating duration from bitrate, this may be inaccurate
      Input #0, mp3, from '/u01/fk/xd/2/c/16866bc96e6a316d8cbdbd7ca2ce1e':
      Metadata:
      artist :
      album :
      Duration: 00:15:29.10, start: 0.000000, bitrate: 192 kb/s
      Stream #0:0: Audio: mp3, 44100 Hz, stereo, s16, 192 kb/s
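
Until a hang like the one reported is fixed, a caller embedding a parser can bound the time it waits. The following is a minimal sketch using only `java.util.concurrent`; `BoundedTask` and `runWithTimeout` are hypothetical names, and the task passed in stands for whatever parse call the application makes. Note that cancellation only interrupts the worker thread: a loop that never checks for interruption keeps running, which is why the worker is a daemon thread that cannot keep the JVM alive.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: run a possibly hanging task with an upper time bound. */
final class BoundedTask {

    static <T> T runWithTimeout(Callable<T> task, long timeoutMillis)
            throws TimeoutException, ExecutionException, InterruptedException {
        // Daemon thread: a task stuck in a busy loop cannot block JVM exit.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-task");
            t.setDaemon(true);
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Best effort: interrupts the worker, but cannot stop a loop
            // that ignores interruption.
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A caller would wrap its parse call in a `Callable`, pass a timeout such as 30 seconds, and treat `TimeoutException` as "file could not be analyzed".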

      Attachments

        1. ansi.sys
          9 kB
          Giuseppe Totaro

          People

            Assignee: Unassigned
            Reporter: Jürgen Enge
            Votes: 0
            Watchers: 5
