Tika / TIKA-3511

Metadata xmpDM:duration returning a negative value for a long MP4 file

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.25
    • Fix Version/s: None
    • Component/s: parser
    • Labels: None

    Description

      For a large MP4 file (~10 GB) with a duration of a little over 10 hours, the metadata returned from autoDetectParser.parse() had a negative value for xmpDM:duration. Since the duration is in seconds, I wondered whether a Short was being used instead of an Integer somewhere in the process and was overflowing. Just a theory.
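      The overflow theory is easy to sanity-check with plain arithmetic. The sketch below is not Tika's code; the class and method names are hypothetical, and the 90000 ticks-per-second timescale is an assumed example value. It only shows that 36000 seconds (10 hours) does not fit in a signed 16-bit short, and that the same duration expressed in MP4 timescale ticks may not fit in a signed 32-bit int either, so a narrowing conversion at either point would produce a negative number.

```java
public class DurationOverflowSketch {

    // Narrow a seconds count to a signed 16-bit short (the theory in this report).
    public static short secondsAsShort(long seconds) {
        return (short) seconds;
    }

    // Narrow a duration in timescale ticks to a signed 32-bit int.
    // The timescale is an assumption; real files declare their own value.
    public static int ticksAsInt(long seconds, long timescale) {
        return (int) (seconds * timescale);
    }

    // Keep the tick count in a long and divide as a double: no wrap-around.
    public static double ticksToSeconds(long ticks, long timescale) {
        return (double) ticks / timescale;
    }

    public static void main(String[] args) {
        long tenHours = 10L * 60 * 60;  // 36000 seconds
        long timescale = 90_000L;       // assumed ticks per second

        System.out.println(secondsAsShort(tenHours));                        // -29536
        System.out.println(ticksAsInt(tenHours, timescale));                 // -1054967296
        System.out.println(ticksToSeconds(tenHours * timescale, timescale)); // 36000.0
    }
}
```

      Either narrowing step would turn a duration of this length negative, so the sign alone does not say which one (if any) is responsible; comparing the reported value against these wrapped values for the file's real timescale might narrow it down.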

       

      I'll paste our TikaConfig below in case it's relevant, but it doesn't look like there's much customization in it.

       

      <?xml version="1.0" encoding="UTF-8"?>
      <properties>
        <service-loader dynamic="false" loadErrorHandler="IGNORE" initializableProblemHandler="IGNORE"/>
        <encodingDetectors>
          <encodingDetector class="org.apache.tika.parser.html.HtmlEncodingDetector"/>
          <encodingDetector class="org.apache.tika.parser.txt.UniversalEncodingDetector"/>
          <encodingDetector class="org.apache.tika.parser.txt.Icu4jEncodingDetector"/>
        </encodingDetectors>
        <detectors>
          <detector class="org.apache.tika.detect.OverrideDetector"/>
          <detector class="org.apache.tika.parser.microsoft.POIFSContainerDetector"/>
          <detector class="org.apache.tika.parser.pkg.ZipContainerDetector"/>
          <detector class="org.gagravarr.tika.OggDetector"/>
          <detector class="org.apache.tika.mime.MimeTypes"/>
        </detectors>
        <parsers>
          <parser class="org.apache.tika.parser.apple.AppleSingleFileParser"/>
          <parser class="org.apache.tika.parser.asm.ClassParser"/>
          <parser class="org.apache.tika.parser.audio.AudioParser"/>
          <parser class="org.apache.tika.parser.audio.MidiParser"/>
          <parser class="org.apache.tika.parser.chm.ChmParser"/>
          <parser class="org.apache.tika.parser.code.SourceCodeParser"/>
          <parser class="org.apache.tika.parser.crypto.Pkcs7Parser"/>
          <parser class="org.apache.tika.parser.crypto.TSDParser"/>
          <parser class="org.apache.tika.parser.csv.TextAndCSVParser"/>
          <parser class="org.apache.tika.parser.dbf.DBFParser"/>
          <parser class="org.apache.tika.parser.dif.DIFParser"/>
          <parser class="org.apache.tika.parser.dwg.DWGParser"/>
          <parser class="org.apache.tika.parser.epub.EpubParser"/>
          <parser class="org.apache.tika.parser.executable.ExecutableParser"/>
          <parser class="org.apache.tika.parser.feed.FeedParser"/>
          <parser class="org.apache.tika.parser.font.AdobeFontMetricParser"/>
          <parser class="org.apache.tika.parser.font.TrueTypeParser"/>
          <parser class="org.apache.tika.parser.gdal.GDALParser"/>
          <parser class="org.apache.tika.parser.geoinfo.GeographicInformationParser"/>
          <parser class="org.apache.tika.parser.grib.GribParser"/>
          <parser class="org.apache.tika.parser.hdf.HDFParser"/>
          <parser class="org.apache.tika.parser.html.HtmlParser"/>
          <parser class="org.apache.tika.parser.hwp.HwpV5Parser"/>
          <parser class="org.apache.tika.parser.image.BPGParser"/>
          <parser class="org.apache.tika.parser.image.ICNSParser"/>
          <parser class="org.apache.tika.parser.image.ImageParser"/>
          <parser class="org.apache.tika.parser.image.PSDParser"/>
          <parser class="org.apache.tika.parser.image.TiffParser"/>
          <parser class="org.apache.tika.parser.image.WebPParser"/>
          <parser class="org.apache.tika.parser.iptc.IptcAnpaParser"/>
          <parser class="org.apache.tika.parser.isatab.ISArchiveParser"/>
          <parser class="org.apache.tika.parser.iwork.IWorkPackageParser"/>
          <parser class="org.apache.tika.parser.jdbc.SQLite3Parser"/>
          <parser class="org.apache.tika.parser.jpeg.JpegParser"/>
          <parser class="org.apache.tika.parser.mail.RFC822Parser"/>
          <parser class="org.apache.tika.parser.mat.MatParser"/>
          <parser class="org.apache.tika.parser.mbox.MboxParser"/>
          <parser class="org.apache.tika.parser.mbox.OutlookPSTParser"/>
          <parser class="org.apache.tika.parser.microsoft.EMFParser"/>
          <parser class="org.apache.tika.parser.microsoft.JackcessParser"/>
          <parser class="org.apache.tika.parser.microsoft.MSOwnerFileParser"/>
          <parser class="org.apache.tika.parser.microsoft.OfficeParser">
            <params>
              <param name="byteArrayMaxOverride" type="int">20000000</param>
            </params>
          </parser>
          <parser class="org.apache.tika.parser.microsoft.OldExcelParser"/>
          <parser class="org.apache.tika.parser.microsoft.TNEFParser"/>
          <parser class="org.apache.tika.parser.microsoft.WMFParser"/>
          <parser class="org.apache.tika.parser.microsoft.ooxml.OOXMLParser"/>
          <parser class="org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006.Word2006MLParser"/>
          <parser class="org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser"/>
          <parser class="org.apache.tika.parser.microsoft.xml.WordMLParser"/>
          <parser class="org.apache.tika.parser.mp3.Mp3Parser"/>
          <parser class="org.apache.tika.parser.mp4.MP4Parser"/>
          <parser class="org.apache.tika.parser.netcdf.NetCDFParser"/>
          <parser class="org.apache.tika.parser.odf.OpenDocumentParser"/>
          <parser class="org.apache.tika.parser.pdf.PDFParser"/>
          <parser class="org.apache.tika.parser.pkg.CompressorParser"/>
          <parser class="org.apache.tika.parser.pkg.PackageParser"/>
          <parser class="org.apache.tika.parser.pkg.RarParser"/>
          <parser class="org.apache.tika.parser.rtf.RTFParser"/>
          <parser class="org.apache.tika.parser.sas.SAS7BDATParser"/>
          <parser class="org.apache.tika.parser.video.FLVParser"/>
          <parser class="org.apache.tika.parser.wordperfect.QuattroProParser"/>
          <parser class="org.apache.tika.parser.wordperfect.WordPerfectParser"/>
          <parser class="org.apache.tika.parser.xliff.XLIFF12Parser"/>
          <parser class="org.apache.tika.parser.xliff.XLZParser"/>
          <parser class="org.apache.tika.parser.xml.DcXMLParser"/>
          <parser class="org.apache.tika.parser.xml.FictionBookParser"/>
          <parser class="org.gagravarr.tika.FlacParser"/>
          <parser class="org.gagravarr.tika.OggParser"/>
          <parser class="org.gagravarr.tika.OpusParser"/>
          <parser class="org.gagravarr.tika.SpeexParser"/>
          <parser class="org.gagravarr.tika.TheoraParser"/>
          <parser class="org.gagravarr.tika.VorbisParser"/>
        </parsers>
      </properties>

          People

            Assignee: Unassigned
            Reporter: Caleb Postlethwait (cpostlethwait)
            Votes: 0
            Watchers: 2