Tika / TIKA-3195

Inconsistent result of tika.detect(InputStream) and tika.detect(TikaInputStream)

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.24.1
    • Fix Version/s: None
    • Component/s: detector
    • Labels: None

    Description

      When detecting the MIME type of an Ogg video (samples are available at https://filesamples.com/formats/ogv), we noticed that Tika returns different results depending on the type of input passed to detect():

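      // Imports needed by the snippets below
      import java.io.InputStream;
      import java.nio.file.Path;
      import java.nio.file.Paths;
      import org.apache.tika.Tika;
      import org.apache.tika.io.TikaInputStream;

      Tika tika = new Tika();

      // Case 1: resource stream wrapped in a TikaInputStream
      try (InputStream inputStream = Main.class.getResourceAsStream("/sample_1280x720.ogv");
           TikaInputStream tikaInputStream = TikaInputStream.get(inputStream)) {
          String mimeType1 = tika.detect(tikaInputStream);
          System.out.println(mimeType1);
      }
      // output: video/theora

      // Case 2: the same resource passed as a Path
      Path path = Paths.get(Main.class.getResource("/sample_1280x720.ogv").toURI());
      String mimeType2 = tika.detect(path);
      System.out.println(mimeType2);
      // output: video/theora

      // Case 3: plain InputStream, not wrapped
      try (InputStream inputStream = Main.class.getResourceAsStream("/sample_1280x720.ogv")) {
          String mimeType3 = tika.detect(inputStream);
          System.out.println(mimeType3);
      }
      // output: application/ogg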

      The call that takes a plain InputStream returns a different result (application/ogg) from the other two (video/theora).

      Is this the expected behavior?
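
One plausible reading of the outputs above, which this report does not confirm, is that the plain InputStream path only matches the leading magic bytes of the Ogg container, while the other two paths also look at the first packet inside it. The sketch below illustrates that difference using only the JDK. It is not Tika's implementation: the class OggSniffSketch and all of its methods are hypothetical names, and the page it inspects is synthetic (its CRC field is left at zero and never validated).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names exist in Tika.
final class OggSniffSketch {
    private static final byte[] OGG_MAGIC = "OggS".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] THEORA = "theora".getBytes(StandardCharsets.US_ASCII);
    // Fixed part of an Ogg page header; the segment count is its last byte (offset 26).
    private static final int PAGE_HEADER_LENGTH = 27;

    // True if 'expected' occurs in 'data' starting at 'offset'.
    static boolean matchesAt(byte[] data, int offset, byte[] expected) {
        if (offset < 0 || data.length - offset < expected.length) {
            return false;
        }
        for (int i = 0; i < expected.length; i++) {
            if (data[offset + i] != expected[i]) {
                return false;
            }
        }
        return true;
    }

    // Magic-only check: sees the container, not the codec inside it.
    static String detectByMagic(byte[] head) {
        return matchesAt(head, 0, OGG_MAGIC) ? "application/ogg" : "application/octet-stream";
    }

    // Container-aware check: also inspects the first packet of the first page.
    static String detectByFirstPacket(byte[] head) {
        if (!matchesAt(head, 0, OGG_MAGIC)) {
            return "application/octet-stream";
        }
        if (head.length < PAGE_HEADER_LENGTH) {
            return "application/ogg";
        }
        int segmentCount = head[26] & 0xFF;
        int packetStart = PAGE_HEADER_LENGTH + segmentCount; // skip the segment table
        // A Theora identification header starts with 0x80 followed by "theora".
        boolean theora = packetStart < head.length
                && (head[packetStart] & 0xFF) == 0x80
                && matchesAt(head, packetStart + 1, THEORA);
        return theora ? "video/theora" : "application/ogg";
    }

    // Builds a minimal synthetic first page carrying a Theora-like first packet.
    static byte[] syntheticTheoraPage() {
        int packetLength = 42;
        byte[] page = new byte[PAGE_HEADER_LENGTH + 1 + packetLength];
        System.arraycopy(OGG_MAGIC, 0, page, 0, OGG_MAGIC.length);
        page[5] = 0x02;                 // header type: beginning of stream
        page[26] = 1;                   // one entry in the segment table
        page[27] = (byte) packetLength; // length of that single segment
        page[28] = (byte) 0x80;         // Theora identification header type
        System.arraycopy(THEORA, 0, page, 29, THEORA.length);
        return page;
    }

    public static void main(String[] args) {
        byte[] page = syntheticTheoraPage();
        System.out.println(detectByMagic(page));       // application/ogg
        System.out.println(detectByFirstPacket(page)); // video/theora
    }
}
```

If that reading is right, wrapping the stream with TikaInputStream.get(inputStream) before calling detect, as in the first case above, is the variant that produced video/theora in this report.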

       

          People

            Assignee: Unassigned
            Reporter: xiaojie