Tika / TIKA-1690

Inconsistent (buggy) behavior when using tika-server

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.10
    • Component/s: None
    • Labels:
    • Flags:
      Important

      Description

      I am using Tika trunk (1.10-SNAPSHOT) and posting documents there. An example would be the following:

      curl -T MOD09GA.A2014010.h30v12.005.2014012183944.vegetation_fraction.tif http://localhost:9998/meta --header "Accept: application/json"

      curl -T MOD09GA.A2014010.h30v12.005.2014012183944.vegetation_fraction.tif http://localhost:9998/meta --header "Accept: application/rdf+xml"

      curl -T MOD09GA.A2014010.h30v12.005.2014012183944.vegetation_fraction.tif http://localhost:9998/meta --header "Accept: text/csv"

      I am using a Python script to iterate through all the files in a folder. It works for about 50% to 80% of the files; for the rest, the server returns an HTTP 500 error. When I individually post a file that previously failed under the Python script, it sometimes works. Ad hoc requests work most of the time but occasionally fail. At times a file succeeds with the application/rdf+xml format but fails with application/json. The behavior is inconsistent.

      Here is an example trace from a request that did not work as expected [0].
      A sample of the data being used can be found here [1].
      Any help would be appreciated.

      [0] https://paste.apache.org/lbAm

      [1] https://drive.google.com/file/d/0B6wmo4_-H0P2eWJjdTdtYS1HRGs/view?usp=sharing
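
      The reporter's Python script is not attached to this issue. As a rough illustration only, the loop described above might look like the sketch below. It assumes the /meta endpoint and Accept types from the curl commands; the function names (build_request, send, post_folder) are hypothetical, and the explicit Content-Type header is an assumption that curl -T itself does not send.

```python
import pathlib
import urllib.error
import urllib.request

# Endpoint and Accept types taken from the curl commands in the description.
META_URL = "http://localhost:9998/meta"
ACCEPT_TYPES = ("application/json", "application/rdf+xml", "text/csv")


def build_request(path, accept, url=META_URL):
    """Build a PUT request similar to `curl -T <file> <url>`."""
    data = pathlib.Path(path).read_bytes()
    headers = {
        "Accept": accept,
        # Assumption: urllib would otherwise add a form-encoded Content-Type,
        # so a neutral one is set explicitly. curl -T sends none by default.
        "Content-Type": "application/octet-stream",
    }
    return urllib.request.Request(url, data=data, method="PUT", headers=headers)


def send(request):
    """Return the HTTP status code, including error codes such as 500."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def post_folder(folder, send_fn=send):
    """PUT every file in `folder` once per Accept type; return the failures."""
    failures = []
    for path in sorted(pathlib.Path(folder).iterdir()):
        if not path.is_file():
            continue
        for accept in ACCEPT_TYPES:
            status = send_fn(build_request(path, accept))
            if status != 200:
                # Record which file and which format failed, and how.
                failures.append((path.name, accept, status))
    return failures
```

      Calling post_folder on a directory of sample files returns (file name, Accept type, status) tuples for every non-200 response. Running it several times and comparing the lists is one way to check whether the failures move between files and formats as described.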

            People

            • Assignee: Tim Allison (tallison)
            • Reporter: Namrata Malarout (namratamalarout)
            • Votes: 0
            • Watchers: 4