Tika / TIKA-1620

OUTPUT_FILE_TOKEN not being replaced in ExternalParser

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7, 1.8
    • Fix Version/s: 1.9
    • Component/s: parser
    • Labels: None
    • Environment: Any.

    Description

      According to its class documentation, the org.apache.tika.parser.external.ExternalParser class should replace the OUTPUT_FILE_TOKEN constant with an output file name when the token appears in a command argument. Currently it does not: a temporary output file is created, but the token is left in the argument unchanged, so the parser fails to capture any output from processes that write their results to an output file.

      To fix this, add one line (marked between the START FIX and END FIX comments) to the following code in the parse method (starting on line 168):

      if(cmd[i].indexOf(OUTPUT_FILE_TOKEN) != -1) {
          output = tmp.createTemporaryFile();
          outputFromStdOut = false;
          //START FIX:
          cmd[i] = cmd[i].replace(OUTPUT_FILE_TOKEN, output.getPath());
          //END FIX.
      }
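
      The substitution can also be tried outside Tika. The following is only a minimal standalone sketch, not the ExternalParser code itself: the class name OutputTokenSketch, the helper replaceOutputToken, the sample command, and the token value "${OUTPUT}" are all assumptions made for illustration, and File.createTempFile stands in for the tmp.createTemporaryFile() call shown above.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;

public class OutputTokenSketch {
    // Assumed placeholder value, chosen for illustration only.
    static final String OUTPUT_FILE_TOKEN = "${OUTPUT}";

    /**
     * Returns a copy of cmd in which every occurrence of the token is
     * replaced by outputPath. String.replace treats the token literally,
     * so "$", "{" and "}" need no escaping.
     */
    static String[] replaceOutputToken(String[] cmd, String outputPath) {
        String[] resolved = new String[cmd.length];
        for (int i = 0; i < cmd.length; i++) {
            resolved[i] = cmd[i].replace(OUTPUT_FILE_TOKEN, outputPath);
        }
        return resolved;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the temporary file the parser would read back.
        File output = File.createTempFile("external-parser", ".out");
        output.deleteOnExit();

        // Hypothetical command line for an external tool.
        String[] cmd = {"some-tool", "--in", "input.bin", "--out", OUTPUT_FILE_TOKEN};
        String[] resolved = replaceOutputToken(cmd, output.getPath());

        // The last argument is now the temporary file path, not the token.
        System.out.println(Arrays.toString(resolved));
    }
}
```

      Without the replace call, the external process would receive the literal token as its output path and the temporary file would stay empty, which matches the behavior reported here.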
      

            People

              Assignee: Unassigned
              Reporter: Pascal Essiembre (pascal.essiembre)
              Votes: 0
              Watchers: 4

                Time Tracking

                  Estimated: 5m
                  Remaining: 5m
                  Logged: Not Specified