  1. UIMA
  2. UIMA-1878

TikaAnnotator doesn't handle spaces in path string

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3
    • Fix Version/s: 2.3.1Addons
    • Component/s: Sandbox-TikaAnnotator
    • Labels: None
    • Environment: Windows

    Description

      If you give a value for InputDirectory that contains a space, then TikaAnnotator silently does nothing.

      This is because File objects are converted directly to a URL, and openStream() fails because the space character wasn't converted to %20.

      When this happens, the exception is ignored and the CAS text is set to "".

      It would be better to convert the File object to a URI and the URI to a URL. This will convert the space character correctly.
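
      The conversion described above can be sketched as follows. This is a minimal illustration, not the attached patch: the class name FileUrlDemo, the helper toEncodedUrl, and the path "input dir/doc.txt" are all hypothetical. It relies on File.toURI() percent-encoding characters such as spaces, which the deprecated File.toURL() does not do.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

public class FileUrlDemo {

    // Hypothetical helper: go through a URI first so that characters
    // that are illegal in URLs (such as spaces) are percent-encoded.
    static URL toEncodedUrl(File file) throws MalformedURLException {
        return file.toURI().toURL();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical path containing a space; the file need not exist.
        File file = new File("input dir", "doc.txt");

        // The printed URL contains "input%20dir" rather than a raw space.
        System.out.println(toEncodedUrl(file));
    }
}
```

      The printed URL is absolute, so its prefix depends on the working directory and the platform.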

      Secondly, it would be better to throw an exception rather than silently ignore it.
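
      A sketch of the fail-loudly behaviour, assuming plain Java without the UIMA classes: the class FailLoudlyDemo and the helper readAll are hypothetical and are not taken from the attached patch. The point is only that the IOException from openStream() propagates to the caller instead of being caught and replaced with empty text. In an actual annotator the exception would presumably be wrapped in UIMA's AnalysisEngineProcessException.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

public class FailLoudlyDemo {

    // Hypothetical helper: reads the whole file through its URL.
    // Any IOException (for example, file not found) is propagated
    // to the caller rather than swallowed.
    static byte[] readAll(File file) throws IOException {
        URL url = file.toURI().toURL();
        try (InputStream in = url.openStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) {
        // Hypothetical path that is assumed not to exist.
        File missing = new File("no such dir", "missing.txt");
        try {
            readAll(missing);
        } catch (IOException e) {
            // The failure is visible instead of yielding empty text.
            System.err.println("Could not read " + missing + ": " + e);
        }
    }
}
```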

      A suggested patch is attached.

      Attachments

        1. TikaAnnotator-patch.txt
          3 kB
          Adam Holmberg

          People

            Assignee: Unassigned
            Reporter: Adam Holmberg (holmberg)
            Votes: 0
            Watchers: 0

            Dates

              Created:
              Updated:
              Resolved: