  1. UIMA
  2. UIMA-2097

Treatment of URLs with blanks is incorrect in some places

Details

    Description

      A user reported that the Document Analyzer wrote incorrect file names to the output directory when the input directory path contained a blank (on Windows). This was traced to faulty URL handling.

      Proper URL handling seems to need to allow for both of these cases:

      1) A URL may contain blanks and other characters that are invalid in a URI.
      2) A URL may instead contain %20-style escapes for blanks and other characters that need to be escaped.

      Creating files from these: first convert the URL to a URI, then use the File(URI) constructor, i.e., new File(aUri).

      Creating URIs from URLs: if the URL has unescaped blanks, etc., both new URI(aUrl.toString()) and aUrl.toURI() fail (i.e., they throw a URISyntaxException because of characters that are illegal in a URI).

      To make URI creation insert the escapes when they are missing, use one of the multi-argument forms of new URI, which quote illegal characters (see the Javadocs).
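
      The conversion described above might be sketched as follows. This is only an illustration, not the actual UIMA fix: the class UrlBlankSketch and its methods toUri and toFile are hypothetical names. It assumes a hierarchical URL such as a file: URL. Because the multi-argument URI constructors always quote the '%' character, the sketch tries aUrl.toURI() first, so that a URL that is already %20-escaped is not escaped a second time.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical helper class for illustration only; not part of UIMA.
public final class UrlBlankSketch {

    // Convert a URL to a URI, whether or not the URL is already escaped.
    static URI toUri(URL url) throws URISyntaxException {
        try {
            // Succeeds when the URL is already a legal URI (e.g. uses %20).
            return url.toURI();
        } catch (URISyntaxException e) {
            // The URL has unescaped blanks or other illegal characters.
            // The multi-argument constructor quotes them for us.
            String host = url.getHost();
            if (host != null && host.isEmpty()) {
                host = null; // file: URLs typically report an empty host
            }
            return new URI(url.getProtocol(), url.getUserInfo(), host,
                    url.getPort(), url.getPath(), url.getQuery(), url.getRef());
        }
    }

    // Create a File via the URI form, which decodes escapes such as %20.
    static File toFile(URL url) throws URISyntaxException {
        return new File(toUri(url));
    }

    public static void main(String[] args) throws Exception {
        URL withBlank = new URL("file:/tmp/my dir/a.txt");
        URL escaped = new URL("file:/tmp/my%20dir/a.txt");
        System.out.println(toUri(withBlank));
        System.out.println(toUri(escaped));
        System.out.println(toFile(withBlank).getName());
    }
}
```

      With this approach both spellings of the same location should yield the same URI and therefore the same File. A URL that mixes escaped and unescaped characters would still be double-escaped by the fallback branch, so real code would need to decide how to treat that case.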

      Fix the patches in UIMA-1879 and UIMA-1748.

            People

              schor Marshall Schor
              schor Marshall Schor
