  1. HttpComponents HttpClient
  2. HTTPCLIENT-2029

URIBuilder cannot parse non-UTF8 URIs


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.5.10
    • Fix Version/s: 4.5.11, 5.0 Beta7
    • Component/s: None
    • Labels: None

    Description

      URIBuilder always parses a given URI using UTF-8. For example, consider the following URI, whose query is still encoded in Latin-1 (ISO-8859-1):

      http://host/?x=%E4

      %E4 is the percent-encoded form of the "ä" character in Latin-1.

      new URIBuilder("http://host/?x=%E4").setCharset(ISO_8859_1).getQueryParams().get(0).getValue() outputs ""

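      This happens because the URIBuilder constructor parses the given URI immediately. At that point the charset is always null, so UTF-8 is used. The later setCharset call comes too late to affect the query parameters that have already been decoded.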
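To illustrate why the charset matters at decode time, here is a minimal standalone sketch using only the JDK. The class `PercentDecodeDemo` and its `percentDecode` helper are hypothetical names made up for this example and are not part of HttpClient; the sketch only shows that the single byte 0xE4 is a valid character in ISO-8859-1 but a malformed sequence in UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not HttpClient code.
final class PercentDecodeDemo {

    // Turns %XX escapes into raw bytes, then interprets all bytes in the given charset.
    public static String percentDecode(String s, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                // Two hex digits follow the percent sign.
                bytes.write(Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 3;
            } else {
                // Unescaped characters are passed through as their encoded bytes.
                byte[] raw = String.valueOf(c).getBytes(charset);
                bytes.write(raw, 0, raw.length);
                i++;
            }
        }
        // new String(byte[], Charset) replaces malformed input with the
        // charset's replacement character instead of throwing.
        return new String(bytes.toByteArray(), charset);
    }

    public static void main(String[] args) {
        String latin1 = percentDecode("%E4", StandardCharsets.ISO_8859_1);
        String utf8 = percentDecode("%E4", StandardCharsets.UTF_8);
        // Print code points rather than characters to avoid console encoding issues.
        System.out.println(Integer.toHexString(latin1.codePointAt(0))); // e4   (ä)
        System.out.println(Integer.toHexString(utf8.codePointAt(0)));   // fffd (replacement character)
    }
}
```

Decoding the same escaped byte with the wrong charset therefore loses the original character, and it cannot be recovered by changing the charset afterwards.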

      Proposed fix:
      Provide overloaded constructors that also accept a charset, so that it is known before the URI is parsed; for example:

          public URIBuilder(final String string, final Charset charset) throws URISyntaxException {
              this.charset = charset;
              digestURI(new URI(string));
          }
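The effect of the proposed constructor ordering can be sketched with a small JDK-only model. `MiniUriBuilder` below is a hypothetical stand-in written for this example, not HttpClient's URIBuilder; it assumes a null charset falls back to UTF-8, as described above, and it uses `URI.create` instead of `new URI` only to keep the sketch free of checked exceptions.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for URIBuilder; not HttpClient code.
final class MiniUriBuilder {
    private final Charset charset;
    private final Map<String, String> queryParams = new LinkedHashMap<>();

    // Models the existing behaviour: no charset is known at parse time, so UTF-8 is used.
    public MiniUriBuilder(String uri) {
        this(uri, null);
    }

    // Models the proposed overload: the charset is set before the URI is digested.
    public MiniUriBuilder(String uri, Charset charset) {
        this.charset = charset != null ? charset : StandardCharsets.UTF_8;
        digest(URI.create(uri));
    }

    // Splits the raw (still escaped) query and decodes each part with the chosen charset.
    private void digest(URI uri) {
        String rawQuery = uri.getRawQuery();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return;
        }
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            queryParams.put(decode(name), decode(value));
        }
    }

    private String decode(String s) {
        try {
            return URLDecoder.decode(s, charset.name());
        } catch (UnsupportedEncodingException e) {
            // Cannot happen for a Charset instance that already exists.
            throw new IllegalStateException(e);
        }
    }

    public String getQueryParam(String name) {
        return queryParams.get(name);
    }

    public static void main(String[] args) {
        String fixed = new MiniUriBuilder("http://host/?x=%E4", StandardCharsets.ISO_8859_1).getQueryParam("x");
        String broken = new MiniUriBuilder("http://host/?x=%E4").getQueryParam("x");
        System.out.println(Integer.toHexString(fixed.codePointAt(0)));  // e4
        System.out.println(Integer.toHexString(broken.codePointAt(0))); // fffd
    }
}
```

Because the charset is assigned before `digest` runs, the Latin-1 value survives parsing, while the single-argument constructor still falls back to UTF-8.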
      

          People

            Assignee: Unassigned
            Reporter: Matthias Keller (tapter)