  1. HttpComponents HttpClient
  2. HTTPCLIENT-293

Provide support for non-ASCII charsets in the multipart disposition-content header

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.0 Alpha
    • Fix Version/s: 4.0 Beta 2
    • Component/s: HttpClient (classic)
    • Labels:
      None
    • Environment:
      Operating System: All
      Platform: All
    • Bugzilla Id:
      24504

      Description

      Because of the following line in getAsciiBytes:
      data.getBytes("US-ASCII");

      the returned string is modified if it contains non-ASCII (e.g. accented Latin) characters.

      Example: Document non-controlé -> Document non-control?
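
A minimal sketch of the effect described above, using only the JDK. The class and method names are hypothetical and are not part of HttpClient; the sketch only shows that encoding to US-ASCII substitutes '?' for every character the charset cannot represent.

```java
import java.nio.charset.StandardCharsets;

class AsciiLossDemo {
    // Encodes to US-ASCII and decodes again. Characters outside US-ASCII
    // cannot be mapped, so the encoder substitutes '?' for each of them.
    static String roundTripThroughAscii(String data) {
        byte[] bytes = data.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // \u00e9 is 'é'; written as an escape so the source file encoding does not matter
        System.out.println(roundTripThroughAscii("Document non-control\u00e9"));
        // prints "Document non-control?"
    }
}
```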

        Activity

        olegk Oleg Kalnichevski added a comment -

        Eric,
        My apologies, but I do not quite understand the nature of the problem. What do
        you mean by 'cannot create a document'? What do you mean by a document in the
        first place? Request content body? Response content body?

        What version of HttpClient are you using and what is it you are trying to get done?

        As to getAsciiBytes method, as its name implies it is supposed to return ASCII
        characters only. So, the behaviour of the method is correct.

        You might want to have a look at the HttpClient character encoding guide for
        more details:

        http://jakarta.apache.org/commons/httpclient/charencodings.html

        I'll have no choice but to mark the report as invalid unless more information is
        given.

        Oleg

        ewrickspm@yahoo.com Eric Dofonsou added a comment -

        My fault. By "document" I was referring to a file (a physical file on the hard drive),
        e.g. c:\work\DocumentDeTèst.txt <-- This filename has an accent.

        I am using the latest version: 2.0 RC2.

        "As to getAsciiBytes method, as its name implies it is supposed to return ASCII
        characters only. So, the behaviour of the method is correct."

        Precisely, but because of that the accented characters are converted to ?
        e.g. c:\work\DocumentDeTèst.txt --> c:\work\DocumentDeT?st.txt

        olegk Oleg Kalnichevski added a comment -

        Eric,
        Are you using MultipartPostMethod by any chance? Please give me a bit more
        details about what your application is supposed to do and what you are trying to
        accomplish, so I would not have to play a private detective.

        Oleg

        ewrickspm@yahoo.com Eric Dofonsou added a comment -

        Hi Oleg.

        Yes, I am using a multipart post.

        In our application we want to upload files to a file server from a Java
        application via HTTP. We use multipart because we have to include extra
        information for the server application to be able to handle the data (e.g. link
        the file to a database object). We also want to be able to upload
        multiple files (which works well as long as there are no accents in the filenames).


        Here is the code that builds the file parts:

        HttpClient client = new HttpClient();
        MultipartPostMethod httpsPost = new MultipartPostMethod(m_docServer);

        // Set header information
        httpsPost.setRequestHeader("Content-Type",
                "multipart/form-data; boundary=" + BOUNDS);

        // Add the main parts
        StringPart partToAdd = new StringPart("ClassUID", classUID);
        partToAdd.setTransferEncoding(null);
        partToAdd.setContentType(null);
        httpsPost.addPart(partToAdd);

        partToAdd = new StringPart("MethodName", methodName);
        partToAdd.setTransferEncoding(null);
        partToAdd.setContentType(null);
        httpsPost.addPart(partToAdd);

        partToAdd = new StringPart("Params", params);
        partToAdd.setTransferEncoding(null);
        partToAdd.setContentType(null);
        httpsPost.addPart(partToAdd);

        // Add the file parts
        int i = 0;
        Iterator iterator = parts.keySet().iterator();
        AI_DOCPART part;
        String partID;
        String partFile;
        FilePart fPart;

        // Loop until all file parts have been created
        while (iterator.hasNext()) {
            part = (AI_DOCPART) iterator.next();
            partID = part.getIDAsString();
            partFile = (String) parts.get(part);
            try {
                fPart = new FilePart("FILE" + (i + 1), new File(partFile));
                //partToAdd.setContentType(null);
                //partToAdd.setTransferEncoding( null );
                httpsPost.addPart(fPart);
            } catch (FileNotFoundException e) {
                throw new AIException("ERR_INVALIDE_FILENAME", "",
                        GUIMediator.getStringResource("Corporate", "ERR_INVALIDE_FILENAME"), "");
            }

            partToAdd = new StringPart("PARTNUMBER" + (i + 1), partID);
            partToAdd.setContentType(null);
            partToAdd.setTransferEncoding(null);
            httpsPost.addPart(partToAdd);
            i++;
        }

        // Set timeout in milliseconds (30 seconds)
        client.setConnectionTimeout(30000);

        // Send the data
        int status = 0;
        try {
            status = client.executeMethod(httpsPost);
        }
        ...

        olegk Oleg Kalnichevski added a comment -

        Form-based File Upload in HTML specification (RFC 1867)
        <http://www.ietf.org/rfc/rfc1867.txt> that HttpClient implements follows the
        rules of all multipart MIME data streams as outlined in RFC 1521 and RFC 1522.
        MIME specification requires all non-ASCII content to be represented using ASCII
        charset only. Currently HttpClient does not perform such translation
        automatically. You will have to take care of filename encoding prior to passing
        it to the FilePart as a parameter.

        I was going to contribute a quoted-printable encoder/decoder to the Commons Codec
        library but never got a chance.

        To sum things up: if the relevant RFCs are to be strictly adhered to, the
        behaviour on the part of HttpClient is correct. However, I do agree that it
        would be nice if HttpClient took care of non-ASCII charset translation
        automatically. So, feel free to reopen this bug as a feature request.

        Oleg
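
One way to "take care of filename encoding" before handing the name to FilePart is to turn it into an RFC 2047 encoded-word, which is pure ASCII. The sketch below is an illustration only and uses just the JDK; the class and method names are hypothetical. It uses the Base64 ("B") form rather than quoted-printable, ignores the 75-character limit RFC 2047 places on a single encoded-word, and assumes the receiving server decodes encoded-words in the filename parameter, which later comments in this thread suggest many MIME parsers do not.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class EncodedWordSketch {
    private static final String PREFIX = "=?UTF-8?B?";
    private static final String SUFFIX = "?=";

    // Builds an RFC 2047 "B" encoded-word: =?charset?B?base64-of-bytes?=
    static String encodeWord(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return PREFIX + Base64.getEncoder().encodeToString(utf8) + SUFFIX;
    }

    // Reverses encodeWord; only handles the exact shape produced above.
    static String decodeWord(String word) {
        String payload = word.substring(PREFIX.length(), word.length() - SUFFIX.length());
        return new String(Base64.getDecoder().decode(payload), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // \u00e8 is 'è'
        String encoded = encodeWord("DocumentDeT\u00e8st.txt");
        System.out.println(encoded);             // ASCII-only representation
        System.out.println(decodeWord(encoded)); // original name restored
    }
}
```

The encoded string could then be passed wherever the plain filename would otherwise go; whether the server restores the original name depends entirely on the server.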

        olegk Oleg Kalnichevski added a comment -

        Re-opened as a feature request

        olegk Oleg Kalnichevski added a comment -

        HTTPCLIENT-368 has been marked as a duplicate of this bug.

        labaere Francis Labaere added a comment -

        I just wanted to add some relevant RFCs for this feature request:

        RFC 2231
        RFC 2047
        RFC 2184

        ddijkstra Dolf Dijkstra added a comment -

        I have created a patch against revision 532277 for this problem. Although it does not follow the RFC, it does the job. IE, for instance, does the same for multipart MIME uploads. I am not suggesting that IE is doing the right thing, but it does mean that many servers can probably deal with such posts.

        Index: src/java/org/apache/commons/httpclient/methods/multipart/FilePart.java
        ===================================================================
        --- src/java/org/apache/commons/httpclient/methods/multipart/FilePart.java (revision 532277)
        +++ src/java/org/apache/commons/httpclient/methods/multipart/FilePart.java (working copy)
        @@ -193,7 +193,11 @@
                 if (filename != null) {
                     out.write(FILE_NAME_BYTES);
                     out.write(QUOTE_BYTES);
        -            out.write(EncodingUtil.getAsciiBytes(filename));
        +            //still not the right thing according to RFC1522
        +            out.write( EncodingUtil.getBytes( filename, this.getCharSet() ) );
        +            /*TODO: the right thing would be to do this, but some MIMEDecoders can't handle it.
        +            String s = MimeUtility.encodeText(filename);
        +            */
                     out.write(QUOTE_BYTES);
                 }
             }

        olegk Oleg Kalnichevski added a comment -

        Dolf,
        What is MimeUtility and what package does it come from?

        Oleg

        ddijkstra Dolf Dijkstra added a comment -

        Hi Oleg,

        Thanks for looking into this and sorry for not making clear where MimeUtility originates from.

        MimeUtility is from javax.mail (for instance http://java.sun.com/j2ee/sdk_1.3/techdocs/api/javax/mail/internet/MimeUtility.html).

        Dolf

        olegk Oleg Kalnichevski added a comment -

        Dolf,

        We simply cannot introduce a new dependency for the HttpClient 3.x code line. This will have to wait until 4.0.

        Oleg

        ddijkstra Dolf Dijkstra added a comment -

        Hi Oleg,

        Maybe the report is not clear.

        According to the multipart MIME spec, the correct behaviour would be to encode the filename via MimeUtility. The problem is that the MIME parsers I have tested with don't handle this correctly.
        When the filename is simply encoded with the charset of the request, it works, but it is not according to the spec.

        The patch I handed in works on most MIME parsers (as IE does this too) but is not according to the spec.

        I understand that you don't want to introduce a new dependency, but maybe you don't need to, as the patch works without MimeUtility. The line containing MimeUtility is commented out.

        Dolf

        sebb@apache.org Sebb added a comment -

        Might even be a problem for 4.0 - the license for the JavaMail jar is such that it cannot be distributed by the ASF, as far as I am aware.

        Might be worth checking if Commons-Lang has anything suitable, e.g. in StringEscapeUtils.

        olegk Oleg Kalnichevski added a comment -

        Dolf,
        My bad. I overlooked the fact that the reference to MimeUtility was inside a comment block.

        Sebastian,
        I believe we can depend on JavaMail, as long as we do not have it in the repository and do not ship it with the release packages. Since we do not bundle dependencies with HttpClient anyway, this should not be a problem for us. Having said all that, I think Commons Codec, which HttpClient already depends upon, provides all the necessary codecs (BASE64 and quoted-printable). It is just a matter of someone taking up this job.

        Folks,
        Any objections to relaxing the compliance with the spec and applying the patch submitted by Dolf?

        Oleg

        asashour Ahmed Ashour added a comment -

        One of HtmlUnit users came across this bug while trying to upload a file with non-ASCII name.

        By sniffing the traffic generated by IE7, "filename" is encoded with page charset as Dolf has kindly suggested.

        However, IE7 does not send the charset after the 'Content-Type':

        ---------------------------
        Content-Disposition: form-data; name="field_name"; filename="C:\non_ascii.txt"
        Content-Type: text/plain
        ---------------------------

        So, to mimic this behaviour exactly, I would appreciate it if the part charset were separated from the "Content-Disposition" charset.

        Many thanks.

        oglueck Ortwin Glück added a comment -

        We should be spec compliant and not "compatible with most" implementations - I don't care how wrong IE7 implements this. RFC 2183, Section 2.3 clearly states the limitation to ASCII. People should just accept this limitation instead of trying to bend the standard to their needs. Standards are made to ensure interoperability, for $DIETY's sake. If you need to pass a non-ASCII filename, this is simply not the place for it. You could add another text/plain MIME part with a well-defined charset and pass the file name there for instance.

        mrezaei Mohammad Rezaei added a comment -

        Ortwin, I think the RFC is worded strangely. It is certainly true that Section 2.3 says US-ASCII only, but it seems like that section is outdated.

        In Section 2, there is a very large note that reads:

        NOTE ON PARAMETER VALUE LENGHTS: A short (length <= 78 characters)
        parameter value containing only non-`tspecials' characters SHOULD be
        represented as a single `token'. A short parameter value containing
        only ASCII characters, but including `tspecials' characters, SHOULD
        be represented as `quoted-string'. Parameter values longer than 78
        characters, or which contain non-ASCII characters, MUST be encoded as
        specified in [RFC 2184].

        Looking at the types of parameters, 4 of them are dates and one is an integer. The only one that's a string is the filename, so the note above must refer to it. RFC 2184 describes how to encode the non-ASCII case. Interestingly, it looks like IE does not follow RFC 2184.

        Section 2.3 refers to RFC 2045, which is older than RFC 2184.

        Overall, I'd say the RFC is unclear on this issue.

        Thanks
        Moh

        oglueck Ortwin Glück added a comment -

        Interesting, although I have never seen it being used in the wild. By the way, RFC 2184 is obsoleted by RFC 2231.

        olegk Oleg Kalnichevski added a comment -

        MultipartEntity now encodes non-ASCII characters in the disposition-content header using content charset when used in the browser compatibility mode and replaces non-ASCII characters with ? when used in the strict mode. One always has an option to encode the file name using one of the standard encoding mechanisms as described in RFC2231 and RFC2047.

        Closing this issue as resolved.

        Oleg
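
The two behaviours described in this comment can be illustrated with the JDK alone. The sketch below is not HttpClient's implementation; the class name, method name and the boolean "strict" flag are hypothetical stand-ins for the strict and browser compatibility modes, and the only thing shown is which charset the header line is serialized with.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DispositionModesSketch {
    // Serializes a Content-Disposition line for a file part.
    // strict == true  -> header bytes are US-ASCII, unmappable characters become '?'
    // strict == false -> header bytes use the supplied content charset
    static byte[] dispositionHeader(String name, String filename,
                                    Charset contentCharset, boolean strict) {
        Charset headerCharset = strict ? StandardCharsets.US_ASCII : contentCharset;
        String line = "Content-Disposition: form-data; name=\"" + name
                + "\"; filename=\"" + filename + "\"\r\n";
        return line.getBytes(headerCharset);
    }

    public static void main(String[] args) {
        String filename = "DocumentDeT\u00e8st.txt"; // \u00e8 is 'è'
        byte[] strict = dispositionHeader("FILE1", filename, StandardCharsets.UTF_8, true);
        byte[] compat = dispositionHeader("FILE1", filename, StandardCharsets.UTF_8, false);
        System.out.print(new String(strict, StandardCharsets.US_ASCII));
        System.out.print(new String(compat, StandardCharsets.UTF_8));
    }
}
```

In the strict variant the accented character is lost before the request leaves the client; in the other variant the server has to know (or guess) the charset used for the header bytes.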

        sermojohn Ioannis Sermetziadis added a comment -

        I believe that the HttpClient should implement RFC2231 by using asterisks to support use of header parameter values in character sets other than US-ASCII, like in the Content-Disposition header.

        So, when a file is uploaded using MultipartEntity, the FormBodyPart should include a Content-Disposition header that follows the specification, in order to correctly encode the file name, in case it uses a character set other than US-ASCII.

        An example of such a header is:
        Content-Disposition: form-data; name=file; filename*=utf-8''test

        If you agree, I could submit a patch on this.
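
For illustration, the asterisk form proposed here can be produced with a small percent-encoder. This is a sketch under stated assumptions, not a proposed patch: the class and method names are hypothetical, the set of characters left unescaped is the attr-char set from RFC 5987 (the HTTP profile of RFC 2231), and the language tag between the two single quotes is left empty. Note that RFC 7578, discussed further down in this thread, says this filename* form must not be used in multipart/form-data.

```java
import java.nio.charset.StandardCharsets;

class ExtValueSketch {
    // Non-alphanumeric characters allowed unescaped (attr-char in RFC 5987).
    private static final String ATTR_CHAR_EXTRA = "!#$&+-.^_`|~";

    // Produces charset'language'percent-encoded-bytes with an empty language tag.
    static String encodeExtValue(String value) {
        StringBuilder sb = new StringBuilder("utf-8''");
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean alnum = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alnum || ATTR_CHAR_EXTRA.indexOf(c) >= 0) {
                sb.append((char) c);
            } else {
                sb.append('%').append(String.format("%02X", c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // \u00e8 is 'è', which is C3 A8 in UTF-8
        String param = "filename*=" + encodeExtValue("DocumentDeT\u00e8st.txt");
        System.out.println("Content-Disposition: form-data; name=file; " + param);
        // prints "Content-Disposition: form-data; name=file; filename*=utf-8''DocumentDeT%C3%A8st.txt"
    }
}
```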

        reschke Julian Reschke added a comment -

        Do you have evidence of anybody using RFC 2231 here? See <https://www.greenbytes.de/tech/webdav/rfc7578.html#form-charset> and <https://www.greenbytes.de/tech/webdav/rfc7578.html#rfc.section.4.2.p.5>.

        olegk Oleg Kalnichevski added a comment -

        @Ioannis Sermetziadis, Julian Reschke: HttpClient presently supports three MIME multipart modes: strict (RFC 822, RFC 2045, RFC 2046), browser compatible, and RFC 6532 compatible. Even if RFC 2231 is not known to be widely used, if someone is willing to contribute support for it with proper test coverage, I see no reason why we should not take it.

        Oleg

        reschke Julian Reschke added a comment - - edited

        What's the point in implementing something that the applicable spec says "MUST NOT"? As far as I can tell, that spec defines a different approach which is supposed to be what at least some user agents already do.

        (And yes, it's entirely possible that the spec is incorrect, in which case proper tests and reporting the problem to the IETF would be the right answer)

        (Also, RFC 6532 seems to be entirely irrelevant in this context)

        olegk Oleg Kalnichevski added a comment -

        @Julian Reschke Julian, from your previous statement I understood that RFC 2231 had not been in widespread use, but not that it had been superseded by another spec. What applicable spec are you referring to? I am also fine with dropping RFC 6532 if there is a superseding spec.

        Oleg

        reschke Julian Reschke added a comment -

        Trying to clarify:

        a) AFAIU, RFC 2231 encoding is not used in multipart payloads.

        b) RFC 6532 is irrelevant (being about header fields in email).

        c) The current spec about multipart/form-data is RFC 7578 (<https://www.greenbytes.de/tech/webdav/rfc7578.html>) which I already quoted above (see <https://www.iana.org/assignments/media-types/media-types.xhtml#multipart>).

        olegk Oleg Kalnichevski added a comment - - edited

        @Julian Reschke

        b) RFC 6532 is irrelevant (being about header fields in email).

        I fail to see why this makes it irrelevant but I see no problem dropping RFC 6532 support in favor of RFC 7578 conformant implementation.

        @Ioannis Sermetziadis

        Would you be interested in working on RFC 7578 compliance instead?

        Oleg

        sermojohn Ioannis Sermetziadis added a comment -

        I see how RFC 7578 (in section 2 and 4.2) handles the non-ASCII character usage in the multipart part's content-disposition header.

        Both RFC 7578 and RFC 2231 seem to provide a solution to the problem, but I do not know which approach is currently dominant. Based on the big difference in their release dates, I would assume that RFC 2231 was in wide use before RFC 7578 was released, so providing support for that would be a benefit. Additionally, supporting RFC 7578 would be valuable, as it seems simple and efficient.

        Sure, I would be interested to work on both or one of the options. Not sure, however, how the implementation might conflict with the existing HttpMultipartModes. For example, it is not clear to me which specifications the browser_compatible mode follows. Also, should the HttpClient be backwards compatible with the currently defined modes?

        olegk Oleg Kalnichevski added a comment -

        "For example, it is not clear to me which specifications the browser_compatible mode follows."

        There is no specification to speak of. It just represents an attempt at simulating the behavior of common browsers.

        "Also, should the HttpClient be backwards compatible with the currently defined modes?"

        Depends on what branch you decide to contribute to. It is certainly the case for 4.5.x but we can be more flexible in 5.x.

        Oleg


          People

          • Assignee:
            Unassigned
            Reporter:
            ewrickspm@yahoo.com Eric Dofonsou
          • Votes:
            3
            Watchers:
            6