Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 2.5.0, 2.6.0
Fix Version/s: None
Description
java.lang.RuntimeException: request: GET https://cloudsync-performance-tests.s3.amazonaws.com/?delimiter=/&prefix=some/&max-keys=1000 HTTP/1.1; response: HTTP/1.1 200 OK; cause: java.lang.RuntimeException: request: GET https://cloudsync-performance-tests.s3.amazonaws.com/?delimiter=/&prefix=some/&max-keys=1000 HTTP/1.1; error at 586:2 in document ; cause: org.xml.sax.SAXParseException; lineNumber: 2; columnNumber: 586; Character reference "" is an invalid XML character.
    at org.jclouds.http.functions.ParseSax.addDetailsAndPropagate(ParseSax.java:174)
    at org.jclouds.http.functions.ParseSax.addDetailsAndPropagate(ParseSax.java:146)
    at org.jclouds.http.functions.ParseSax.apply(ParseSax.java:86)
    at org.jclouds.http.functions.ParseSax.apply(ParseSax.java:52)
    at org.jclouds.rest.internal.InvokeHttpMethod.invoke(InvokeHttpMethod.java:91)
    at org.jclouds.rest.internal.InvokeHttpMethod.apply(InvokeHttpMethod.java:74)
    at org.jclouds.rest.internal.InvokeHttpMethod.apply(InvokeHttpMethod.java:45)
    at org.jclouds.rest.internal.DelegatesToInvocationFunction.handle(DelegatesToInvocationFunction.java:156)
    at org.jclouds.rest.internal.DelegatesToInvocationFunction.invoke(DelegatesToInvocationFunction.java:123)
    at jdk.proxy2/jdk.proxy2.$Proxy235.listBucket(Unknown Source)
    at org.jclouds.s3.blobstore.S3BlobStore.list(S3BlobStore.java:177)
When a folder path in S3 contains a control character, jclouds cannot parse the listing response: the SAX parser throws a SAXParseException.
Could there be an option that at least lets us forward the encoding-type request parameter?
https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjects.html#API_ListObjects_RequestSyntax
Ideally jclouds would also URL-decode the returned keys for us so that listing becomes possible. As it stands, this bug prevents us from listing any children of a root folder if even one of the children contains a control character.
Here's an example XML response from S3 when listing objects with cURL:
<?xml version="1.0" encoding="UTF-8"?> <ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>cloudsync-performance-tests</Name><Prefix>some/</Prefix><Marker></Marker><MaxKeys>1000</MaxKeys><Delimiter>/</Delimiter><IsTruncated>false</IsTruncated><CommonPrefixes><Prefix>some/test/</Prefix></CommonPrefixes></ListBucketResult>
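The child folder of 'some' is returned as
<Prefix>some/test/</Prefix>
which can't be parsed. The control character after "test" is not visible here; judging by the URL-encoded response below, it is the byte 0x10.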
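The parse failure can be reproduced outside jclouds with the JDK's own SAX parser. This is a minimal sketch, assuming the raw S3 response carries the control character as a numeric character reference (written here as &#x10;, inferred from the %10 in the URL-encoded response in this report). The class and method names are hypothetical and not part of jclouds.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, for illustration only.
public class ControlCharRepro {

    // Returns true if the XML is accepted by the SAX parser,
    // false if the parser rejects it with a SAXParseException.
    static boolean parses(String xml) throws Exception {
        try {
            SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXParseException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the raw listing contains a character reference to U+0010,
        // which is not a legal character in XML 1.0.
        String raw = "<Prefix>some/test&#x10;/</Prefix>";
        // With encoding-type=url the same key arrives percent-encoded.
        String encoded = "<Prefix>some/test%10/</Prefix>";

        System.out.println("raw parses: " + parses(raw));
        System.out.println("encoded parses: " + parses(encoded));
    }
}
```

XML 1.0 does not allow U+0010 even as a character reference, so a conforming parser has to reject the raw form regardless of how jclouds configures it.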
But with the URL parameter &encoding-type=url added to the request:
<?xml version="1.0" encoding="UTF-8"?> <ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>cloudsync-performance-tests</Name><Prefix>some/</Prefix><Marker></Marker><MaxKeys>1000</MaxKeys><Delimiter>/</Delimiter><EncodingType>url</EncodingType><IsTruncated>false</IsTruncated><CommonPrefixes><Prefix>some/test%10/</Prefix></CommonPrefixes></ListBucketResult>
Here the child folder is returned as
<Prefix>some/test%10/</Prefix>
which can probably be parsed, because the control character is percent-encoded.
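To illustrate the requested behaviour, here is a rough sketch of parsing an encoding-type=url response and URL-decoding the common prefixes afterwards. It is not jclouds code: the class and method names are hypothetical, it uses the JDK SAX parser directly, and it assumes java.net.URLDecoder is an acceptable decoder for S3's encoding (URLDecoder also turns '+' into a space).

```java
import java.io.StringReader;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, for illustration only.
public class EncodedListingSketch {

    // Parses a ListBucketResult document and returns the URL-decoded
    // <Prefix> values found inside <CommonPrefixes>.
    static List<String> decodedPrefixes(String xml) throws Exception {
        List<String> prefixes = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private boolean inCommonPrefixes;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                text.setLength(0); // start collecting text for the new element
                if (qName.equals("CommonPrefixes")) {
                    inCommonPrefixes = true;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equals("CommonPrefixes")) {
                    inCommonPrefixes = false;
                } else if (inCommonPrefixes && qName.equals("Prefix")) {
                    // Decode only after the XML has been parsed successfully.
                    prefixes.add(URLDecoder.decode(text.toString(), StandardCharsets.UTF_8));
                }
            }
        };
        SAXParserFactory.newInstance()
            .newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return prefixes;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<ListBucketResult xmlns=\"http://s3.amazonaws.com/doc/2006-03-01/\">"
            + "<Prefix>some/</Prefix><EncodingType>url</EncodingType>"
            + "<CommonPrefixes><Prefix>some/test%10/</Prefix></CommonPrefixes>"
            + "</ListBucketResult>";
        for (String prefix : decodedPrefixes(xml)) {
            // Print code points so the control character is visible.
            prefix.chars().forEach(c -> System.out.printf("U+%04X ", c));
            System.out.println();
        }
    }
}
```

Decoding happens only after parsing, so the control character never appears in the XML itself. A real fix would presumably decode only when the response reports an EncodingType of url.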