Description
In the patch for HADOOP-10400, each call to AmazonS3Client.deleteObjects() must contain at most 1000 keys. Otherwise S3 rejects the request with a MalformedXML error similar to:
com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 400, AWS Service: Amazon S3, AWS Request ID: 6626AD56A3C76F5B, AWS Error Code: MalformedXML, AWS Error Message: The XML you provided was not well-formed or did not validate against our published schema, S3 Extended Request ID: DOt6C+Y84mGSoDuaQTCo33893VaoKGEVC3y1k2zFIQRm+AJkFH2mTyrDgnykSL+v
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3480)
at com.amazonaws.services.s3.AmazonS3Client.deleteObjects(AmazonS3Client.java:1739)
at org.apache.hadoop.fs.s3a.S3AFileSystem.rename(S3AFileSystem.java:388)
at org.apache.hadoop.hbase.snapshot.ExportSnapshot.run(ExportSnapshot.java:829)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.hbase.snapshot.ExportSnapshot.innerMain(ExportSnapshot.java:874)
at org.apache.hadoop.hbase.snapshot.ExportSnapshot.main(ExportSnapshot.java:878)
Note that this is mentioned in the AWS documentation:
http://docs.aws.amazon.com/AmazonS3/latest/API/multiobjectdeleteapi.html
"The Multi-Object Delete request contains a list of up to 1000 keys that you want to delete. In the XML, you provide the object key names, and optionally, version IDs if you want to delete a specific version of the object from a versioning-enabled bucket. For each key, Amazon S3…"
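The fix, then, is to split the key list into batches of at most 1000 before each deleteObjects() call. A minimal sketch of that batching logic (the names and the Consumer-based delete hook are illustrative, not the actual patch; in S3AFileSystem the callback would wrap AmazonS3Client.deleteObjects()):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchedDelete {
    // S3's Multi-Object Delete API accepts at most 1000 keys per request.
    static final int MAX_ENTRIES_PER_REQUEST = 1000;

    // Splits keys into chunks of at most MAX_ENTRIES_PER_REQUEST and hands
    // each chunk to deleteBatch (which would issue one deleteObjects() call).
    static void deleteInBatches(List<String> keys, Consumer<List<String>> deleteBatch) {
        for (int start = 0; start < keys.size(); start += MAX_ENTRIES_PER_REQUEST) {
            int end = Math.min(start + MAX_ENTRIES_PER_REQUEST, keys.size());
            deleteBatch.accept(keys.subList(start, end));
        }
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 2500; i++) keys.add("key-" + i);
        List<Integer> batchSizes = new ArrayList<>();
        deleteInBatches(keys, batch -> batchSizes.add(batch.size()));
        System.out.println(batchSizes); // [1000, 1000, 500]
    }
}
```

With 2500 keys this issues three requests of 1000, 1000, and 500 keys, all within the documented limit.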
Thanks to Matteo Bertozzi and Rahul Bhartia from AWS for identifying the problem.
Attachments
Issue Links
- breaks:
  - HADOOP-11670 Regression: s3a auth setup broken (Closed)
  - HADOOP-11394 hadoop-aws documentation missing (Closed)
- is depended upon by:
  - HADOOP-11571 Über-jira: S3a stabilisation phase I (Closed)
  - HADOOP-10400 Incorporate new S3A FileSystem implementation (Closed)
- is related to:
  - HADOOP-13402 S3A should allow renaming to a pre-existing destination directory to move the source path under that directory, similar to HDFS (Resolved)
- relates to:
  - HADOOP-11128 abstracting out the scale tests for FileSystem Contract tests (Open)