Description
A goal of this code is to "support O(1) commits to S3 repositories in the presence of failures". Implement it, including whatever is needed to demonstrate the correctness of the algorithm (that is, assuming that S3Guard provides a consistent view of the presence/absence of blobs, show that we can commit directly).
I consider us free to expose the blobstore-ness of the S3 output streams (i.e. output is not visible until the close()) if we need to use that to allow us to abort commit operations.
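To make the "not visible until the close()" idea concrete, here is a minimal in-memory sketch. It is a toy model only: `ToyBlobStore`, `run_task`, `commit_job` and the key names are hypothetical and are not the S3, S3A or S3Guard APIs. It assumes a store where written data stays pending until it is explicitly completed, so a commit is a per-file completion step with no copy or rename, and an abort just discards the pending data.

```python
class ToyBlobStore:
    """Hypothetical in-memory stand-in for an object store (not the S3 API)."""

    def __init__(self):
        self.visible = {}   # key -> list of parts; what readers can see
        self.pending = {}   # upload_id -> (key, list of parts); invisible to readers
        self._next_id = 0

    def start_upload(self, key):
        upload_id = self._next_id
        self._next_id += 1
        self.pending[upload_id] = (key, [])
        return upload_id

    def write_part(self, upload_id, data):
        self.pending[upload_id][1].append(data)

    def complete(self, upload_id):
        # Metadata-only step: the parts are published, not copied.
        key, parts = self.pending.pop(upload_id)
        self.visible[key] = parts

    def abort(self, upload_id):
        # Discard pending data; nothing ever became visible.
        self.pending.pop(upload_id, None)

    def read(self, key):
        return b"".join(self.visible[key])


def run_task(store, key, chunks):
    """Write task output, but leave it pending instead of completing it."""
    upload_id = store.start_upload(key)
    for chunk in chunks:
        store.write_part(upload_id, chunk)
    return upload_id


def commit_job(store, upload_ids):
    """Publish the output of every task that was chosen to commit."""
    for upload_id in upload_ids:
        store.complete(upload_id)


store = ToyBlobStore()
first = run_task(store, "out/part-0000", [b"ab", b"cd"])
second = run_task(store, "out/part-0001", [b"ef"])
store.abort(second)          # e.g. a failed or duplicate task attempt
commit_job(store, [first])
```

In this sketch nothing is visible under `out/` until `commit_job` runs, and the aborted attempt leaves no trace. Whether the real implementation is structured this way is not stated in the description above.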
Attachments
Issue Links
- breaks
-
HADOOP-15631 Remove transient dependency on hadoop-hdfs-client
- Open
-
MAPREDUCE-7014 Fix java doc errors in jdk1.8
- Resolved
- contains
-
HADOOP-14714 handle InternalError in bulk object delete through retries
- Resolved
-
HADOOP-14717 Add StreamCapabilities support to s3a
- Resolved
-
HADOOP-14423 s3guard will set file length to -1 on a putObjectDirect(stream, -1) call
- Resolved
-
HADOOP-15228 S3A Retry policy to retry on NoResponseException
- Resolved
-
MAPREDUCE-6961 Pull up FileOutputCommitter.getOutputPath to PathOutputCommitter
- Resolved
- depends upon
-
HADOOP-13449 S3Guard: Implement DynamoDBMetadataStore.
- Resolved
-
MAPREDUCE-6823 FileOutputFormat to support configurable PathOutputCommitter factory
- Resolved
-
MAPREDUCE-6956 FileOutputCommitter to gain abstract superclass PathOutputCommitter
- Resolved
- duplicates
-
HADOOP-13205 S3A to support custom retry policies; failfast on unknown host
- Resolved
-
HADOOP-13811 s3a: getFileStatus fails with com.amazonaws.AmazonClientException: Failed to sanitize XML document destined for handler class
- Resolved
-
HADOOP-14303 Review retry logic on all S3 SDK calls, implement where needed
- Resolved
-
HADOOP-14381 S3AUtils.translateException to map 503 response to => throttling failure
- Resolved
-
MAPREDUCE-6823 FileOutputFormat to support configurable PathOutputCommitter factory
- Resolved
- incorporates
-
HADOOP-13205 S3A to support custom retry policies; failfast on unknown host
- Resolved
-
HADOOP-13811 s3a: getFileStatus fails with com.amazonaws.AmazonClientException: Failed to sanitize XML document destined for handler class
- Resolved
-
HADOOP-13967 S3ABlockOutputStream to support plugin point for different multipart strategies
- Resolved
-
HADOOP-13968 S3a FS to support "__magic" path for the special "unmaterialized" writes
- Resolved
-
HADOOP-13969 S3A to support commit(path) operation, which commits all pending put commits in a path
- Resolved
-
HADOOP-14303 Review retry logic on all S3 SDK calls, implement where needed
- Resolved
-
HADOOP-14381 S3AUtils.translateException to map 503 response to => throttling failure
- Resolved
-
MAPREDUCE-6823 FileOutputFormat to support configurable PathOutputCommitter factory
- Resolved
-
HADOOP-14859 Shaded AWS library stops s3a recognising ConnectTimeoutException
- Resolved
- is depended upon by
-
HADOOP-13761 S3Guard: implement retries for DDB failures and throttling; translate exceptions
- Resolved
-
HADOOP-14831 Über-jira: S3a phase IV: Hadoop 3.1 features
- Resolved
- is duplicated by
-
HADOOP-14971 Merge S3A committers into trunk
- Resolved
-
HADOOP-15003 Merge S3A committers into trunk: Yetus patch checker
- Resolved
-
HADOOP-13574 Unnecessary file existence check causes problems with S3
- Resolved
-
HADOOP-13912 S3a Multipart Committer (avoid rename)
- Resolved
-
HADOOP-15087 S3A to support writing directly to the destination dir without creating temp directory to avoid rename
- Resolved
- is related to
-
HDFS-13713 Add specification of Multipart Upload API to FS specification, with contract tests
- Resolved
-
HADOOP-15079 ITestS3AFileOperationCost#testFakeDirectoryDeletion failing after OutputCommitter patch
- Resolved
-
HADOOP-14303 Review retry logic on all S3 SDK calls, implement where needed
- Resolved
-
HADOOP-14584 WASB to support high-performance commit protocol
- Resolved
-
HADOOP-15890 Some S3A committer tests don't match ITest* pattern; don't run in maven
- Resolved
-
HBASE-20431 Store commit transaction for filesystems that do not support an atomic rename
- Closed
-
SPARK-10063 Remove DirectParquetOutputCommitter
- Resolved
-
SPARK-18883 FileNotFoundException on _temporary directory
- Resolved
-
HADOOP-14161 Failed to rename file in S3A during FileOutputFormat commitTask
- Resolved
-
HADOOP-13912 S3a Multipart Committer (avoid rename)
- Resolved
-
HADOOP-18600 Hadoop 2.x should support s3a committers
- Resolved
-
MAPREDUCE-6974 Add standard configuration keys for HTrace values, propagate across to MR committers if set
- Resolved
-
MAPREDUCE-7060 Cherry Pick PathOutputCommitter class/factory to branch-3.0
- Resolved
- relates to
-
HADOOP-13846 S3A to implement rename(final Path src, final Path dst, final Rename... options)
- Open
-
HIVE-16295 Add support for using Hadoop's S3A OutputCommitter
- Patch Available
-
SPARK-18512 FileNotFoundException on _temporary directory with Spark Streaming 2.0.1 and S3A
- Resolved
-
SPARK-22217 ParquetFileFormat to support arbitrary OutputCommitters
- Resolved
- supersedes
-
HADOOP-14161 Failed to rename file in S3A during FileOutputFormat commitTask
- Resolved